Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

File timestamp 0 breaks Spring WebMVCs static resource resolving #1079

Closed
mbonato opened this issue Oct 1, 2018 · 32 comments
Closed

File timestamp 0 breaks Spring WebMVCs static resource resolving #1079

mbonato opened this issue Oct 1, 2018 · 32 comments

Comments

@mbonato
Copy link

mbonato commented Oct 1, 2018

To create reproducible builds Jib sets a modTime of 0 for every file it includes in the Docker image layers

This unfortunately breaks resolving static resources in Spring WebApplications. One might argue, that this is bug in Spring WebMVC and in fact I've also filed a bug there (https://jira.spring.io/browse/SPR-17320) - however, the java.io.File.lastModified() uses 0L also to indicate non-existing files and I/O errors.

In Java 6 and below (https://docs.oracle.com/javase/6/docs/api/java/io/File.html#lastModified()), 0L did not seem to be considered as a valid value. In later versions the JavaDoc also talks about 0L as a possible non-error value. "Where it is required to distinguish an I/O exception from the case where 0L is returned, or where several attributes of the same file are required at the same time, or where the time of last access or the creation time are required, then the Files.readAttributes method may be used." (https://docs.oracle.com/javase/7/docs/api/java/io/File.html#lastModified()).

Considering the contract of the java.io.File.lastModified() method, IMHO the choice of 0L for the modification date is quite unfortunate. Any fixed value other than 0 would also ensure reproducible builds but avoid the problematic error handling of java.io.File.lastModified() returning 0L also for non-existing files and I/O errors. Therefore, I would suggest to either set an unproblematic fixed value of e.g. 1L or make it configurable (with a default value other then 0L).

mbonato pushed a commit to primesign/jib that referenced this issue Oct 1, 2018
@loosebazooka
Copy link
Member

@mattmoor do you use 0 in bazel builds? I have a vague memory that it used some other date.

@chanseokoh
Copy link
Member

chanseokoh commented Oct 1, 2018

IMO, I don't find anything wrong with using the value of 1, considering the contract of lastModified(). Just a 1 milisec difference, if 0 were a valid value.

@loosebazooka loosebazooka added this to the v0.10.0 milestone Oct 1, 2018
@ST-DDT
Copy link

ST-DDT commented Oct 2, 2018

I would appreciate a configurable timestamp. Because that way I won't be restricted with my choice of "x' is newer than x" http checks. I would then use the timestamp of the last commit to build that image.

@chanseokoh
Copy link
Member

We discussed that it makes sense to also update the container build timestamp.

@loosebazooka
Copy link
Member

@ST-DDT can you elaborate a little?

@ST-DDT
Copy link

ST-DDT commented Oct 2, 2018

During the start of the application, I read some config/files from my app and publish it in a cache, but only if it is newer than the old. If during my rolling update an old container crashes unexpectedly it might restart after the last instance of my new version, thus I cannot rely on the config hash alone. Usually there would be build metadata in my jar, but jib does not build a jar, thats why I have to get the date somewhere else. (and I don't want to manually update it)

TLDR: I don't want to be restricted in my application code, just because I use jib. (Currently a lot of meta-data are missing, and at least I want a way to fake them back in)

See also: #905

@coollog
Copy link
Contributor

coollog commented Oct 2, 2018

@ST-DDT Do you happen to be referring to the timestamp of the container or the timestamp of the files in the container (to be made configurable)?

@nobecutan
Copy link

I think the timestamp of the container and the files within could easily be the same. Either a fixed non-zero value (probably better 1000 than 1, since it is [ms] since 1970), or a configureable timestamp like a last commit timestamp. In both cases you would uphold the reproducible builds contract (in contrast to the useCurrentTimestamp property).

@coollog
Copy link
Contributor

coollog commented Oct 3, 2018

That makes sense. I think making the timestamp configurable in this case would be the most versatile solution. The proposal is to change the useCurrentTimestamp configuration to be just timestamp and accept a timestamp to use.

Before:

<container>
  <useCurrentTimestamp>true</useCurrentTimestamp>
</container>

After:

<container>
  <timestamp>${maven.build.timestamp}</timestamp>
</container>

The complexity comes with parsing, however. The logic for parsing the timestamp could be:

  1. Parse using SimpleDateFormat with the ${maven.build.timestamp.format} format. (Supports the current time use case)
  2. If failed, then consider as since-epoch time in milliseconds. (Supports the fixed time use case like 1000)

However, this logic can break in the case ${maven.build.timestamp.format} is defined such that it can parse a since-epoch time, causing the time to be incorrectly parsed as a SimpleDateFormat.

Thoughts? @GoogleContainerTools/java-tools

@chanseokoh
Copy link
Member

Some options I think:

  1. We only support SimpleDateFormat. I think it can cover since-epoch too, with some computation. I think anyone who may need to use a very large since-epoch time will better use a readable time format.

  2. Introduce <timestampFormat> for our exclusive usage. May default to since-epoch. If defined, the format will be used for SimpleDateFormat parsing.

  3. The proposed one, as-is. It's the user's responsibility to have ${maven.build.timestamp.format} and the timestampe value correct, and parsing is a fair game.

@coollog
Copy link
Contributor

coollog commented Oct 3, 2018

  1. If we only support SimpleDateFormat, the user would still need a way to specify the format to use for parsing, meaning 1) and 2) would go hand-in-hand.

@chanseokoh
Copy link
Member

I was assuming we use ${maven.build.timestamp.format}, which has a default value AFAIK.

@briandealwis
Copy link
Member

briandealwis commented Oct 3, 2018

I'll propose a slight modification to Chanseok's Option 2: and allow setting ${jib.timestamp.format}. If unset, we use maven.build.timestamp.format. If empty, we treat as milliseconds since epoch.

@TadCordle
Copy link
Contributor

I somewhat like @briandealwis's idea, but it seems strange to be parsing a number in some cases and a date in others. Also, if we use maven.build.timestamp.format, is there a gradle equivalent?

I prefer convention over configuration here as far as timestamp format goes. I think we should just keep a single supported timestamp format to parse (or even try multiple similar ones if the date given isn't valid for a certain format).

@briandealwis
Copy link
Member

And now that I think of it, maven.build.timestamp and maven.build.timestamp.format are meant for output, not as input.

Ideally whatever format we use should be easy to obtain, e.g., from a git commit. So my votes would be:

  1. ISO 8601 since it's easily parsed by humans and has a rigid definition. It can be obtained from git via git log --format='%cI' -n 1
  2. Unix timestamp, though its hard for humans to make sense of. It can be obtained from git via git log --format='%ct' -n 1 — though note this reports the time in seconds, not milliseconds, as does strftime(3)'s %s conversion specification.

@nobecutan
Copy link

nobecutan commented Oct 4, 2018

My preference is ISO 8601 for it’s readability.

@ST-DDT
Copy link

ST-DDT commented Oct 4, 2018

And now that I think of it, maven.build.timestamp and maven.build.timestamp.format are meant for output, not as input.

IMO thats only half correct. Sure its meant as output, but we would use those values again to parse the value.

Maven: maven.build.timestamp = new SimpleDateFormat(maven.build.timestamp.format).format(new Date())
Jib: timestamp = new SimpleDateFormat(maven.build.timestamp.format).parse(maven.build.timestamp)

  • <timestamp> would default to maven.build.timestamp
  • <timestampFormat> would default to maven.build.timestamp.format

Ideally whatever format we use should be easy to obtain, e.g., from a git commit. So my votes would be:

  • ISO 8601 since it's easily parsed by humans and has a rigid definition. It can be obtained from git via git log --format='%cI' -n 1
  • Unix timestamp, though its hard for humans to make sense of. It can be obtained from git via git log --format='%ct' -n 1 — though note this reports the time in seconds, not milliseconds, as does strftime(3)'s %s conversion specification.

Since we don't know where we can obtain the timestamp, we have to allow the input format to be configurable.

My preference is ISO 8601 for it’s readability.

AFAICT it's not meant to be read by humans, but used as input for the plugin to set some file attributes.

@briandealwis
Copy link
Member

Sorry @ST-DDT I was a bit opaque. I meant that m.b.t is intended to be output. People change it to suit their own needs In ways that lose information (in this case, training only hours). The m.b.t is also always set in UTC.

By readability I meant easy validation by humans. I personally have no idea whether 15086509845 seems like the right timestamp to correspond to the last commit...

@loosebazooka
Copy link
Member

loosebazooka commented Oct 11, 2018

As discussed offline, we are going to just set the timestamp to 1000 (1 second) for now (to deal with systems that maybe do not go as fine grained as milliseconds)

@coollog
Copy link
Contributor

coollog commented Oct 15, 2018

This current issue has been fixed via #1133 for release in version 0.10.0, but we will resolve setting an arbitrary timestamp in a future release.

@mbonato
Copy link
Author

mbonato commented Oct 15, 2018

Thank you!

Support for arbitrary timestamps would be great. Especially, because the file last-modified date is also used in the Last-Modified HTTP headers of static resources served by Spring Web-MVC. When last-modified is constant, browser/web caches are not properly invalidated when new versions of static resources are deployed with a new jib built docker image. Therefore, would be great to be able to increase the last-modified date with each new (release) build. This could be accomplished e.g. by setting the last-modified date to the last (Git) commit timestamp.

Direct support for the last Git commit timestamp would be great, but I guess a dependency on a particular version control system is not intended. However, given the popularity of Git it would be great to at least support a timestamp format that could easily be provided when extracting the commit timestamp via shell script or another Maven plugin.

@loosebazooka
Copy link
Member

@chanseokoh I thought you submitted a fix for this?

@chanseokoh
Copy link
Member

The fix is a temporary one to set the file timestamps to the epoch+1 sec. In this issue, we've been discussing various config designs to customize timestamps for both files and the container.

@devinrsmith
Copy link

It might be nice to configure the timestamp with the current git commit's timestamp, if applicable.

@yamass
Copy link

yamass commented Nov 8, 2018

@coollog I am still seeing epoch (0) timestamps in 0.9.13, although it was released after the merge. I might have got something wrong here. You mentioned that the fix is for release 0.10.0. Could you elaborate when the fix will be available?

Also, the CHANGELOG indicates that zero-timestamps cause problems for "applications on Java 6 or below". I think that this is incorrect. Spring throws an exception when the timestamp is 0. This is independent of the Java version Spring runs on. They have changed this behavior only a month ago, so nearly every production spring application is affected. See spring-projects/spring-framework@cf3635b#diff-080763d75126bdb9900ccb3abd85fa0bR174

@TadCordle
Copy link
Contributor

@yamass 0.9.13 was branched directly from the 0.9.12 tag and only contains a quick fix for the docker context generator, so a lot of the changes that came between 0.9.12 and 0.9.13 haven't been released yet.

We're actually just about to start the release process for 0.10.0, so it should be available later today!

@mattmoor
Copy link

mattmoor commented Nov 8, 2018

@loosebazooka There are a couple places in the image with timestamps, and I think historically the answer varied somewhat depending on the context. I believe that where we ended up was roughly using the Unix epoch as the timestamp everywhere.

@coollog
Copy link
Contributor

coollog commented Nov 9, 2018

Version 0.10.0 has been released with the time set to epoch+1!

@chanseokoh
Copy link
Member

From #1079 (comment)

We discussed that it makes sense to also update the container build timestamp.

From #1079 (comment):

During the start of the application, I read some config/files from my app and publish it in a cache, but only if it is newer than the old.

From #1079 (comment):

I think the timestamp of the container and the files within could easily be the same

This issue started with talking about file timestamps, and then there have been discussions around the container build timestamp as well as syncing the container timestamp and file timestamps. The current state is that we don't sync these two regardless of useCurrentTimestamp:

  • If useCurrentTimestamp is false, the container timestamp is the epoch, while the file timestamps are epoch+1 sec.
  • If true, only the container timestamp is set. The file timestamps remain epoch+1 sec.

The first step would be to change this behavior and make them always sync'ed.

@chanseokoh
Copy link
Member

One downside to syncing the container timestamp and file timestamps (only when using the current timestamp): with the fixed file timestamp of epoch+1, all layers remain unchanged and doesn't have to be rebuilt every time when useCurrentTimestamp is true. This may be a good argument not to sync the container timestamp and file timestamp, although this won't be a problem if the user specifies a fixed or infrequently changing timestamp. Therefore, people shouldn't really use the current timestamp, but it will be tough to teach them not to do it. I sympathize with people's natural desire to tag images with the current timestamp, since it feels so natural to do. Perhaps we should warn more aggressively when using the current timestamp that is always changing.

@TadCordle
Copy link
Contributor

Since the original bug this issue reported was fixed, and some new discussions started up out of it, we're going to close this issue and split it off into smaller issues. See #1607, #1608, and #1609.

@chanseokoh
Copy link
Member

chanseokoh commented Apr 8, 2019

@mbonato @ST-DDT @nobecutan @devinrsmith you all said you'd want to be able to configure a timestamp for either the container or the files (or both). We'd appreciate if could you go to #1607, #1608, and/or #1609 and provide some details about your use case for configuring the container timestamp and/or file timestamps.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests