Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate newFileSystem(...) overloads that default to platform-specific settings #106

Open
lukehutch opened this issue Apr 27, 2020 · 11 comments
Labels
P3 type=defect Bug, not working as expected

Comments

@lukehutch
Copy link

The following code

try (FileSystem memFs = Jimfs.newFileSystem()) {
    Path memFsRoot = memFs.getPath("");
    URI uri = memFsRoot.toUri();
    Path path = Paths.get(uri);
    System.out.println("memFsRoot: " + memFsRoot);
    System.out.println("uri: " + uri);
    System.out.println("path: " + path);
}

prints, on Windows:

memFsRoot: 
uri: jimfs://4dbef289-e533-4340-8531-fe2cc6b2f0f3/C:/work/
path: C:\work

This seems like a bug (even though the Path object is linked to the FileSystem instance). The path should not need the C: drive designation, as far as I can tell, because the GUID in the URI should be sufficient to locate the filesystem. The weird Jimfs URI and Path syntax makes porting code between Linux and Windows problematic.

On Linux the output is:

memFsRoot: 
uri: jimfs://4dbef289-e533-4340-8531-fe2cc6b2f0f3/work/
path: /work
@cpovirk
Copy link
Member

cpovirk commented Apr 27, 2020

I am not as familiar as I should be with either (a) Path under Windows or (b) jimfs support for Windows, but, as I understand it:

@lukehutch
Copy link
Author

I understand the behavior of Java's built-in Windows FileSystem, and this is necessary for interoperation with the OS' filesystem path format.

However, as far as I can tell, it was unnecessary for Jimfs to mimic this behavior right in the middle of a jimfs: URL, when even without this drive designation, the URL would already be unique.

The less OS-specific behavior that a Java library exposes through its API, the more portable the library is. Based on the link you provided, it seems that this was an intentional choice. However, it's a very odd choice, and as I pointed out, can cause interoperability issues.

In my case, my library assumed that everything after the GUID in a jimfs: path would be an absolute Unix-style path. I worked around this Jimfs quirk in my library, but it's a shame to have to add special-case logic for a library that should probably not be exposing OS-specific things in its API.

@cpovirk
Copy link
Member

cpovirk commented Apr 27, 2020

Having jimfs default to the behavior of the system it's running on definitely has downsides :(

In your case, would Jimfs.newFileSystem(Configuration.unix()) sidestep the problem?

@lukehutch
Copy link
Author

lukehutch commented Apr 27, 2020

Yes, that works fine, and makes the behavior on Windows and Linux exactly the same. Thanks for the suggestion. Although I still need to handle the Windows default configuration, since users of my library likely aren't calling Jimfs.newFileSystem(Configuration.unix()).

Is there any advantage whatsoever to not using the Unix behavior as the default on all operating systems? It seems like the whole development world has come around to use Unix conventions (even Microsoft, by introducing WSL). Actually Unix style paths work perfectly fine with all Java APIs on Windows.

The Unix default provides a much more straightforward mapping between Jimfs URLs and paths.

@seanpoulter
Copy link

Hi Luke. I'm not involved with the jimfs project but have been watching the repo. Here's my two cents. You've said:

users of my library likely aren't calling Jimfs.newFileSystem(Configuration.unix())

Without knowing anything about what you're trying to do, can I ask why the folks using your library need to know about jimfs? It sounds like you've got a great opportunity to wrap jimfs and hide these implementation details. You could also give them a factory function that returns a file system configured "the right way".

Good luck!

@lukehutch
Copy link
Author

@seanpoulter A user requested Jimfs support for the library I created, ClassGraph.

Since a Jimfs Path can be turned into a URL (with the underlying FileSystem instance registered against the URL scheme), it is plausible that any library that can handle URL objects may be sent a URL from Jimfs.

Most of the time libraries probably don't need to do anything with the URL other than maybe opening a connection to it. However, in ClassGraph paths are significant, since a Java package hierarchy (the directory or URL path structure) needs to match the fully-qualified class name of any classes that are scanned. Hence ClassGraph has to look at the URL path. The presence of a drive letter in the path was throwing off ClassGraph's original naive implementation of FileSystem-backed URL support. I put a much more robust and generic solution in place in the latest release that works properly with the Jimfs URL syntax.

However I still find the syntax truly weird, and I don't see the purpose of allowing a drive letter in Jimfs, since that is only supported on Windows -- and every library written for Java should strive to be as cross-platform as possible! So if Jimfs-internal drive letters are not supported on Mac OS X or Linux or other Unices, they probably should not be supported on Windows.

@seanpoulter
Copy link

Thanks for the context. That is very clear and makes a lot more sense to me. The drive letter does not make sense in the URL like that. 😓

I found Jimfs looking for a way to mock the file system for an integration test. Depending on how thorough folks are testing, I'd expect some folks need that drive letter support.

By the way, congrats on the great looking tool and zero bug policy. ClassGraph looks really interesting. 👏

@lukehutch
Copy link
Author

I found Jimfs looking for a way to mock the file system for an integration test. Depending on how thorough folks are testing, I'd expect some folks need that drive letter support.

That actually makes sense as a reasonable usecase, since if you just call toString on a Jimfs Path, you don't get the GUID, only the path listed after the GUID in the URL.

However, I would still suggest making the Unix configuration the default for all operating systems, in the name of interoperability. Then if people want to test using Windows-like paths, they can enable that with one call.

By the way, congrats on the great looking tool and zero bug policy. ClassGraph looks really interesting.

Thanks! The library was born out of necessity, but it has been useful for a lot of people, since the JDK does not provide a "metamodel" natively yet. ClassGraph has been pretty stable for a couple of years now. The Zero Bugs Commitment was a reaction to spending 30 years reporting literally thousands of bugs in open source products, and seeing only maybe 1% of them fixed!

@cpovirk
Copy link
Member

cpovirk commented Apr 28, 2020

Oddly enough, we originally had no methods that silently used the default platform... but then we were convinced to add them :\ The logic seems to have been that most people aren't going to write cross-platform code, anyway, so we might as well let them easily switch from the real filesystem for their platform to a corresponding fake one. Maybe that logic holds for app developers (e.g., if you're writing an Android app), but it's highly questionable for library developers.

Given that all we're providing is the ability to write Jimfs.newFileSystem() instead of Jimfs.newFileSystem(Configuration.forCurrentPlatform()), this seems especially questionable. (OK, Configuration.forCurrentPlatform() didn't exist until jimfs 1.1, but we could have added it earlier.)

At this point, changing the default may be more harmful than helpful, especially in a world in which most build systems will silently let one of your dependencies update your version in jimfs in a way that breaks another dependency at runtime. I could imagine, though, that we could deprecate the no-arg Jimfs.newFileSystem(), telling people to explicitly select unix() or forCurrentPlatform() as they prefer. That would help protect against this particular landmine. What do you think?

@lukehutch
Copy link
Author

I think that would be fine. The deprecation would alert users to the different behaviors, so they can decide which one they want to use.

@lukehutch
Copy link
Author

One more oddity of the Jimfs URL format: the drive letter may cause problems if URLs are relativized, since it may be confused for a protocol: https://stackoverflow.com/a/1737589/3950982

@cpovirk cpovirk changed the title Strange URI format on Windows Deprecate newFileSystem(...) overloads that default to platform-specific settings Apr 28, 2020
@cpovirk cpovirk added P3 type=defect Bug, not working as expected labels Apr 28, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P3 type=defect Bug, not working as expected
Projects
None yet
Development

No branches or pull requests

3 participants