New command to generate launch script #13

Merged: 4 commits merged into qupath:main from launch-script on Dec 1, 2023

petebankhead
Member

Add a new command to generate a launch script that can set environment variables and system properties.

@petebankhead
Member Author

I think this now works (at least on my computers).

You should:

  • Build QuPath from source (gradlew clean jpackage)
  • Build the DJL extension from this PR (you may need gradlew clean build --refresh-dependencies because of snapshots)
  • Launch QuPath & install the extension
  • Use Extensions → Deep Java Library → Create launch script
  • Point to your conda environment, specify the PyTorch version (if you want) and click OK to save a launch script
  • Try to start QuPath using the launch script
  • Use Extensions → Deep Java Library → Manage DJL engines to check CUDA is available & you can download an engine with GPU support
    • To check GPU support is available, in the 'manage' dialog hover the mouse over the version and check the capabilities listed in the tooltip
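
For anyone wondering what such a launch script might contain, here is a minimal sketch in Java of generating one. It is not the code in this PR: the class and method names are hypothetical, the paths are placeholders, and passing system properties through JAVA_TOOL_OPTIONS is an assumption about one way a script could do it. PYTORCH_VERSION is used as an example because DJL reads it to choose the PyTorch version.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: builds the text of a bash launch script. Not the PR's implementation. */
class LaunchScriptSketch {

    /**
     * @param executable  path to the application launcher (placeholder value in the example)
     * @param envVars     environment variables to export before launching
     * @param systemProps JVM system properties, passed via JAVA_TOOL_OPTIONS (an assumption)
     */
    static String buildScript(String executable,
                              Map<String, String> envVars,
                              Map<String, String> systemProps) {
        StringBuilder sb = new StringBuilder("#!/usr/bin/env bash\n");
        for (Map.Entry<String, String> e : envVars.entrySet()) {
            sb.append("export ").append(e.getKey()).append('=')
              .append(quote(e.getValue())).append('\n');
        }
        if (!systemProps.isEmpty()) {
            // Note: values containing spaces would need extra care here.
            StringBuilder opts = new StringBuilder();
            for (Map.Entry<String, String> e : systemProps.entrySet()) {
                if (opts.length() > 0) {
                    opts.append(' ');
                }
                opts.append("-D").append(e.getKey()).append('=').append(e.getValue());
            }
            sb.append("export JAVA_TOOL_OPTIONS=").append(quote(opts.toString())).append('\n');
        }
        // Forward any command-line arguments to the real launcher.
        sb.append("exec ").append(quote(executable)).append(" \"$@\"\n");
        return sb.toString();
    }

    /** Wraps a value in single quotes for bash, escaping embedded single quotes. */
    static String quote(String s) {
        return "'" + s.replace("'", "'\\''") + "'";
    }

    public static void main(String[] args) {
        Map<String, String> env = new LinkedHashMap<>();
        env.put("LD_LIBRARY_PATH", "/path/to/conda/env/lib"); // placeholder path
        env.put("PYTORCH_VERSION", "1.13.1");
        Map<String, String> props = new LinkedHashMap<>();
        props.put("my.example.property", "value");            // hypothetical property
        System.out.print(buildScript("/path/to/QuPath", env, props));
    }
}
```

Single-quoting every value keeps the script robust against spaces in conda paths, at the cost of not expanding variables such as $LD_LIBRARY_PATH inside the values.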

@Rylern @alanocallaghan @finglis Please check if you can, while I update the docs here.

@Rylern

Rylern commented Dec 1, 2023

The GPU is still not detected on my computer.

When I install PyTorch, I get this warning:
image

@alanocallaghan
Contributor

PyTorch works fine for me and finds my CUDA, but then it found it fine before as well.

For TensorFlow, confusingly, I get an error, and then it claims to be available. Stacktrace:

11:25:50.464 [JavaFX Application Thread] [ERROR] qupath.ext.djl.ui.DjlEngineCommand - Error updating engine version: Failed to download Tensorflow native library
java.lang.IllegalStateException: Failed to download Tensorflow native library
	at ai.djl.tensorflow.engine.javacpp.LibUtils.downloadTensorFlow(LibUtils.java:207)
	at ai.djl.tensorflow.engine.javacpp.LibUtils.findLibraryInClasspath(LibUtils.java:91)
	at ai.djl.tensorflow.engine.javacpp.LibUtils.getLibName(LibUtils.java:66)
	at ai.djl.tensorflow.engine.TfEngine.toString(TfEngine.java:177)
	at qupath.ext.djl.ui.DjlEngineCommand.updateVersionFromStatus(DjlEngineCommand.java:271)
	at qupath.ext.djl.ui.DjlEngineCommand.lambda$init$6(DjlEngineCommand.java:236)
	at com.sun.javafx.binding.ExpressionHelper$Generic.fireValueChangedEvent(ExpressionHelper.java:360)
	at com.sun.javafx.binding.ExpressionHelper.fireValueChangedEvent(ExpressionHelper.java:80)
	at javafx.beans.property.ObjectPropertyBase.fireValueChangedEvent(ObjectPropertyBase.java:106)
	at javafx.beans.property.ObjectPropertyBase.markInvalid(ObjectPropertyBase.java:113)
	at javafx.beans.property.ObjectPropertyBase.set(ObjectPropertyBase.java:147)
	at qupath.ext.djl.ui.DjlEngineCommand.updateStatus(DjlEngineCommand.java:363)
	at qupath.ext.djl.ui.DjlEngineCommand.lambda$updateStatus$8(DjlEngineCommand.java:365)
	at com.sun.javafx.application.PlatformImpl.lambda$runLater$10(PlatformImpl.java:456)
	at java.base/java.security.AccessController.doPrivileged(Unknown Source)
	at com.sun.javafx.application.PlatformImpl.lambda$runLater$11(PlatformImpl.java:455)
	at com.sun.glass.ui.InvokeLaterDispatcher$Future.run(InvokeLaterDispatcher.java:95)
	at com.sun.glass.ui.gtk.GtkApplication._runLoop(Native Method)
	at com.sun.glass.ui.gtk.GtkApplication.lambda$runLoop$11(GtkApplication.java:316)
	at java.base/java.lang.Thread.run(Unknown Source)
Caused by: java.io.IOException: Offline model is enabled.
	at ai.djl.util.Utils.openUrl(Utils.java:486)
	at ai.djl.util.Utils.openUrl(Utils.java:472)
	at ai.djl.tensorflow.engine.javacpp.LibUtils.downloadTensorFlow(LibUtils.java:176)
	... 19 common frames omitted
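
The trace shows the IllegalStateException escaping from a toString() call made while the dialog refreshes the engine version. Below is a rough sketch of one way a caller could tolerate that: wrap the description lookup and fall back to a plain name. The class and method names are hypothetical and this is not what DjlEngineCommand does; it only illustrates the idea.

```java
import java.util.function.Supplier;

/** Hypothetical helper: obtains a description without letting a lookup failure escape. */
class SafeDescribe {

    /**
     * @param description supplier that may throw, e.g. a method reference to toString()
     * @param fallback    text to show if the supplier fails
     */
    static String describe(Supplier<String> description, String fallback) {
        try {
            return description.get();
        } catch (IllegalStateException e) {
            // The exception type matches the one in the trace above.
            return fallback + " (details unavailable: " + e.getMessage() + ")";
        }
    }

    public static void main(String[] args) {
        // Simulates a toString() that fails because a download was attempted offline.
        Supplier<String> failing = () -> {
            throw new IllegalStateException("Failed to download Tensorflow native library");
        };
        System.out.println(describe(failing, "TensorFlow"));
        System.out.println(describe(() -> "TensorFlow engine", "TensorFlow"));
    }
}
```

With a real engine object the call might look like describe(engine::toString, engine.getEngineName()), assuming the engine exposes its name separately.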

@petebankhead
Member Author

petebankhead commented Dec 1, 2023

I need more info... not sure which platforms you're running on, or how you're assessing that CUDA was found. CUDA can be detected and shown in the Manage engines dialog, even when it's completely incompatible with both engines.

I can replicate the TensorFlow problem on Windows - looking into it now.

@alanocallaghan
Contributor

> I need more info... not sure which platforms you're running on, or how you're assessing that CUDA was found. CUDA can be detected and shown in the Manage engines dialog, even when it's completely incompatible with both engines.

Running on Linux. I know that CUDA was found because this laptop can't do 70 tiles/s on the CPU.

Ignore what I said, it was picking up my existing DJL download. If I make a launcher script just pointing to a conda environment, DJL just ignores it and downloads PyTorch to the normal place. If I also specify the path to torch, then I get:

11:59:49.658 [JavaFX Application Thread] [INFO ] ai.djl.pytorch.jni.LibUtils - Downloading jni https://publish.djl.ai/pytorch/1.13.1/jnilib/0.24.0/linux-x86_64/cu123-precxx11/libdjl_torch.so to cache ...
11:59:49.659 [JavaFX Application Thread] [ERROR] q.ext.wsinfer.ui.PytorchManager - Cannot download jni files: https://publish.djl.ai/pytorch/1.13.1/jnilib/0.24.0/linux-x86_64/cu123-precxx11/libdjl_torch.so
ai.djl.engine.EngineException: Cannot download jni files: https://publish.djl.ai/pytorch/1.13.1/jnilib/0.24.0/linux-x86_64/cu123-precxx11/libdjl_torch.so

I can't get it to pick up anything bar its preferred version/location, but then its preferred location works fine on my box, so...
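
To make the failing URL in the log easier to read, here is a small sketch that splits it into its path segments. The meaning given to each segment (engine version, DJL version, platform, flavor) is an assumption read off the URL's layout, and the class is hypothetical. Read that way, the request is for a CUDA 12.3 ("cu123") build of PyTorch 1.13.1, a combination that may simply not be published.

```java
import java.net.URI;

/** Hypothetical sketch: splits a DJL JNI download URL into its apparent parts. */
class JniUrlParts {
    final String engineVersion;
    final String djlVersion;
    final String platform;
    final String flavor;
    final String fileName;

    JniUrlParts(String engineVersion, String djlVersion, String platform,
                String flavor, String fileName) {
        this.engineVersion = engineVersion;
        this.djlVersion = djlVersion;
        this.platform = platform;
        this.flavor = flavor;
        this.fileName = fileName;
    }

    /** Assumed layout: /pytorch/ENGINE_VERSION/jnilib/DJL_VERSION/PLATFORM/FLAVOR/FILE */
    static JniUrlParts parse(String url) {
        String[] p = URI.create(url).getPath().split("/");
        // p[0] is empty because the path starts with '/'.
        if (p.length != 8 || !"jnilib".equals(p[3])) {
            throw new IllegalArgumentException("Unexpected URL layout: " + url);
        }
        return new JniUrlParts(p[2], p[4], p[5], p[6], p[7]);
    }

    /** True if the flavor segment looks like a CUDA build, e.g. "cu123-precxx11". */
    boolean usesCuda() {
        return flavor.startsWith("cu");
    }

    public static void main(String[] args) {
        JniUrlParts parts = parse("https://publish.djl.ai/pytorch/1.13.1/jnilib/0.24.0/"
                + "linux-x86_64/cu123-precxx11/libdjl_torch.so");
        System.out.println("engine=" + parts.engineVersion + ", djl=" + parts.djlVersion
                + ", platform=" + parts.platform + ", flavor=" + parts.flavor);
    }
}
```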

@petebankhead
Member Author

petebankhead commented Dec 1, 2023

Yeah, I think if it works for its preferred download & location that's enough. Maybe we should remove the other option, although I think I've seen it work at some point in the past.

The TensorFlow thing is a bit maddening, but I think I'm getting closer.

Basically the Engine can be valid but simply calling toString() on it is enough to prompt a download here.

This is because it looks up the library name it expects for the detected CUDA version, even if that library doesn't exist and isn't actually the library name of the engine. Instead, it finds a placeholder here and so attempts a download, but can't because we're in offline mode.

The wrong CUDA/library name is corrected at the download stage (i.e. it recognizes that it should fall back to cpu), but only if it's allowed to proceed (i.e. we're not in offline mode).

So basically it looks like we'd need to turn off offline mode in order for it to start figuring out what to download, then realize that it doesn't need to download anything.

Not sure how to fix it satisfyingly... but the conclusion is that it should work as long as there is either 1) no CUDA, or 2) a compatible CUDA found, or 3) we're not enforcing offline.
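
The behaviour described above can be summarised as a toy model. This is only a sketch of the reasoning in this comment, not DJL code: the class, method and return values are hypothetical, and the real library resolution is more involved.

```java
/** Toy model of the TensorFlow library lookup described above. Not DJL code. */
class TfLookupSketch {

    /**
     * Returns which library flavor ends up being used, or throws if the lookup
     * needs a download that offline mode forbids.
     */
    static String resolve(boolean cudaFound, boolean cudaCompatible, boolean offline) {
        if (!cudaFound) {
            return "cpu";   // case 1: no CUDA, nothing to mismatch
        }
        if (cudaCompatible) {
            return "gpu";   // case 2: the expected library name really exists
        }
        // Incompatible CUDA: only a placeholder is found, so a download is attempted.
        if (offline) {
            throw new IllegalStateException("Download required, but offline mode is enabled");
        }
        return "cpu";       // case 3: the download stage corrects the name and falls back
    }

    public static void main(String[] args) {
        boolean[] values = {false, true};
        for (boolean cuda : values) {
            for (boolean compatible : values) {
                for (boolean offline : values) {
                    String result;
                    try {
                        result = resolve(cuda, compatible, offline);
                    } catch (IllegalStateException e) {
                        result = "ERROR";
                    }
                    System.out.println("cuda=" + cuda + " compatible=" + compatible
                            + " offline=" + offline + " -> " + result);
                }
            }
        }
    }
}
```

In this model the only failing combination is CUDA found, incompatible, and offline enforced, which matches the three conditions listed above.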

@petebankhead petebankhead merged commit a886a4a into qupath:main Dec 1, 2023
1 check passed
@petebankhead petebankhead deleted the launch-script branch December 1, 2023 18:29