Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: cuda feature detection. #275

Merged
merged 8 commits into from
Nov 12, 2023

Conversation

AsakusaRinne
Copy link
Collaborator

@AsakusaRinne AsakusaRinne commented Nov 11, 2023

This PR is supposed to be merged after #258

Note that it adds two APIs for users, which are NativeLibraryConfig.WithLibrary and NativeLibraryConfig.WithMatchRule. To be honest I think the naming is not good but I can't come up with a better one. Any idea?

TODO:

  • Check on Windows
  • Check on Linux
  • Check on MacOS
  • Add necessary logs
  • Change file structure in nuget package and verify it

@AsakusaRinne
Copy link
Collaborator Author

Is there a good way for us to print some logs in NativeApis now? I think directly outputting to console is not a good idea.

@martindevans
Copy link
Member

I think you can do something like Logger.GetLogger() now?

config.Desc = new Description(libraryPath);
}

/// <summary>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be better to split this into various WithX methods rather than having them all in one method (more extensible in the future).

e.g.

WithCuda(bool cuda)
WithAvx(AvxLevel avx)
WithAutoFallback(bool allow)
WithSkipCheck(bool skip)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This format seems good, especially for the ability of chained call. There's one problem that it couldn't notice user the default configuration well. If we just set default value to the parameter, it may looks strange when calling WithAutoFallback().WithSkipCheck(false). 😹

Anyway, I prefer this format, too. However is there any solution about the case above?

/// <param name="useCuda"></param>
/// <param name="avxLevel"></param>
/// <param name="allowFallback">Whether to allow fall-back when your hardware doesn't support your configuration.</param>
/// <param name="skipCheck">Whether to skip the check when fallback is allowed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not really sure what skipCheck does from this description. it's "skipping the check", but what check?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's to check if the hardware support the user's choice. Let's ignore the skipCheck parameter first, then there're two basic conditions, allowing fallback and the opposite.
If fallback is allowed, we could choose another library if the user's configuration does not match the hardware. For example, the user specified cuda but there's actually no supported cuda version at that machine.

If fallback is disabled, however, we're supposed to load exactly the library specified. However the real condition may be complex and our logic may not cover all the cases (for example, linux cuda detection). Therefore if sometimes loading a library won't actually have problem but we take it as invalid by mistake, the user could force-load it without any check.

I think by this way we could leave a solution for some potential problems in the future.

@martindevans
Copy link
Member

Just a few review nits, but overall looks good.

My only criticism is it's less clear than before which binaries are preferred over others. Previously you could just read down the list of TryLoad calls and it was very obvious. Not a fatal flaw, just an observation!

@AsakusaRinne
Copy link
Collaborator Author

The linux check has passed but the binaries seemed not to work. I can only load the library when I replaced the binaries with self-compiled ones. Not sure if it's related with #270

@AsakusaRinne AsakusaRinne mentioned this pull request Nov 11, 2023
5 tasks
@AsakusaRinne
Copy link
Collaborator Author

@SignalRT Could you please help to test it on MAC? For nuget package test please download this file.

@AsakusaRinne AsakusaRinne marked this pull request as ready for review November 12, 2023 04:14
@SignalRT
Copy link
Collaborator

@AsakusaRinne I get this exception trying to load LLamaSharp:

System.BadImageFormatException: Could not load file or assembly 'LLamaSharp, Version=0.8.0.0, Culture=neutral, PublicKeyToken=null'. An attempt was ma...

System.BadImageFormatException
Could not load file or assembly 'LLamaSharp, Version=0.8.0.0, Culture=neutral, PublicKeyToken=null'. An attempt was made to load a program with an incorrect format.

at LLama.Unittest.BasicTest..ctor()
at System.RuntimeType.CreateInstanceDefaultCtor(Boolean publicOnly, Boolean wrapExceptions)

@AsakusaRinne
Copy link
Collaborator Author

AsakusaRinne commented Nov 12, 2023

@SignalRT It seems to be a problem of LLamaSharp package instead of backend packages. Is your system 32-bit or 64-bit?

These packages are the ones generated by github ci workflow, maybe another try with it?

@SignalRT
Copy link
Collaborator

@AsakusaRinne, Arm64 (M2 computer). I can run the test in the project, execute the examples, but with your packages or with the packages generated manually in my computer with prepare_release.sh it fails with the same error.

I'm trying to understand the problem....that seems a little weird.

@SignalRT
Copy link
Collaborator

@AsakusaRinne After a lot of problems, it seems that it's related to some problem with nuget cache. I clear the cache and it works with the packages generated in the CI pipeline

@AsakusaRinne
Copy link
Collaborator Author

@AsakusaRinne After a lot of problems, it seems that it's related to some problem with nuget cache. I clear the cache and it works with the packages generated in the CI pipeline

Wow, that's nice! I think we could release it tonight.

@martindevans The CUDA libraries have been replaced the ones with avx 2, is that right?

@martindevans
Copy link
Member

The CUDA libraries have been replaced the ones with avx 2, is that right?

Yep, the CUDA binaries should now compile with AVX2 by default.

@AsakusaRinne AsakusaRinne merged commit 17ec890 into SciSharp:master Nov 12, 2023
5 checks passed
@AsakusaRinne AsakusaRinne mentioned this pull request Nov 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: ✅ Done
Development

Successfully merging this pull request may close these issues.

3 participants