Todo: Runtime Feature Detection #251

Closed
martindevans opened this issue Nov 5, 2023 · 3 comments
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@martindevans
Member

martindevans commented Nov 5, 2023

A while ago a new system was added to LLamaSharp in #65. It tries to load binaries in order of preference, with a runtime check for whether each binary is supported. Although this has been in for a while, it has never actually been used: the binaries need rearranging into a specific folder structure for it to work.

This has two benefits.

  • On PCs which support wider SIMD (e.g. AVX512) it should be faster, since the loader can detect that support at runtime and load binaries built to use it.
  • Our current default binaries are AVX2. On PCs which do not support AVX2 the loader can fall back to AVX, or even to no SIMD at all.

Required Layout

CUDA 12 Backend

  • cu12.1.0/libllama.dll
  • cu12.1.0/libllama.so

CUDA 11 Backend

  • cu11.7.1/libllama.dll
  • cu11.7.1/libllama.so

CPU Backend

  • avx512/libllama.dll
  • avx512/libllama.so
  • avx2/libllama.dll
  • avx2/libllama.so
  • avx/libllama.dll
  • avx/libllama.so
  • libllama.dll (this is the default if all others fail, i.e. no SIMD at all)
  • libllama.so (this is the default if all others fail, i.e. no SIMD at all)
  • libllama.dylib (ARM64)
  • ggml-metal.metal (must be next to libllama.dylib)
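
To make the intended search order concrete, here is a minimal sketch of how the CPU backend layout above could be walked. It is not the loader from #65: `CpuBackendProbe`, `CandidatePaths` and `LoadFirst` are hypothetical names, and the capability flags are passed in as plain booleans so the ordering logic stays separate from the detection itself.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

// Hypothetical helper, for illustration only.
static class CpuBackendProbe
{
    // Builds the CPU-backend search order from the layout above, most preferred first.
    // fileName is e.g. "libllama.dll" or "libllama.so".
    public static List<string> CandidatePaths(bool avx512, bool avx2, bool avx, string fileName)
    {
        var candidates = new List<string>();
        if (avx512) candidates.Add("avx512/" + fileName);
        if (avx2) candidates.Add("avx2/" + fileName);
        if (avx) candidates.Add("avx/" + fileName);
        candidates.Add(fileName); // the no-SIMD default, tried last
        return candidates;
    }

    // Tries each candidate in order and returns the first handle that loads,
    // or IntPtr.Zero if none of them could be loaded.
    public static IntPtr LoadFirst(IEnumerable<string> candidates)
    {
        foreach (var path in candidates)
        {
            if (NativeLibrary.TryLoad(path, out var handle))
                return handle;
        }
        return IntPtr.Zero;
    }
}
```

On x86/x64 the flags could plausibly come from `System.Runtime.Intrinsics.X86.Avx2.IsSupported` and `Avx.IsSupported`, plus `Avx512F.IsSupported` on .NET 8 or later. Whether those checks cover every instruction set the native binaries were built with is an assumption that would need verifying.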

Possible Extensions

There are some other features this could support:

  • Architecture detection:
    • ARM64 Linux (e.g. Raspberry Pi, Android)
    • MacOS Intel
  • OpenBLAS detection (if we can find a way to detect it at runtime)
  • CUDA detection
    • If we could detect CUDA at runtime we could put all of the backends into one package

Help Required

I don't know how to update all of the projects (and, most importantly, the NuGet packages) to work with this new layout. It shouldn't be too complicated; it just needs some configuration that I haven't done before and don't want to fiddle with, since it might break the NuGet release.

@AsakusaRinne
Collaborator

I'd like to help with the NuGet package part. In theory, all the NuGet package does is copy the binaries to the output directory of the projects that depend on it. So I think it won't be difficult, as long as the runtime detection works well.

@SignalRT
Collaborator

SignalRT commented Nov 5, 2023

I will see if I can manage to detect MacOS Intel / ARM.

@martindevans
Member Author

System.Runtime.InteropServices.RuntimeInformation.ProcessArchitecture should work for that, I think.
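
A rough sketch of what that check might look like follows. `MacArchProbe` and the returned labels are made up for illustration and do not correspond to any existing folder names.

```csharp
using System.Runtime.InteropServices;

// Hypothetical helper, for illustration only.
static class MacArchProbe
{
    // Maps an OS flag and a process architecture to an illustrative label.
    public static string Describe(bool isMacOS, Architecture arch)
    {
        if (!isMacOS) return "not-macos";
        return arch switch
        {
            Architecture.Arm64 => "macos-arm64", // Apple Silicon
            Architecture.X64 => "macos-x64",     // Intel
            _ => "macos-other",
        };
    }

    // Describes the process this code is actually running in.
    public static string DescribeCurrent() =>
        Describe(RuntimeInformation.IsOSPlatform(OSPlatform.OSX),
                 RuntimeInformation.ProcessArchitecture);
}
```

`ProcessArchitecture` reports the architecture of the running process rather than of the machine, which is the relevant value when choosing which native binary to load.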

@martindevans martindevans added enhancement New feature or request help wanted Extra attention is needed labels Nov 6, 2023
@SignalRT SignalRT mentioned this issue Nov 6, 2023