feat(kernel-memory): avoid loading model twice. #248

Merged
merged 1 commit into from
Nov 5, 2023

AsakusaRinne
Collaborator

@AsakusaRinne AsakusaRinne commented Nov 5, 2023

@xbotter Please help to review it. Thank you!

Related to: #35 (comment)

Collaborator

@xbotter xbotter left a comment

LGTM.
I noticed that there is an EmbeddingMode in ModelParams. Will it have any impact on the embedding?

@martindevans
Member

I'm not certain, but I think embedding mode loads the model in a way that can only do embedding.

@AsakusaRinne
Collaborator Author

Martin's right. One of the reasons behind it may be the KV cache: when only embedding mode is used, there seems to be no need for a KV cache. However, what confuses me is why a model loaded for inference cannot also be used to generate embeddings.

@AsakusaRinne AsakusaRinne merged commit a9434c2 into SciSharp:master Nov 5, 2023
6 checks passed
3 participants