[Eval] Get logits output. #319
Conversation
include/models.h
@@ -70,6 +76,7 @@ class Model {
std::vector<int32_t> inputIds;
int batchSize;
int seqLen;
int vocabSize_;
Why do we need vocabSize in the Model class?
If it is needed, could you please make it match the existing naming style?
Done
int vocabSize = model->getVocabSize();
int logitsN = batchSize * seqLen * vocabSize;

if (model->getRank() == 0) { input(inputIds); }
Regarding the original code: why is the method named 'input' rather than 'setInput'? 'input' reads more like a keyword. :(
Overall LGTM. @Duyi-Wang, please help review, as this is closely related to the interface.
I think it's better to stay aligned with Hugging Face's output_scores (bool, optional, defaults to False): whether or not to return the prediction scores. See scores under returned tensors for more details.
int sampleSize = std::get<2>(result);

// Create a torch::Tensor from the C array
int64_t tdims[3] = {batchSize, seqLen, vocabSize};
Is the shape correct? The decoder only returns the last token's logits.
It sets "logitsAll" to true so that logits for all tokens are output:
decoder->forward(..., logitsAll = true);
std::tuple<float *, int, int> result = model->forward();
float *outBuf = std::get<0>(result);
int sampleOffset = std::get<1>(result);
int sampleSize = std::get<2>(result);
Is this synchronized across multiple ranks?
Not supported in this PR.
This implementation aligns with the Hugging Face approach: it is simply the model's execution from input (token IDs) to output (logits), without involving the searcher component.
No description provided.