Feature Request for Upcoming Refactoring #9

Open
dkmisra opened this issue Jan 4, 2024 · 0 comments
Assignees
Labels
enhancement New feature or request

Comments

Collaborator

dkmisra commented Jan 4, 2024

This issue lists the features planned for the upcoming refactoring:

  1. A unified abstract class that handles everything the experiments share: creating command-line arguments, constructing the LLM, and running the experiment. Ideally we would have only one file per LLM (or per LLM type) plus this abstract class. We may not be able to get it down to a single file, since some models such as Roberta are really masked language models and compute accuracy and log-loss from the tokens with a different procedure.

  2. Replace uses of rate with ρ, the notation used in the paper.

  3. Add a feature to support memory reduction by storing the U, S, and V matrices separately rather than multiplying them back together and losing the memory advantage.

  4. Add more LLMs, specifically Mistral, other Llama2 versions, and Phi models.

  5. Release LLMs with the optimally chosen reductions from Table 3 of the paper https://arxiv.org/pdf/2312.13558.pdf.
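To make item 1 more concrete, here is a rough sketch of what the shared abstract class could look like. Every name in it (`AbstractExperiment`, `make_llm`, `score`, the `--rho` and `--layer` flags, `ToyExperiment`) is hypothetical and not taken from the current code; it only illustrates the split between common logic and per-model hooks. A masked language model such as Roberta would override `score` with its own accuracy and log-loss computation.

```python
import argparse
from abc import ABC, abstractmethod


class AbstractExperiment(ABC):
    """Hypothetical base class holding the logic shared by all LLM experiments."""

    def build_parser(self):
        # Flags common to every model; names and defaults are assumptions.
        parser = argparse.ArgumentParser()
        parser.add_argument("--rho", type=float, default=1.0)
        parser.add_argument("--layer", type=int, default=0)
        self.add_model_args(parser)
        return parser

    def add_model_args(self, parser):
        """Optional hook: subclasses may register model-specific flags."""

    @abstractmethod
    def make_llm(self, args):
        """Construct and return the model for this experiment."""

    @abstractmethod
    def score(self, llm, example, args):
        """Return (is_correct, log_loss) for a single example."""

    def run(self, dataset, argv=None):
        # Shared experiment loop: parse arguments, build the model, aggregate metrics.
        args = self.build_parser().parse_args(argv)
        llm = self.make_llm(args)
        results = [self.score(llm, example, args) for example in dataset]
        n = max(len(results), 1)  # avoid dividing by zero on an empty dataset
        return {
            "accuracy": sum(correct for correct, _ in results) / n,
            "log_loss": sum(loss for _, loss in results) / n,
        }


class ToyExperiment(AbstractExperiment):
    """Stand-in for a per-LLM file; the 'model' here just upper-cases text."""

    def make_llm(self, args):
        return lambda prompt: prompt.upper()

    def score(self, llm, example, args):
        prompt, answer = example
        correct = llm(prompt) == answer
        return correct, 0.0 if correct else 1.0


metrics = ToyExperiment().run([("a", "A"), ("b", "x")], argv=["--rho", "0.5"])
print(metrics)
```

With this layout, each per-LLM file would only need to implement `make_llm` and `score`, and could add its own flags through `add_model_args`.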

If you have more requests, please post them below. Note that the first version of the refactoring may not cover all of the above, but we'll do our best. We welcome PRs.
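For anyone considering a PR for item 3, the idea can be sketched with NumPy as below. This is a minimal illustration under stated assumptions, not the repository's implementation: the function names are made up, and ρ is assumed here to mean the fraction of the maximum rank that is kept. The point is that the factors are applied one after another, so the dense low-rank matrix is never rebuilt.

```python
import numpy as np


def factorize(weight, rho):
    """Truncated SVD of `weight`, keeping a fraction `rho` of the maximum rank.

    Assumption: rho is the kept fraction of min(weight.shape).
    """
    max_rank = min(weight.shape)
    k = max(1, int(rho * max_rank))
    U, S, Vt = np.linalg.svd(weight, full_matrices=False)
    return U[:, :k], S[:k], Vt[:k, :]


def apply_factored(x, U, S, Vt):
    """Compute x @ W_k.T where W_k = U @ diag(S) @ Vt, without forming W_k."""
    # x has shape (batch, in_features); the weight has shape (out_features, in_features).
    return ((x @ Vt.T) * S) @ U.T


def num_params(*arrays):
    """Total number of stored scalars across the given arrays."""
    return sum(a.size for a in arrays)


rng = np.random.default_rng(0)
W = rng.standard_normal((512, 256))
U, S, Vt = factorize(W, rho=0.1)
x = rng.standard_normal((4, 256))

# Multiplying the factors back gives a matrix as large as W again.
dense_low_rank = (U * S) @ Vt
assert np.allclose(apply_factored(x, U, S, Vt), x @ dense_low_rank.T)

print("dense parameters:   ", num_params(W))
print("factored parameters:", num_params(U, S, Vt))
```

For an m × n weight and kept rank k, the factored form stores k(m + n + 1) numbers instead of m · n, so it only saves memory when k is well below mn / (m + n + 1).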

@dkmisra dkmisra added the enhancement New feature or request label Jan 4, 2024
@dkmisra dkmisra self-assigned this Jan 4, 2024
This was referenced Jan 4, 2024