
Update no_trainer scripts with new Accelerate functionalities #16617

Merged: 8 commits into main from muellerzr-update-no-trainer on Apr 6, 2022

Conversation

@muellerzr (Contributor) commented on Apr 5, 2022

Update the no_trainer scripts to keep them aligned with Accelerate's capabilities.

What does this add?

Updates all no_trainer scripts to use the latest capabilities.

Why is it needed?

Accelerate recently gained a number of new capabilities, including better saving/loading, experiment tracking, and support for LR schedulers. As a result, much of the scripts' hard-coded behavior can be simplified, and these features can be added.
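
As a rough illustration of how those capabilities surface in a script, the pattern boils down to something like the sketch below. This is a minimal, hedged example with toy stand-ins for the real model, optimizer, and data, not the exact code in these scripts:

```python
# Minimal sketch of the checkpointing / LR-scheduler pattern (illustrative only).
import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

accelerator = Accelerator()

# Toy objects stand in for the real Transformers model, optimizer, and dataloader.
model = torch.nn.Linear(4, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
dataloader = DataLoader(TensorDataset(torch.randn(16, 4), torch.randn(16, 2)), batch_size=4)
lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10)

# LR schedulers can now be passed to `prepare` alongside everything else.
model, optimizer, dataloader, lr_scheduler = accelerator.prepare(
    model, optimizer, dataloader, lr_scheduler
)

for epoch in range(2):
    for inputs, targets in dataloader:
        loss = torch.nn.functional.mse_loss(model(inputs), targets)
        accelerator.backward(loss)
        optimizer.step()
        lr_scheduler.step()
        optimizer.zero_grad()
    # Saving the full training state (model, optimizer, scheduler, RNG) is one call.
    accelerator.save_state(f"epoch_{epoch}")

# Resuming later is the mirror image:
# accelerator.load_state("epoch_1")
```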

Modified scripts with potential major changes:

  • language-modeling
  • multiple-choice
  • question-answering
  • summarization
  • text-classification
  • token-classification
  • translation

The speech fine-tuning scripts will be updated in a later PR.

Basic usage examples:

  • Saving checkpoints every epoch or every fixed number of steps:
accelerate launch language-modeling/run_clm_no_trainer.py --checkpointing_steps "epoch"
accelerate launch language-modeling/run_clm_no_trainer.py --checkpointing_steps 100
  • Resuming training from a saved checkpoint:
accelerate launch language-modeling/run_clm_no_trainer.py --resume_from_checkpoint "epoch_1"
  • Using any available tracker that Accelerate can automatically pick up, including Weights and Biases, TensorBoard, and CometML (a rough script-side sketch follows this list):
accelerate launch language-modeling/run_clm_no_trainer.py --with_tracking
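
For `--with_tracking`, the script-side counterpart looks roughly like the following. This is an illustrative sketch of Accelerate's tracking API; the project name and logged values are made up:

```python
# Rough sketch of the experiment-tracking pattern behind --with_tracking.
from accelerate import Accelerator

# `log_with="all"` picks up every tracker installed in the environment
# (TensorBoard, Weights and Biases, CometML, ...).
accelerator = Accelerator(log_with="all")
accelerator.init_trackers("clm_no_trainer", config={"learning_rate": 5e-5})

for step in range(3):
    fake_loss = 1.0 / (step + 1)  # stand-in for the real training loss
    accelerator.log({"train_loss": fake_loss}, step=step)

accelerator.end_training()
```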

Anticipated maintenance burden? (What will happen in, say, 3 months if something changes?)

As these scripts get more widely used, they might need small updates if we find that end users prefer a different logging experience, plus other small bugfixes we discover as time goes on.

@muellerzr requested a review from sgugger on April 5, 2022 20:59
@HuggingFaceDocBuilderDev commented on Apr 5, 2022

The documentation is not available anymore as the PR was closed or merged.

@sgugger (Collaborator) left a comment:

Thanks for working on those examples! Left a couple of comments, but looks pretty good already!

3 review comments on examples/pytorch/language-modeling/run_clm_no_trainer.py (outdated, resolved)
@muellerzr added the Examples and External labels on Apr 6, 2022
@muellerzr changed the title from "Update no_trainer scripts with new Accelerate functionalities" to "[DRAFT] Update no_trainer scripts with new Accelerate functionalities" on Apr 6, 2022
@muellerzr added the PyTorch label on Apr 6, 2022
@muellerzr marked this pull request as ready for review on April 6, 2022 16:04
@muellerzr requested a review from sgugger on April 6, 2022 16:05
@sgugger (Collaborator) left a comment:

Thanks for duplicating the effort across all the examples!

@muellerzr merged commit febe42b into main on Apr 6, 2022
@muellerzr deleted the muellerzr-update-no-trainer branch on April 6, 2022 19:29