[Community Event] Doc Tests Sprint #16292

patrickvonplaten · 2022-03-21T11:53:20Z

This issue is part of our Doc Test Sprint. If you're interested in helping out come join us on Discord and talk with other contributors!

Docstring examples are often the first point of contact when trying out a new library! So far we haven't done a very good job of ensuring that all docstring examples work correctly in 🤗 Transformers, but we are now committed to ensuring that all documentation examples work correctly by testing each one with Python's doctest (https://docs.python.org/3/library/doctest.html) on a daily basis.

In short, we should do the following for all models, for both PyTorch and TensorFlow:

- Check that the current doc examples run without failure
- Check whether the current doc example of the forward method is a sensible example that helps users understand the model, or whether it can be improved. E.g. is the example at https://huggingface.co/docs/transformers/v4.17.0/en/model_doc/bert#transformers.BertForQuestionAnswering.forward a good example of the model? Could it be improved?
- Add an expected output to the doc example and test it via Python's doctest (see Guide to contributing below)
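
As a rough illustration of the last point, here is a minimal sketch of a docstring example with an expected output, checked with the standard-library `doctest` module. The `scale` function and the `check_docstring` helper are hypothetical stand-ins for a model's forward docstring and are not part of Transformers; the sprint itself runs the real docstrings as described in the guide further down.

```python
import doctest


def scale(values, factor):
    """Multiply every element of `values` by `factor`.

    Example:

    >>> scale([1, 2, 3], 2)
    [2, 4, 6]
    """
    return [v * factor for v in values]


def check_docstring(obj):
    """Run the examples in obj's docstring; return (failed, attempted)."""
    parser = doctest.DocTestParser()
    # Expose the object under its own name so the example can call it.
    test = parser.get_doctest(obj.__doc__, {obj.__name__: obj}, obj.__name__, None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    results = runner.run(test)
    return results.failed, results.attempted
```

Calling `check_docstring(scale)` returns `(0, 1)`: one example attempted, none failed. If the expected line in the docstring were wrong, the first number would be 1 and doctest would print a report showing the expected and the actual output.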

Adding a documentation test for a model is a great way to better understand how the model works, a simple (possibly first) contribution to Transformers and most importantly a very important contribution to the Transformers community 🔥

If you're interested in adding a documentation test, please read through the Guide to contributing below.

This issue is a call for contributors to make sure the docstring examples of existing model architectures work correctly. If you wish to contribute, reply in this thread with the architectures you'd like to take :)

Guide to contributing:

1. Ensure you've read our contributing guidelines 📜
2. Claim your architecture(s) in this thread (confirm no one is working on it) 🎯
3. Implement the changes as in "add doctests for bart like seq2seq models" #15987 (see the diff on the model architectures for a few examples) 💪
   - The file you want to look at is in `src/transformers/models/[model_name]/modeling_[model_name].py`, `src/transformers/models/[model_name]/modeling_tf_[model_name].py`, `src/transformers/doc_utils.py` or `src/transformers/file_utils.py`
   - Make sure to run the doc test for the example locally, as described in https://github.com/huggingface/transformers/tree/master/docs#for-python-files
   - Optionally, change the example docstring to a more sensible example that gives a better-suited result
   - Make the test pass
   - Add the file name to https://github.com/huggingface/transformers/blob/master/utils/documentation_tests.txt (making sure the file stays in alphabetical order)
   - Run the doc example test again locally

   In addition, there are a few things we can also improve, for example:
   - Fix some style issues: for example, change ``decoder_input_ids`` to `decoder_input_ids`.
   - Use a small model checkpoint instead of a large one: for example, change "facebook/bart-large" to "facebook/bart-base" (and adjust the expected outputs, if any)
4. Open the PR and tag me @patrickvonplaten @ydshieh or @patil-suraj (don't forget to run `make fixup` before your final commit) 🎊
   - Note that some code is copied across our codebase. If you see a line like `# Copied from transformers.models.bert...`, this means that the code is copied from that source, and our scripts will automatically keep it in sync. If you see that, you should not edit the copied method! Instead, edit the original method it's copied from, and run `make fixup` to synchronize the change across all the copies. Be sure you installed the development dependencies with `pip install -e ".[dev]"`, as described in the contributor guidelines above, so that the code quality tools in `make fixup` can run.
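
Two of the mechanical steps in the guide, the backtick style fix and keeping the list of tested files in alphabetical order, can be sketched as small helpers. This is only an illustration: both function names are hypothetical, plain string sorting is an assumption about what "alphabetical order" means for that file, and the regex only handles simple inline spans that sit on a single line.

```python
import bisect
import re


def fix_double_backticks(text):
    """Rewrite ``name`` (double backticks) as `name` (single backticks)."""
    # Assumption: the span contains no backtick and no newline.
    return re.sub(r"``([^`\n]+)``", r"`\1`", text)


def add_in_sorted_order(entries, new_entry):
    """Return a sorted copy of entries that contains new_entry exactly once."""
    result = sorted(set(entries))
    if new_entry not in result:
        # insort keeps the list sorted while inserting the new entry.
        bisect.insort(result, new_entry)
    return result
```

For example, `fix_double_backticks("Pass ``decoder_input_ids`` here.")` gives ``"Pass `decoder_input_ids` here."``, and `add_in_sorted_order(["b.py", "a.py"], "ab.py")` gives `["a.py", "ab.py", "b.py"]`. Running `make fixup` is still required afterwards, since these helpers do not replace the repository's own checks.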

PyTorch Model Examples added to tests:

Tensorflow Model Examples added to tests:

reichenbch · 2022-03-21T12:35:23Z

@patrickvonplaten I would like to start with Maskformer for Tensorflow/Pytorch. Catch up with how the event goes.

patrickvonplaten · 2022-03-21T13:24:15Z

Awesome! Let me know if you have any questions :-)

KMFODA · 2022-03-21T15:08:18Z

Hello! I'd like to take on Longformer for Tensorflow/Pytorch please.

MarkusSagen · 2022-03-21T20:26:27Z

@patrickvonplaten I would like to start with T5 for pytorch and tensorflow

patrickvonplaten · 2022-03-22T00:12:13Z

Sounds great!

patrickvonplaten · 2022-03-22T00:14:34Z

LayoutLM is also taken as mentioned by a contributor on Discord!

cakiki · 2022-03-22T14:16:15Z

@patrickvonplaten I would take GPT and GPT-J (TensorFlow editions) if those are still available.

I'm guessing GPT is GPT2?

vumichien · 2022-03-22T14:38:19Z

I will take Bert, Albert, and Bigbird for both Tensorflow/Pytorch

johko · 2022-03-22T15:21:33Z

I'll take Swin and ViT for Tensorflow

jmwoloso · 2022-03-22T16:32:05Z

I'd like DistilBERT for both TF and PT please

ydshieh · 2022-03-22T16:44:28Z

> @patrickvonplaten I would take GPT and GPT-J (TensorFlow editions) if those are still available.
>
> I'm guessing GPT is GPT2?

@cakiki You can go for GPT2 (I updated the name in the test)

ArEnSc · 2022-03-23T02:28:18Z

Can I try GPT2 and GPTJ for Pytorch? if @ydshieh you are not doing so?

Aanisha · 2022-03-23T06:56:34Z

I would like to try CLIP for Tensorflow and PyTorch.

NielsRogge · 2022-03-23T07:23:55Z

I'll take CANINE and TAPAS.

ydshieh · 2022-03-23T07:44:07Z

> Can I try GPT2 and GPTJ for Pytorch? if @ydshieh you are not doing so?

@ArEnSc
No, you can work on these 2 models :-) Thank you!

vumichien · 2022-03-23T08:12:03Z

@ydshieh Since MobileBertForSequenceClassification is a copy of BertForSequenceClassification, I think I will also check the doc tests of MobileBert to avoid the error from make fixup.

abdouaziz · 2022-03-23T08:51:02Z

I'll take FlauBERT and CamemBERT.

ydshieh · 2022-03-23T09:14:55Z

@abdouaziz Awesome! Do you plan to work on both PyTorch and TensorFlow versions, or only one of them?

Tegzes · 2022-03-23T10:32:34Z

I would like to work on LUKE model for both TF and PT

NielsRogge · 2022-03-23T10:34:20Z

@Tegzes you're lucky because there's no LUKE in TF ;) the list above actually just duplicates all models, but many models aren't available yet in TF.

Tegzes · 2022-03-23T10:34:35Z

In this case, I will also take DeBERTa and DeBERTa-v2 for PyTorch

abdouaziz · 2022-03-23T11:18:43Z

@ydshieh

I plan to work only with PyTorch

patrickvonplaten · 2022-03-23T12:28:03Z

> @Tegzes you're lucky because there's no LUKE in TF ;) the list above actually just duplicates all models, but many models aren't available yet in TF.

True - sorry I've been lazy at creating this list!

arnaudstiegler · 2022-03-23T13:55:36Z

Happy to work on TrOCR (pytorch and TF)

patrickvonplaten · 2022-03-23T13:58:16Z

I'll take RoBERTa in PT and TF

AbinayaM02 · 2022-03-23T14:24:54Z

I would like to pick up XLM-RoBERTa in PT and TF.

bhadreshpsavani · 2022-03-23T14:40:21Z

I can work on ELECTRA for PT and TF

stevenmanton · 2022-09-20T22:31:13Z

I'll work on perceiver.

RP2025 · 2022-10-04T07:10:28Z

Hello, I would love to contribute to encoder for PT and TF.
Thank you @sgugger

SauravMaheshkar · 2022-10-06T11:26:24Z

I'd like to try Reformer for PyTorch and Tensorflow ☕

soma2000-lang · 2022-10-10T21:03:41Z

I would like to try Data2VecText @patrickvonplaten

ydshieh · 2022-10-11T09:01:50Z

@SauravMaheshkar It looks like PyTorch Reformer is already done, see here

Or do you mean docs/source/en/model_doc/reformer.mdx?

traveler-pinkie · 2022-10-13T03:06:04Z

@patrickvonplaten I would like to work on Marian for TensorFlow please. Thank You

RamitPahwa · 2022-10-13T03:33:02Z

I would like to work on OpenAI for Pytorch and Tensorflow @ydshieh

soma2000-lang · 2022-10-18T09:42:16Z

@ydshieh I am working on clip model

vedikajain2004 · 2023-10-01T12:57:03Z

@patrickvonplaten I'd like to try working on the CamemBERT model for TensorFlow

BarnikRB · 2023-10-02T21:49:08Z

@patrickvonplaten I'd like to work on ImageGPT for pytorch

ydshieh · 2023-10-03T07:35:22Z

The list in this thread is one year old and outdated, as are some of the guidelines. I will have to make some updates.

imsoumya18 · 2023-10-05T03:35:44Z

@patrickvonplaten I want to work on FNet

CodeGovindz · 2023-10-10T10:18:11Z

Could you please assign this issue to me?

asarthaks · 2023-12-15T14:17:45Z

@ydshieh Can I work on the CLIP model for PT, if no one is working on it?

ydshieh · 2023-12-15T14:38:48Z

Hi @asarthaks Thank you for your interest in this.

The list is outdated, and CLIP likely no longer requires the changes.

If you find anything in it that needs an update, go ahead :-)

Epik-Whale463 · 2024-10-01T04:50:15Z

I want to work on RAG. Can you assign it to me?

0xSaurabhx · 2024-10-02T12:13:07Z

I would like to work on the TensorFlow models for CamemBERT and Canine.

b423016 · 2024-10-02T17:26:07Z

I would love to work on ImageGPT. Please assign it to me.

AnyigorTobias · 2024-10-03T18:59:39Z

Hello @ydshieh, I would like to work on distilbert.
Can you assign that to me?

ydshieh · 2024-10-04T10:02:33Z

Hi all. @Epik-Whale463 @0xSaurabhx @b423016 @AnyigorTobias

This sprint is 2 years old and the instructions may no longer be valid. I will try to check the status.

saipavanmeruga · 2024-10-08T14:32:26Z

Hello @ydshieh, I am looking for a good first issue to contribute to. This is a gentle ping to see if the sprint is still active.

patrickvonplaten added the Good First Issue label Mar 21, 2022

patrickvonplaten changed the title from "Doc tests sprint" to "[Community Event] Doc Tests Sprint" Mar 21, 2022

patrickvonplaten pinned this issue Mar 21, 2022

Tegzes mentioned this issue Jul 2, 2022: Added Doctest for Deberta Pytorch #17997 (Closed)

oneraghavan mentioned this issue Jul 21, 2022: Add canine in documentation_tests_file #18225 (Closed)

Tegzes mentioned this issue Aug 12, 2022: Added Docstrings for Deberta and DebertaV2 [PyTorch] #18610 (Merged)

stevenmanton mentioned this issue Sep 20, 2022: Add doctests to Perceiver examples #19129 (Merged)

sgugger added the HACKTOBERFEST-ACCEPTED label Oct 3, 2022

ydshieh mentioned this issue Oct 11, 2022: 🔥[Community Event] Doc Tests Sprint - Configuration files🔥 #19487 (Closed)

traveler-pinkie mentioned this issue Oct 15, 2022: Marian docstring #19634 (Closed)

Tegzes mentioned this issue Nov 18, 2022: Added Luke Doctests #20324 (Closed)

ydshieh removed the HACKTOBERFEST-ACCEPTED label Oct 2, 2024
