Add Doc Test GPT-2 #16439
Conversation
The documentation is not available anymore as the PR was closed or merged.
I think this is what is required. Is something up with CI failing the code quality check?

Yes, you need to run `make fixup`. Pinging @ydshieh on this PR since Patrick is on vacation this week.
Hi @ArEnSc, thank you for this PR! In order to run the doc tests, there is a guide to follow if you haven't done this before.
Hi @ArEnSc, for this sprint you don't need to test the model, just the docstrings in the model files. You can see a guide here for Python files. Before you run, you need to `pip install -e ".[dev]"`. Let me know if this works for you.
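If it helps, here is a rough sketch of what "testing the docstrings" in a module means mechanically, using only the standard-library doctest runner. The sprint guide uses pytest with doctest flags, so treat this as an illustration rather than the prescribed command; note that running the examples will download the checkpoints they reference.

```python
# Minimal sketch, assuming transformers is installed via `pip install -e ".[dev]"`.
# This collects and runs the examples embedded in the module's docstrings.
import doctest

import transformers.models.gpt2.modeling_gpt2 as gpt2_modeling

# Running the examples will download the checkpoints they reference.
results = doctest.testmod(gpt2_modeling, verbose=False)
print(f"{results.attempted} examples attempted, {results.failed} failed")
```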
@@ -61,7 +61,7 @@
 logger = logging.get_logger(__name__)

-_CHECKPOINT_FOR_DOC = "gpt2"
+_CHECKPOINT_FOR_DOC = "distilgpt2"
This should not be changed. "gpt2" is the official checkpoint for the GPT2 model, and it is used in the docstring examples for GPT2Model and GPT2LMHeadModel.
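For context (an illustration, not the PR's code): the checkpoint constant is interpolated into the docstring examples of the model classes, which is why changing it silently changes every generated example. Below is a minimal self-contained sketch of that pattern with a made-up helper name; the real transformers docstring helpers and their signatures differ.

```python
# Sketch only: mimics how a module-level checkpoint constant feeds the
# generated docstring examples. `add_example_docstring` is a made-up helper.
_CHECKPOINT_FOR_DOC = "gpt2"

_EXAMPLE_TEMPLATE = """
    Example:

    >>> from transformers import AutoTokenizer, GPT2Model
    >>> tokenizer = AutoTokenizer.from_pretrained("{checkpoint}")
    >>> model = GPT2Model.from_pretrained("{checkpoint}")
"""


def add_example_docstring(checkpoint):
    """Append a checkpoint-specific usage example to a class docstring."""

    def decorator(cls):
        cls.__doc__ = (cls.__doc__ or "") + _EXAMPLE_TEMPLATE.format(
            checkpoint=checkpoint
        )
        return cls

    return decorator


@add_example_docstring(_CHECKPOINT_FOR_DOC)
class GPT2Model:
    """Stub standing in for the real model class."""


print(GPT2Model.__doc__)
```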
Okay, got it, I will change it back. I had thought they wanted us to use a lower-resource version, as per this instruction:

"Using a small model checkpoint instead of a large one: for example, change "facebook/bart-large" to "facebook/bart-base" (and adjust the expected outputs if any)"
expected_output=[
    "LABEL_0",
    "LABEL_0",
    "LABEL_0",
    "LABEL_0",
    "LABEL_0",
    "LABEL_0",
    "LABEL_0",
    "LABEL_0",
    "LABEL_0",
    "LABEL_0",
    "LABEL_0",
    "LABEL_0",
],
This is not a helpful example at all. Also, this model seems to be a sequence classification model, according to its model card. The loss of 0.0 is especially weird.
Indeed, it is a text (sequence) classification model.

@ArEnSc Could you try the following GPT2 token classification model?
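Presumably, trying such a checkpoint looks something like the sketch below. The model name is a placeholder for the one linked in the comment (the link did not survive here), so substitute it before running.

```python
from transformers import pipeline

# Placeholder: substitute the GPT2 token classification checkpoint
# suggested above; this string is not a real model id.
token_classifier = pipeline(
    "token-classification",
    model="<gpt2-token-classification-checkpoint>",
)

# Each returned dict carries a predicted label per token.
print(token_classifier("Writing a good essay starts with a strong lead."))
```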
Ok, I will take a look at this and give it a shot.
Yes, I did run the required commands, specifically `make fixup`.

I am unsure how to stop CI from running the "Add new model like" runner, I suppose, as that error came from CI.
For now, you can ignore the errors on build_pr_documentation and the Add new model like template tests from the CI.
Then used a token classification model instead of a sequence classification model for the example.
Great job & well done :-)
Thank you, @ArEnSc
"Lead", | ||
"Lead", | ||
"Lead", | ||
], |
Hi @sgugger,

Patrick delegated the responsibility to me. I am still wondering if you have any extra comments on this PR, though.

There are only 2 checkpoints on the Hub for GPT2 + token classification. This one is trained on a writing-document-evaluation dataset, so the output for this example is not really meaningful. However, I am in favor of merging it as is.
@sgugger just following up on this: are we going to move forward on this and close the issue? Thanks =)
Sorry, I missed that my input was required here. Thanks for the ping @ArEnSc! Fine by me, but we should deactivate formatting for that one line to avoid wasting all that vertical space (and get the list back on one line). You can do so with a `# fmt: off` comment before and `# fmt: on` after.
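Concretely, the fix looks like the sketch below. It is shown as a standalone assignment so it runs on its own, whereas in the PR the list is the `expected_output=` argument.

```python
# black leaves code between these markers untouched, so the list stays
# on a single line instead of being exploded to one element per line.
# fmt: off
expected_output = ["Lead", "Lead", "Lead", "Lead"]
# fmt: on
print(expected_output)
```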
@sgugger will do!
LGTM! Thanks for your PR!
"Lead", | ||
"Lead", | ||
"Lead", | ||
], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I missed my input was required here. Thanks for the pin @ArEnSc ! Fine by me but we should deactivate formatting for that one line to avoid wasting all that vertical space (and have the list back on one line). You can do so with a comment
# fmt: off
before and
# fmt: on
after.
Hi @ArEnSc, ping me for the merge once you finish the `# fmt: off` change.
@ydshieh this one is good to go now! =)
Not just a hope, the dream comes true now :-) Thank you again for the contribution. Merged!
* First Pass All Tests Pass
* WIP
* Adding file to documentation tests
* Change the base model for the example in the doc test.
* Fix Code Styling by running make fixup
* Called Style
* Reverted to gpt2 model rather than distill gpt2. Then used a token classification model over a sequence model for an example.
* Fix Styling Issue
* Hopefully ignores the formatting issue.

Co-authored-by: ArEnSc <[email protected]>
What does this PR do?
Fixes the broken doc tests for GPT-2
Part of the documentation sprint work.
Fixes GitHub issue #16292.
Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
gpt2: @patrickvonplaten, @LysandreJik
Documentation: @sgugger