Rework how PreTrainedModel.from_pretrained handles its arguments #866
Unification of the `from_pretrained` functions belonging to various modules (GPT2PreTrainedModel, OpenAIGPTPreTrainedModel, BertPreTrainedModel) brought changes to the function's argument handling which don't cause any issues within the repository itself (afaik), but have the potential to break a variety of downstream code (e.g. my own).

In the last release of pytorch_transformers (v0.6.2), the `from_pretrained` functions took in `*args` and `**kwargs` and passed them directly to the relevant model's constructor (perhaps with some processing along the way). For a typical example, see `from_pretrained`'s signature in `modeling.py` here: https:/huggingface/pytorch-transformers/blob/b832d5bb8a6dfc5965015b828e577677eace601e/pytorch_pretrained_bert/modeling.py#L526 and the relevant usage of said arguments (after some small modifications): https:/huggingface/pytorch-transformers/blob/b832d5bb8a6dfc5965015b828e577677eace601e/pytorch_pretrained_bert/modeling.py#L600
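To make the old forwarding behavior concrete, here is a minimal sketch. `ToyModel`, its `useful_argument` parameter and the dictionary standing in for a loaded configuration are all hypothetical; this is not the library's code, only an illustration of extra arguments being handed straight to the constructor.

```python
class ToyModel:
    """Hypothetical stand-in for a model class; not part of the library."""

    def __init__(self, config, useful_argument=None):
        self.config = config
        self.useful_argument = useful_argument

    @classmethod
    def from_pretrained(cls, name, *args, **kwargs):
        # Placeholder for loading a real configuration from `name`.
        config = {"name": name}
        # Old behavior: every extra argument goes straight to the constructor.
        return cls(config, *args, **kwargs)


model = ToyModel.from_pretrained("toy-checkpoint", useful_argument=3)
print(model.useful_argument)  # 3
```

Downstream subclasses that added their own constructor parameters relied on exactly this pass-through.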
In the latest release, the function's signature remains unchanged but the `*args` and most of the `**kwargs` parameters, in particular pretty much anything not explicitly accessed in [1] https:/huggingface/pytorch-transformers/blob/b33a385091de604afb566155ec03329b84c96926/pytorch_transformers/modeling_utils.py#L354-L358, are ignored. If a key of `kwargs` is shared with the relevant model's configuration file then its value is still used to override said key (see the relevant logic here), but the current architecture breaks, for example, the following pattern which was previously possible.

What's more, if these arguments have default values declared in `__init__` then the entire pattern is broken silently, because these default values will never be overwritten via pretrained instantiation. Thus end users might continue running experiments passing different values of `useful_argument` to `from_pretrained`, unaware that nothing is actually being changed.

As evidenced by issue #833, I'm not the only one whose code was broken. This commit implements behavior which is a compromise between the old and new behaviors. From my docstring:
It would actually be ideal to avoid mixing configuration and model parameters entirely (via some sort of `model_args` parameter, for example); however, this fix has the advantages of:
- staying compatible with the `pytorch-pretrained-bert` era `from_pretrained`
- keeping the `**kwargs` parameter introduced with `pytorch-transformers`
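The silent breakage and one possible reading of the compromise can be sketched as follows. Everything here is hypothetical: `ToyConfig`, `ToyModel`, `useful_argument` and both loader functions are made up for illustration, and the split rule (keys matching a configuration attribute override it, all other keys go to `__init__`) is an assumption about the intended behavior rather than a copy of the actual patch.

```python
class ToyConfig:
    """Hypothetical configuration object; the attribute name is made up."""

    def __init__(self, hidden_size=8):
        self.hidden_size = hidden_size


class ToyModel:
    """Hypothetical model with one constructor argument that has a default."""

    def __init__(self, config, useful_argument=0):
        self.config = config
        self.useful_argument = useful_argument


def from_pretrained_ignoring(**kwargs):
    """Mimics the breakage: config keys are applied, the rest is dropped."""
    config = ToyConfig()
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)
    # Non-config kwargs never reach __init__, so its defaults win silently.
    return ToyModel(config)


def from_pretrained_compromise(**kwargs):
    """Assumed compromise: config keys override, the rest goes to __init__."""
    config = ToyConfig()
    model_kwargs = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)
        else:
            model_kwargs[key] = value
    return ToyModel(config, **model_kwargs)


broken = from_pretrained_ignoring(hidden_size=16, useful_argument=5)
fixed = from_pretrained_compromise(hidden_size=16, useful_argument=5)
print(broken.config.hidden_size, broken.useful_argument)  # 16 0
print(fixed.config.hidden_size, fixed.useful_argument)    # 16 5
```

Note that the first loader raises no error: the caller passes `useful_argument=5` and gets a model built with the default, which is the failure mode described above.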
I have also included various other (smaller) changes in this pull request:
- Making `PreTrainedModel.__init__` not accept `*args` and `**kwargs` parameters which it has no use for and currently ignores. Apparently necessary for the tests to pass :(
- Stop using the "popping from kwargs" antipattern (see [1]). Keyword arguments with default values achieve the same thing more quickly, and are strictly more informative since linters/autodoc modules can actually make use of them. I've replaced all instances that I could find; if this pattern exists elsewhere it should be removed. Oops: turns out this is a Python 2 compatibility thing. With that said, is there really a need to continue supporting Python 2? Especially with its EOL coming up in just a few months, and especially when it necessitates such ugly code...
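For readers unfamiliar with the two styles, here is a small side-by-side sketch. The function names and the `cache_dir` / `from_tf` options are illustrative placeholders, not a claim about the library's exact signature.

```python
def load_with_pop(name, *args, **kwargs):
    # Works on Python 2 and 3, but the accepted options are invisible in
    # the signature, so linters and autodoc tools cannot discover them.
    cache_dir = kwargs.pop("cache_dir", None)
    from_tf = kwargs.pop("from_tf", False)
    return name, cache_dir, from_tf, args, kwargs


def load_with_keywords(name, *args, cache_dir=None, from_tf=False, **kwargs):
    # Python 3 only: keyword-only parameters placed after *args are a
    # syntax error on Python 2, which is why the popping style persists.
    return name, cache_dir, from_tf, args, kwargs


popped = load_with_pop("m", 1, cache_dir="/tmp", extra=2)
declared = load_with_keywords("m", 1, cache_dir="/tmp", extra=2)
print(popped == declared)  # True
```

Both functions behave the same for callers; the difference is only in what the signature documents and which Python versions can parse it.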