Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

An alternative to typing.ClassVar? #61

Closed
ericvsmith opened this issue Nov 6, 2017 · 12 comments
Closed

An alternative to typing.ClassVar? #61

ericvsmith opened this issue Nov 6, 2017 · 12 comments

Comments

@ericvsmith
Copy link
Owner

In issue #14 we added support for typing.ClassVar annotations. If an annotation is a ClassVar, then it's considered to not be a field (that is, it's not set on instances of the dataclass, it's not in __init__, etc.).

There's ongoing discussion on python-ideas and python-dev about dropping the typing module from the stdlib.

I'm wondering what we should do about ClassVar if typing is in fact dropped from the stdlib.

Currently, the code doesn't import typing. It just looks for "typing" in sys.modules, and if that's present, assumes it's the typing module and looks inside of it for ClassVar. I think this is a good approach. However, if typing is no longer part of the stdlib, I guess it's possible for another module named typing to be used in its place, and then I need to be more defensive about looking inside sys.modules['typing']. Is that case worth worrying about? I sort of think it's not, although it would be easy enough to add a getattr(typing, 'ClassVar') to the code.

The other thing to worry about is: what if typing is removed, but something in the stdlib wants to have a dataclass with a ClassVar? In https://mail.python.org/pipermail/python-dev/2017-November/150176.html, @ncoghlan suggested having dataclasses create its own ClassVar. Another option that's just as good, although the syntax is somewhat worse, is to add a param to field() that says "this isn't really a field". Something like:

@dataclass
class C:
    x: int
    classvar: field(int, not_a_field=True)

In either event, mypy would need to know about it, to know that __init__ for the class would only have one parameter, x.

If we go with any of these approaches, I think we should still keep the code in dataclasses that understands real typing.ClassVar fields. That seems like the most natural way to write code, outside of the stdlib.

@taleinat
Copy link
Contributor

taleinat commented Nov 6, 2017

+1 for dataclasses having its own ClassVar equivalent as a fallback for when typing isn't available, if the decision is made to remove the typing module and it is deprecated.

-1 for field(int, not_a_field=True) or packing this into field in any other way. A class attr is arguably not a field. I think this would be inelegant and confusing, not only the syntax but e.g. not_a_field=True being incompatible with various other possible values of field() parameters.

I can't figure out why one would avoid just importing the typing module ... ?

If someone has a different module importable as typing, I think it is reasonable for that to cause problems, just like with any other module.

@ericvsmith
Copy link
Owner Author

My concern about importing typing is specifically for using dataclasses in the stdlib, where importing it might not be a possibility if typing is removed from the stdlib.

I agree that we can just continue using typing like we do now, without importing it.

I also agree that adding param to field() isn't very appealing, but it came to mind as an alternative. Basically, there must be something about a field that gives the info that its a ClassVar (that is, it's not a field), and the only options are it's annotation (type) and it's default value (as a plain value, or as a field()).

It seems like using its annotation is the right way to go, so the question is: other than using typing.ClassVar, should there be another way of specifying that a member variable isn't a field? If typing stays in the stdlib, then the answer should be "no". But if it is removed, I'm not so sure.

Maybe if typing is removed from the stdlib, then the answer is just: you can't have ClassVars in a dataclass that is defined in the stdlib. That doesn't seem like such a problem to me.

@ncoghlan
Copy link

ncoghlan commented Nov 7, 2017

Aside from the standard library usage case, the other main reason for not wanting to import the typing module is because it takes a long time to import (around 10 ms on my machine). That's a fairly high price to pay just to get access to a spelling that could be handled via a simple class instance, without all of the typing modules metaclass gymnastics.

If you combine it with the is_class_var singledispatch function, and move the typing.ClassVar registration into the typing module, then I think you'd be establishing a good precedent whereby:

  • dataclasses had aliases for stable, well-defined typing concepts that static analysers understood, but did not interact with the runtime type machinery (since they'd produce ordinary instances, not runtime classes)
  • typing paid the runtime price for trying to make objects that "did the right thing" when interacting with the runtime type machinery.

In essense, where we now have typing and typing_extensions, we'd move to a situation where we instead have dataclasses in the standard library and typing as a bundled PyPI module. There'd be a very strict filter on things "graduating" from typing into dataclasses, but there'd also be a well-defined subset of the type annotations where the semantics were considered completely stable (with ClassVar, and potentially Any, being the only initial members of that set).

@taleinat
Copy link
Contributor

taleinat commented Nov 7, 2017

@ncoghlan What is is_class_var? I can't find references to it with a simple search.

@taleinat
Copy link
Contributor

taleinat commented Nov 7, 2017

One of the benefits of being in the stdlib is that we only need to support one version of Python. In 3.7 typing.ClassVar will exist and IMO we should just use it. If typing is removed we can just copy or replace ClassVar in dataclasses, without anything else from the typing module.

For forward-compatibility, dataclasses might want to do from typing import ClassVar (or something equivalent), include ClassVar in its __all__, and suggest (in documentation and examples) using ClassVar imported from dataclasses rather than directly from typing.

@ncoghlan
Copy link

ncoghlan commented Nov 7, 2017

@taleinat typing.ClassVar may not exist in 3.7 (without installing typing from PyPI) - we're currently discussing taking typing out of the standard library because the rate-of-change is still too high.

The is_class_var suggestion comes from my python-dev email that prompted this issue, which is that I think dataclasses should be entirely unaware of the typing module and instead do the following:

class _ClassVar:
    def __init__(self, annotation):
        self.annotation = annotation

class _MakeClassVar:
    def __getitem__(self, key):
        return _ClassVar(key)

ClassVar = _MakeClassVar()

@functools.singledispatch
def is_class_var(annotation):
    return isinstance(annotation, _ClassVar)

Then the typing module can take care of registering its own overload for dataclasses.is_class_var that tells it to treat typing.ClassVar as indicating a class variable as well.

@taleinat
Copy link
Contributor

taleinat commented Nov 7, 2017

@ncoghlan, thanks for the info, I hadn't known that typing was considered for removal so soon. I definitely should have read your email! I like what you suggest, +1.

hynek added a commit to python-attrs/attrs that referenced this issue Nov 7, 2017
@ambv
Copy link

ambv commented Nov 7, 2017

Removal of typing from the standard library is unlikely but we'll see. But the import time is an issue on its own.

As for "dataclasses should be unaware of typing", that sounds reasonable if ClassVar is meant to be used by data classes users anyway.

  1. It's rather simple for typeshed to include the alias from dataclasses.ClassVar to typing.ClassVar so they would be equivalent for typing purposes.
  2. And with the singledispatch approach, typing.ClassVar would register to respond to dataclasses.is_class_var(). I don't like that is_class_var() makes dataclasses.ClassVar a ghetto ABC that requires use of a custom function instead of regular isinstance() and issubclass().

Also note that while this makes importing dataclasses faster, it further slows down importing typing which now will have to attempt to import dataclasses and set up the singledispatch.

@gvanrossum
Copy link

From the OP:

However, if typing is no longer part of the stdlib, I guess it's possible for another module named typing to be used in its place, and then I need to be more defensive about looking inside sys.modules['typing']. Is that case worth worrying about? I sort of think it's not, although it would be easy enough to add a getattr(typing, 'ClassVar') to the code.

If we remove typing from the stdlib (which is less and less likely given the complications and worries) I would put that bit of defensive coding in, just so that if someone does create their own typing module and it doesn't have ClassVar their use of dataclasses (without using ClassVar obviously) will not fail mysteriously. Especially since this is only one tiny bit of code. (I'd add a comment too.) If typing stays in the stdlib I wouldn't worry (in general we almost never worry about users sabotaging the stdlib).

The other thing to worry about is: what if typing is removed, but something in the stdlib wants to have a dataclass with a ClassVar?

I don't like any of the schemes proposed for this in this thread. They all seem way too harebrained, and would require mypy to jump through extra hoops. If typing is removed, then stdlib modules had better not try to use ClassVar. (In fact stdlib modules had better be conservative with using dataclasses for one full release -- dataclasses will be approved provisionally (but without any of the nonsense about emitting FutureWarning every time it's imported).

@ericvsmith
Copy link
Owner Author

I agree that all of the proposed solutions seem sketchy. Why would it be dataclasses that own the singledispatch registry for is_class_var(). I think all of the proposals fall in to this category, of elevating dataclasses's "stature" just because it happens to want a feature that's currently provided by typing. If typing is removed from the stdlib, I'd recommend to just drop the ClassVar feature from dataclasses. You can achieve basically the same thing with a global variable.

@gvanrossum
Copy link

Sounds good. You can close this issue.

@ncoghlan
Copy link

ncoghlan commented Nov 7, 2017

@ericvsmith's suggested approach sounds reasonable to me, too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants