Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge .pyi files #926

Merged
merged 9 commits into from
Aug 17, 2021
Merged

Merge .pyi files #926

merged 9 commits into from
Aug 17, 2021

Conversation

chanicpanic
Copy link
Contributor

I started merging the stub files into the code. Here are a couple things I noticed so far:

  • In lexer.py the signatures of some methods of TraditionalLexer and ContextualLexer do not match the signatures in lexer.pyi.
  • lark.pyi and tree.pyi use typing.Literal and typing.Protocol which are new in Python 3.8. The mypy docs suggest using the typing_extensions package, but it may not be installed. How do we want to approach this?

One other question: Would you like me to add type annotations to more functions where possible, or do you only want them on public interfaces?

ast_utils, grammar, and indenter
load_grammar, reconstruct, and visitors
@ThatXliner
Copy link
Contributor

  • lark.pyi and tree.pyi use typing.Literal and typing.Protocol which are new in Python 3.8. The mypy docs suggest using the typing_extensions package, but it may not be installed. How do we want to approach this?

My idea:

I feel like we can do something involving strings and typing.TYPE_CHECKING. Or we can keep those in .pyi files if approach 1 doesn't work.

lark/grammar.py Outdated Show resolved Hide resolved
@erezsh
Copy link
Member

erezsh commented Jun 28, 2021

I'd rather not add dependencies, even if really basic ones. It might affect the standalone parser.

I think we can keep those exceptions in .pyi files for now, until we find a better solution.

lark/grammar.py Outdated Show resolved Hide resolved
lark/grammar.py Outdated Show resolved Hide resolved
@erezsh
Copy link
Member

erezsh commented Jun 28, 2021

As for adding type annotations to internal code, it might make sense in situations where the type isn't obvious. But I prefer if we went through that on a separate PR.

@chanicpanic
Copy link
Contributor Author

As for adding type annotations to internal code, it might make sense in situations where the type isn't obvious. But I prefer if we went through that on a separate PR.

Sounds good. I'll only apply the existing types from the .pyi's in this PR.

exceptions, lark, and tree
@chanicpanic
Copy link
Contributor Author

An idea I had for Literal combining typing_extensions and typing.TYPE_CHECKING:

import sys
from typing import TYPE_CHECKING
if TYPE_CHECKING:
    if sys.version_info >= (3, 8):
        from typing import Literal
    else:
        from typing_extensions import Literal

x: 'Literal["open", "close"]' = "open"

However, the same probably won't work for Protocol because it is a class that is extended.

@erezsh
Copy link
Member

erezsh commented Jun 29, 2021

What if we only support type checking for 3.8 and up? The code will still run with 3.6

@chanicpanic
Copy link
Contributor Author

Type checking need not be limited to 3.8+. mypy lists typing_extensions as a dependency, so the new types should be available on whichever version of python the type checking is being run. The challenge is preventing the interpreter from complaining about undefined names for those who do not use type checking. This case is the reason why the type of x in the example above is provided as a string literal. It prevents python from trying to evaluate it.

Upon further inspection, I don't think PostLex should be a Protocol. Since #859, PostLex is an abstract class and thus provides nominal subtyping. So, I think we can use the approach given above for Literal and fully merge the stub files.

Copy link
Member

@MegaIng MegaIng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments. Not all of them are really directly solvable problems, but some are more a TODOs for a potentially future PR. I wouldn't be opposed if you decided to 'fix' none of these.


if TYPE_CHECKING:
from .lark import PostLex
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can just move the definition of PostLex into common.py. It fits better here anyway.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not lexer.py?

Btw this is an opportunity to change the postlex interface completely, if we wish to.

@@ -23,41 +19,51 @@
except ImportError:
regex = None


###{standalone
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@erezsh This can happen in another PR, but we might want to consider removing type annotations like we are removing docstrings from the standalone parser.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we should, if we can.

keep_all_tokens: bool
tree_class: Any
parser: 'Literal["earley", "lalr", "cyk", "auto"]'
lexer: 'Union[Literal["auto", "standard", "contextual", "dynamic", "dynamic_complete"], Type[Lexer]]'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can instead define types inside a version guarded if-statement that when Literal is not available is just Union[str, Type[Lexer]] (and similar for all other types here). Not sure if it is better thou.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer the current way. It provides the precision of Literal even when type checking on 3.6 and 3.7.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does the current implementation do when typing_extensions is not installed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current implementation executes normally because the import of typing_extensions is wrapped in an if TYPE_CHECKING. I believe it is reasonable to assume that typing_extensions will be installed when TYPE_CHECKING is true because typing_extensions is a dependency of mypy.

@@ -9,12 +10,23 @@
###{standalone
from copy import copy

from types import ModuleType
from typing import (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to have duplicates imports inside of ###{standalone

raw = None
type = None
if TYPE_CHECKING:
from .common import LexerConf
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably a TODO for later, but I think we can instead of guarding the circle with TYPE_CHECKING move more stuff into common.py. That is why it exists. ( In this case, the definition of TerminalDef/the Patterns/Token)

lark/lexer.py Outdated
"""Lexer interface

Method Signatures:
lex(self, text) -> Iterator[Token]
"""
lex = NotImplemented
lex: Callable[..., Iterator[Token]] = NotImplemented
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are inheriting from ABC, this should just be a normal abstractmethod.

lark/lexer.py Outdated
@@ -452,7 +494,7 @@ def __init__(self, conf, states, always_accept=()):
def make_lexer_state(self, text):
return self.root_lexer.make_lexer_state(text)

def lex(self, lexer_state, parser_state):
def lex(self, lexer_state: Any, parser_state: Any) -> Iterator[Token]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another TODO for potentially later: corrects these annotations

@@ -1072,7 +1082,7 @@ def _unpack_definition(self, tree, mangle):
return name, exp, params, opts


def load_grammar(self, grammar_text: str, grammar_name: str="<?>", mangle: Callable[[str], str]=None) -> None:
def load_grammar(self, grammar_text: str, grammar_name: str="<?>", mangle: Optional[Callable[[str], str]]=None) -> None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I am wrong here, but isn't Optional automatically added when =None is the default?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, I didn't know about that feature. However, it looks like its use is no longer recommended. https://stackoverflow.com/questions/62732402/can-i-omit-optional-if-i-set-default-to-none

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha. I didn't see that it was deprecated. Good to know.

@chanicpanic chanicpanic marked this pull request as ready for review June 30, 2021 20:01
lark/common.py Outdated
postlex: 'PostLex' = None
callbacks: Optional[Dict[str, _Callback]] = None
postlex: 'Optional[PostLex]' = None
callbacks: Dict[str, _Callback] = {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need to provide values when annotation on classes?

lark/lexer.py Outdated
raw: str = None
type: str = None
raw: Optional[str] = None
type: Optional[str] = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not optional. It is also more of a ClassVar

@erezsh erezsh merged commit 3e2a61a into lark-parser:v1.0 Aug 17, 2021
@erezsh
Copy link
Member

erezsh commented Aug 17, 2021

@chanicpanic Thanks a lot! I'm sure it was hard work. Sorry it took me so long to review.

One question, is __init__(...) -> None really necessary? It seems meaningless, and it can't really be anything else.

@chanicpanic
Copy link
Contributor Author

@erezsh No problem. I'm glad I can help.

Annotating the return type of __init__ is optional as long as at least one of its arguments is annotated. The relevant documentation is at https://mypy.readthedocs.io/en/stable/class_basics.html#annotating-init-methods

@erezsh
Copy link
Member

erezsh commented Aug 17, 2021

I see, thanks for explaining. It's a little surprising, if I already run mypy, why wouldn't I want to scan all the functions by default?

@chanicpanic
Copy link
Contributor Author

I believe mypy behaves this way to make it easier for existing projects to adopt type annotations little by little. Annotations can be applied to the most important or straightforward functions first without receiving a bunch of nasty errors for those that have not been annotated. Also, some functions/variables may be so dynamic that it is hardly worth the trouble to try to declare precise types for them. Thus, you can opt to omit annotations and mypy won't complain.

michael-k added a commit to michael-k/lark that referenced this pull request Sep 23, 2021
The package was removed in PR lark-parser#926.  When trying to
install lark upstream, pip fails:

> error: package directory 'lark-stubs' does not exist
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants