Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generate types with better support for nested shapes #146

Closed
wants to merge 4 commits into from

Conversation

baioc
Copy link
Contributor

@baioc baioc commented Jul 22, 2021

Solves #143 by:

  • Flattening nested types generated in the same file/module:
    This way, equal types used in different places only need to be generated once (and the typechecker knows they're the same).

  • Adding type aliases to every single field and type parameter:
    Since type aliases need to be in the toplevel for the typechecker to see them, we need a hierarchical naming scheme.
    The current scheme is not perfect: there could be conflicts if consumers use field names with double underscores.
    Also, accessing the types that parameterize a collection is a bit awkward, since we lack names and resort to numeric indexes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jul 22, 2021
@baioc
Copy link
Contributor Author

baioc commented Jul 22, 2021

In this first draft, we can use a constructor with a not-that-mangled name.

inner_t = outer_t._nested_t

inner = inner_t(my_field=3)
outer: outer_t = outer_t(nested=inner)

Using that as a type annotation still doesn't work, since we're accessing it from a nested class and that is a known limitation of Pyre.

Another problem is that the additional field which contains the first-class type is added for every field in generated shapes.
This is unnecessary in most cases all cases, since we could use the __annotations__ meta-attribute for that same information.

Using __annotations__ won't help with the annotations, however, as the static type checker requires our inner class definition to be in the global namespace. The solution appears to be changing codegen so as to "flatten" classes in the same generated module.
This is what I'll work on next. Let me know if I'm missing anything.

@vmagro
Copy link
Contributor

vmagro commented Jul 26, 2021

since we're accessing it from a nested class and that is a known limitation of Pyre.

I'm curious, is it because it's being generated with a metaclass? Would it work if we did the codgen for the types inner class in shape.bzl directly? I only implemented it in a metaclass because it was a lot faster (to write) at the time.

Flattening the generated classes all into the module seems to be a good solution (/ the only solution). If we do that can they still be referenced by the nested names, or do they all have to get readable names (set in shape.bzl)?

@baioc
Copy link
Contributor Author

baioc commented Jul 27, 2021

I'm curious, is it because it's being generated with a metaclass?

AFAIK, Pyre simply doesn't check for types aliases that are not in the global namespace.
That said, using the mangled name does work, even if it is in a nested class.

Flattening the generated classes all into the module seems to be a good solution (/ the only solution). If we do that can they still be referenced by the nested names, or do they all have to get readable names (set in shape.bzl)?

Actually, it seems we don't need to flatten the classes themselves if we provide aliases in the global namespace.
Therefore we can keep the "real" type with a mangled name, but add readable aliases in the toplevel of the generated code:

@vmagro
Copy link
Contributor

vmagro commented Jul 27, 2021

Therefore we can keep the "real" type with a mangled name, but add readable aliases in the toplevel of the generated code

That sounds fine, how are you thinking of providing the readable names? Just the field name?

Dealing with generic types (collections) is still a bit awkward, but I don't see other alternatives.
Also updated codegen tests.
@baioc
Copy link
Contributor Author

baioc commented Jul 27, 2021

That sounds fine, how are you thinking of providing the readable names? Just the field name?

As of the latest commit, and since they have to be in the toplevel, the naming scheme for type aliases is:

typeof_<field name>[__<inner field> ... [__<innermost field>]]
      ^--- single underscore             ^--- double underscore

So it is still possible to have conflicting aliases if you're using weird field names with double underscores.

It gets a bit weird with collection types: since there is no field name we can associate to them, we use 0-based indexes.
But this is needed because otherwise there would be no way to annotate those types. So having an outer shape like this:

deep_t = shape.dict(str, shape.list(shape.shape(very_deep=bool)))
inner_t = shape.shape(
    first = deep_t,
    second = deep_t,
)
outer_t = shape.shape(
    nested = inner_t,
)

We can have annotations for the types of every single field, of every single nested shape:
(it would be possible to only generate aliases for non-primitive fields, but I think that would be more awkward)

import outer  # assuming we generated the `outer_t` shape in `outer.py`

outer_t = outer.outer_t
inner_t = outer.typeof_nested
assert outer.typeof_nested__first == outer.typeof_nested__second  # they should be the same
deep_t = outer.typeof_nested__first
deepest_t = outer.typeof_nested__first__1__0  # 1 to get the dict's value type, 0 for the type of the list

deeper: deepest_t = deepest_t(very_deep=True)
deep: deep_t = {'key': (deeper, deeper, deeper)}
inner: inner_t = inner_t(first=deep, second=deep)
outer: outer_t = outer_t(nested=inner)

This runs and typechecks.

@baioc baioc marked this pull request as ready for review July 27, 2021 20:51
@vmagro
Copy link
Contributor

vmagro commented Jul 28, 2021

Seems like you're on the right track, but this is breaking the build now.
Looking at the CI run it seems like buck build //images/appliance:stable-build-appliance is a working repro

@baioc
Copy link
Contributor Author

baioc commented Jul 28, 2021

Oh, I didn't realize python_data needed to be changed as well. Should be fixed now.
Speaking of which, it looks like _python_data does pretty much the same thing as _loader with a bit of extra stuff, we should probably refactor this.

Copy link
Contributor

@vmagro vmagro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this seems pretty reasonable - I'm not in love with the generated code, but digging into it should only be necessary in very narrow circumstances and other than that the public api is the same

I just left one comment of something that I think can be improved, but I'll import and land the PR when that is addressed :)

antlir/bzl/shape.bzl Outdated Show resolved Hide resolved
@facebook-github-bot
Copy link
Contributor

@vmagro has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@vmagro
Copy link
Contributor

vmagro commented Dec 10, 2021

c1e4f75 changes the way codegen works to accomplish this same result without the associated difficulty in starlark code

@vmagro vmagro closed this Dec 10, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants