Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Re-evaluate the parser/generator wrt multiple write-outs. #236

Open
dantswain opened this issue Feb 11, 2017 · 1 comment
Open

RFC: Re-evaluate the parser/generator wrt multiple write-outs. #236

dantswain opened this issue Feb 11, 2017 · 1 comment
Assignees
Milestone

Comments

@dantswain
Copy link
Collaborator

This is a follow-up to #227 (fixed by #230). The current behavior of the parser/generator is to iterate over the list of input files and do a full parse/generate for each. This leads to a fair amount of double-processing and at times writing out generated code more than once. For the most part this isn't a huge problem because the generated code is consistent, but as we saw with Constants it can lead to modules being clobbered.

On one hand, it seems like the right thing to do might be to iterate over the list of input files (and any files that those include) and parse out one unified schema from which to generate code. This should be the most consistent way to generate output. It seems like it would also require fairly significant changes to the code.

On the other hand, I believe the way we are doing it now is close(r) to the way that most of the official thrift generators work: generally one executes thrift -o whateverlanguage path/to/some.thrift which parsers a single input file and generates code from that file. I'm not 100% sure whether that will generate structs/enums/etc from files that are included from the original file. These parsers don't seem to really go out of their way to make sure that developers don't clobber their own generated code due to name collisions. I'm also not sure that they go out of their way to make sure that any code you generate will be complete from a name resolution standpoint (i.e., if you reference Foo, it may not always care if you've processed the file that generates Foo).

An intermediate solution would be to only generate code for the "initial file" in the file group. I.e., if file x.thrift does include "y.thrift" which defines struct Y, then we parse y.thrift to resolve references but do not generate Y except when specifically processing y.thrift.

I guess a first step here would be to do some research on how the existing parsers work, especially wrt how they handle included files - I'm kind of shooting from the cuff based on past experience but I'm not 100% confident.

@jparise
Copy link
Collaborator

jparise commented Aug 6, 2018

I think this is the right direction for the parser and generation. I'll own this for the time being, but I don't expect to work on it until at least the 2.1 milestone.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants