
Fragments are not reused in query plans creating large request sizes to subgraphs #1894

Closed
smyrick opened this issue May 26, 2022 · 3 comments · Fixed by #1911

smyrick (Member) commented May 26, 2022

As part of generating a query plan, fragment selections are not reused; they are expanded, so every location where a fragment spread previously appeared now contains all of the fields from that fragment.

This means that if you have repeated and/or nested fragments, the operations clients send to the gateway can stay a reasonable size thanks to fragment reuse, but the generated query plans and the operations sent to subgraphs are multiplied in size. This can lead to the query plan cache using far more memory, and to subgraphs having to parse, validate, and cache operations that are much larger than what the client specified.

Example

Client sends an operation that reuses a fragment multiple times. If the fragment FooSelect is really long, say 100 lines, that selection is redeclared at every spread, so the query plan would be ~n*100 lines in size.

query ExampleQuery {
  a {
    ...on A1 {
      ...FooSelect
    }
    ...on A2 {
      ...FooSelect
    }
    ...on A3 {
      ...FooSelect
    }
    ...on A4 {
      ...FooSelect
    }
    ...on A5 {
      ...FooSelect
    }
  }
}
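
For reference, here is a sketch of what the expansion does. Assume a hypothetical FooSelect fragment (the fields below are made up for illustration; imagine ~100 lines of them):

# Hypothetical: Foo stands for an interface implemented by A1..A5.
fragment FooSelect on Foo {
  id
  name
}

The query planner inlines the fragment body at all five spread sites instead of declaring it once, so the generated subgraph fetch looks roughly like:

query ExampleQuery {
  a {
    ...on A1 {
      id
      name
    }
    ...on A2 {
      id
      name
    }
    ...on A3 {
      id
      name
    }
    ...on A4 {
      id
      name
    }
    ...on A5 {
      id
      name
    }
  }
}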

You can view a reproduction and its runtime behavior here: https://stackblitz.com/edit/apolloquery-plan-does-not-reuse-fragments?file=README.md


This is especially noticeable when an operation only makes a single fetch to one subgraph: the client sends its operation to the gateway, but the subgraph then sees a much larger and different operation, when in theory the gateway could just forward the operation as-is. That enhancement is separate from this issue and is discussed in #660.

The harder problem is going to be how to create new fragment selections to send to subgraphs when fragments span different schemas; the query planner would have to take some time to create new fragments. It could even do this for operations that don't declare fragments to begin with, adding them in the query plan to reduce the size of the requests made to subgraphs (see the sketch below). Because this would change the operation, we might consider making it an opt-in feature, or something users can trigger based on certain parameters of an operation.
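
For example (a hypothetical sketch, not actual planner output; the types and fields are invented), given a client operation that declares no fragments but repeats the same selection:

query ProductsExample {
  topProducts {
    name
    price
  }
  featuredProducts {
    name
    price
  }
}

the planner could synthesize a fragment for the repeated selection in its subgraph fetch:

# Hypothetical auto-fragmentized subgraph fetch; the fragment name is invented here.
query ProductsExample {
  topProducts {
    ...ProductFields
  }
  featuredProducts {
    ...ProductFields
  }
}

fragment ProductFields on Product {
  name
  price
}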


smyrick (Member, Author) commented May 26, 2022

There was a feature in gateway/query-planner 0.x called experimental_autoFragmentization which went some way toward reducing this issue: apollographql/apollo-server#3791

This was removed, though, as part of gateway 2.0 and the move of the query planner out to its own package: 449e53f#diff-1a556fe2fd6a545a37d601c8185534fec07e931e2599b32d71c767cd149df20eL1074-L1078


pcmanus (Contributor) commented Jun 9, 2022

I originally added code in fed2 so that user fragments could be reused in subgraph fetches when it makes sense, but I never tested it properly, and it unfortunately isn't triggering at the moment (there is an existing unit test, but it doesn't exercise the feature end-to-end and so missed the issue). I created #1911 to fix that (and in all fairness, even once the code triggers, there were a few other issues I had to fix to make it work properly in practice).

I essentially added the reproduction from this ticket as a unit test to validate that the fragments are now reused.

There was a feature in gateway/query-planner 0.x called experimental_autoFragmentization

Right. Not re-implementing this in fed 2 was somewhat intentional (the fed2 query planner is a full rewrite, so it's not so much that the feature was removed as that it wasn't re-implemented in the new code). The issue with that option was that while it potentially helped a lot with queries containing tons of repetition (like the repro of this issue, though let's agree that's not a very realistic example), it also made things worse in plenty of other cases: it could make the query bigger by introducing fragments that are either not the right ones, or used only once and so save nothing, while also making subgraph queries less readable in the process. And because it was a query planner option, you couldn't easily enable it for the queries it helps but not for the ones it hurts.
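
To make the "used only once" case concrete, a hypothetical illustration (the types and fields are invented): extracting a selection that appears a single time adds a fragment declaration and a spread without removing any duplication, so the query only gets longer:

# Hypothetical: auto-fragmentizing a selection that occurs only once
# adds the declaration and spread while eliminating nothing.
query GetUser {
  user {
    ...UserFields
  }
}

fragment UserFields on User {
  id
  email
}

versus the plain form, which is strictly shorter:

query GetUser {
  user {
    id
    email
  }
}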

So anyway, the hope here is that by reusing the original query fragments, we avoid query size explosion without ever really hurting: we're reusing the user's fragments, so that shouldn't hurt readability, and we only reuse a fragment if it is used more than once in the fetch, which avoids making things bigger in general. That way this can be the default behaviour. Of course, it's much better if the code actually works...
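
Concretely, with the fix, the subgraph fetch for the repro above should keep the shape of the client's operation, with the fragment declared once and spread five times (a sketch, again using the hypothetical FooSelect fields from earlier):

# Illustrative shape of the subgraph fetch once user fragments are reused.
query ExampleQuery {
  a {
    ...on A1 {
      ...FooSelect
    }
    ...on A2 {
      ...FooSelect
    }
    ...on A3 {
      ...FooSelect
    }
    ...on A4 {
      ...FooSelect
    }
    ...on A5 {
      ...FooSelect
    }
  }
}

fragment FooSelect on Foo {
  id
  name
}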

pcmanus pushed a commit to pcmanus/federation that referenced this issue Jul 21, 2022
The query planner has code originally meant to reuse fragments from the
user query in subgraph fetches when it made sense. Unfortunately that
code wasn't triggered at all due to the query planner not properly
capturing those fragments. This ticket fixes this issue, adds more
testing and improves a few related issues.

Fixes apollographql#1894
benweatherman pushed a commit that referenced this issue Jul 21, 2022
The query planner has code originally meant to reuse fragments from the
user query in subgraph fetches when it made sense. Unfortunately that
code wasn't triggered at all due to the query planner not properly
capturing those fragments. This ticket fixes this issue, adds more
testing and improves a few related issues.

Fixes #1894

smyrick (Member, Author) commented Aug 16, 2022

For those tracking this issue, there was an update in this PR too: #1994

You can expect the changes to be in the 2.1.0 release.
