Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update Query Planner to support Named Fragment generation. #790

Open
sschwenker opened this issue May 31, 2021 · 6 comments
Open

Update Query Planner to support Named Fragment generation. #790

sschwenker opened this issue May 31, 2021 · 6 comments

Comments

@sschwenker
Copy link

We have a few graphs that we have/are building that are pretty complex in the sense that they have more that a few complex types(implementing interfaces or Unions) with joins in between each other. For the most part we've gotten this going quite well accept for some of the limitations around interfaces and unions(#336). My issue I have now is not that I can't get it to work, but that its not very performant.

Take for example that I have 2 services.
Service A

Query {
  vehicles: [Vehicle]
}

interface Vehicle {
  id: ID!
  engine: Engine
}
type Car implements Vehicle @key(fields:"id") {
  id: ID!
  ...
}
type Truck implements Vehicle @key(fields:"id") {
  id: ID!
  ...
}

extend interface Engine @key(fields: "id") {
    id: Int! @external
  }

extend type GasEngine implements Engine @key(fields:"id") {
  id: ID! @external
}

extend type ElictricEngine implements Engine @key(fields:"id") {
  id: ID! @external
}

Service B

interface Engine {
  id: ID
}

type GasEngine implements Engine @key(fields:"id") {
  id: ID!
  ...
}

type ElictricEngine implements Engine @key(fields:"id") {
  id: ID!
  ...
}

Now when I run a simple query to return vehicles

query {
  vehicles {
    id
    engine {
      id
    }
  }
}

The federated graph generates the query with type expansion using inline fragments like the following to service A.

query {
  vehicles {
    __typename
    ... on Car {
      __typename
      id
      engine {
        __typename
        ... on GasEngine {
          __typename
          id
        }
        ... on ElectricEngine {
          __typename
          id
        }
      }
    }
    ... on Truck {
      __typename
      id
      engine {
        __typename
        ... on GasEngine {
          __typename
          id
        }
        ... on ElectricEngine {
          __typename
          id
        }
      }
    }
  }
}

For this example, this doesn't look too bad if you take into consideration that there are 2 vehicle types and 2 engine types which means it's generating type expansion queries with 4(2x2) sub-types, however in our case we have more than 2 layers deep and 30+ sub-types in each so if I run the same calculation with 3 layers and 30 sub-types you get 27,000(30x30x30) instances of type expansion in the query. If you take into consideration that a single type takes approximately 100 bytes to post, then that means to execute this query we will have to post 2.7MB to Service A.

My proposal is to modify the query generator to use Named fragments instead of inline fragments like the following.

fragment EngineFragment on Engine {
  __typename
  ... on GasEngine {
    __typename
    id
  }
  ... on ElectricEngine {
    __typename
    id
  }
}
fragment VehicleFragment on Vehicle {
  __typename
  ... on Car {
    __typename
    id
    engine {
      ...EngineFragment
    }
  }
  ... on Truck {
    __typename
    id
    engine {
      ...EngineFragment
    }
  }
}
query {
  vehicles {
    ...VehicleFragement
  }
}

This would eliminate a large portion of the duplication and shrink the query significantly. I wanted to propose this idea for a few purposes.

  1. Maybe we're doing something wrong with federation that is causing the excessive type expansion.
  2. Pass this idea through the community to see what the interest is and whether this idea is worth following.
  3. Communicate with the Technical leads on this project to determine how best to approach this change.
  4. Gather information on this project as I have not worked in it before.

I hope I have covered all the bases in my explanation, but if you have questions, please feel free to post replies here as I will be monitoring this thread closely. Thank you.

@martijnwalraven
Copy link
Contributor

martijnwalraven commented Jun 10, 2021

@sschwenker There is actually an experimental feature available that does just that. Copied from the changeling entry https:/apollographql/federation/blob/main/gateway-js/CHANGELOG.md#v0132:

NEW: Experimental feature: compress downstream requests via generated fragments #3791 This feature enables the gateway to generate fragments for queries to downstream services in order to minimize bytes over the wire and parse time. This can be enabled via the gateway config by setting experimental_autoFragmentization: true. It is currently disabled by default.

@sschwenker
Copy link
Author

@martijnwalraven , Thank you for pointing that out. I wasn't aware of that feature but we will upgrade our gateway and test enabling that. 🤞

@trevor-scheer
Copy link
Member

Hey @sschwenker, I'd love to hear your thoughts on this. I implemented this feature quite awhile back. At the surface this seems like the gateway could accomplish this reasonably well, but there are some very real difficulties to determining what and when to build fragments effectively. My implementation is quite "dumb" in that it fragments greedily based on a couple thresholds. I wasn't excited about it at the time, and yet it was effective so we shipped it.

All that being said, if you have any ideas about how to improve this feature I'd love to hear them. I'd also be interested to hear the size diff you see in your request after enabling the feature!

@sschwenker
Copy link
Author

Hey @trevor-scheer , We've got this all setup in our demo environment. We had to modify some of our efficiency logic to detect fragments instead of just field, but overall, it seems to be working well. I have notice one issue though. Some people are writing queries with the following

query q {
    myQuery(id: 1) {
        id
        name {
            first
            last
            __typename
        }
    }
}

For some reason when the auto fragmentation sees the typename, it creates a fragment for it even when it doesn't need to because it's not a complex type. This really isn't the issue I have. The issue that I'm having is that it generates a fragment like the following.

query q {
    myQuery(id: 1) {
        id
        name {
            ...__QueryPlanFragment_0__
            ...__QueryPlanFragment_0__
            ...__QueryPlanFragment_0__
            ...__QueryPlanFragment_0__
            ...__QueryPlanFragment_0__
            ...__QueryPlanFragment_0__
            ...__QueryPlanFragment_0__
            ...__QueryPlanFragment_0__
            ...__QueryPlanFragment_0__
            ...__QueryPlanFragment_0__
            ...__QueryPlanFragment_0__
            ...__QueryPlanFragment_0__
            ...__QueryPlanFragment_0__
            ...__QueryPlanFragment_0__
            ...__QueryPlanFragment_0__
            ...__QueryPlanFragment_0__
            ...__QueryPlanFragment_0__
            ...__QueryPlanFragment_0__
            ...__QueryPlanFragment_0__
            ...__QueryPlanFragment_0__
            ...__QueryPlanFragment_0__
            ...__QueryPlanFragment_0__
            ...__QueryPlanFragment_0__
            ...__QueryPlanFragment_0__
        }
    }
}

fragment __QueryPlanFragment_0__ on Name {
    first
    last
    __typename
}

Everything runs like it should, but it would be nice to cleanup the queries further to save even more space. I was wondering if you had any ideas?

@sschwenker
Copy link
Author

@trevor-scheer . On a side note, if you point me to the relevant code I can review it and see where we can make some improvements? Maybe even the PR that implemented it could help.

@trevor-scheer
Copy link
Member

So the introduction of the fragment by adding __typename is actually just a consequence of adding a 3rd field. There was no hard science to choosing 3 as the number, but I did run some tests against a few real world cases and found this to be the best choice overall.

if (context.autoFragmentization && subGroup.fields.length > 2) {
({ definition, selectionSet } = getInternalFragment(
selectionSet,
returnType,
context,
));
parentGroup.internalFragments.add(definition);
}

The relevant PR that introduced this feature will probably be the most helpful in understanding:
apollographql/apollo-server#3791
(note this previously existed in the apollo-server repo, but the fn names, etc. should all be findable in the current repo)

The repeating of ...__QueryPlanFragment_0__ is definitely a bug! That should never happen. A PR with a reproduction and / or fix would be greatly appreciated!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants