Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use message history instead of event payload for conversation handler #2047

Merged
merged 15 commits into from
Sep 26, 2024

Conversation

sobolk
Copy link
Member

@sobolk sobolk commented Sep 24, 2024

Problem

Conversation handler receives messages in the event. The problem is that event size is constrained to 256 KB for asynchronous invocations. https://docs.aws.amazon.com/lambda/latest/dg/gettingstarted-limits.html .

When conversation is long enough and/or contains images this limit is hit and lambda can't be called.

Previous logic as defined in upstream resolver can be seen here. Note that it's using outdated gql schema.

Changes

  1. Retrieve message history by:
    1. Querying Appsync by conversationId and currentMessageId to recreate history.
    2. Reconciling the history to pair user-asistant messages and drop unanswered questions.
    3. Inject aiContext.
  2. Keep event.messages for now to be backwards compatible until upstream components absorb this change. This will be removed later.
  3. Tests.
  4. Extract GraphqlRequestExecutor in the ai-constructs since we've hit the rule of 3 and refactor other components that execute GQL queries to use it.

Validation

Added tests, including e2e tests.

Manual tests via snapshot release with upstream components.

Checklist

  • If this PR includes a functional change to the runtime behavior of the code, I have added or updated automated test coverage for this change.
  • If this PR requires a change to the Project Architecture README, I have included that update in this PR.
  • If this PR requires a docs update, I have linked to that docs PR above.
  • If this PR modifies E2E tests, makes changes to resource provisioning, or makes SDK calls, I have run the PR checks with the run-e2e label set.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@sobolk sobolk added the run-e2e Label that will include e2e tests in PR checks workflow label Sep 24, 2024
Copy link

changeset-bot bot commented Sep 24, 2024

🦋 Changeset detected

Latest commit: a220cee

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages
Name Type
@aws-amplify/ai-constructs Minor
@aws-amplify/backend-ai Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@sobolk sobolk changed the title Use message history instead of event payload for conversational route Use message history instead of event payload for conversation handler Sep 24, 2024
@sobolk sobolk marked this pull request as ready for review September 26, 2024 14:09
@sobolk sobolk requested review from a team as code owners September 26, 2024 14:09
Comment on lines +143 to +148
return messages.reduce((acc, current) => {
// Bedrock expects that message history is user->assistant->user->assistant->... and so on.
// The chronological order doesn't assure this ordering if there were any concurrent messages sent.
// Therefore, conversation is ordered by user's messages only and corresponding assistant messages are inserted
// into right place regardless of their createdAt value.
// This algorithm assumes that GQL query returns messages sorted by createdAt.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this better be suited in the AppSync query?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's apparently not easy with NoSQL like DDB. It would require custom resolver to pretty much encapsulate same logic. @atierian is that correct?

Some other considerations:

  1. Number of messages per conversatonId is expected to be rather small set (which we cap at 1000 items, might be configurable in the future).
  2. We have to have all these messages in memory anyways to send it to Bedrock.
  3. The DDB and query provide linear view if history audit by createdAt. The special ordering and filtering is for Bedrock API requirements. Appsync doesn't know about Bedrock calls, Lambda knows, so this adaptation of data seem to belong to Lambda.
  4. This isn't one-way door. I.e. we can move ordering back to special list query and it's resolver later and take this out. Without breaking (multiple application of this algorithm is idempotent).

Copy link
Member

@atierian atierian Sep 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's possible to do this logic in the AppSync resolver function, however the downsides outweigh the benefits.

For context, there's a single GraphQL query per conversation route for listing messages (listConversationMessage<RouteName>). The list query is used by clients and now the Lambda function.

The logic for ordering and omitting in-flight user messages added here is specific to Bedrock's API requirements, not intrinsic to querying message history. We don't want to omit in-flight / currently unanswered user messages from responses from that list query because it could lead to confusing behavior in client applications.

I see a couple options:

  1. Bedrock specific history logic lives in lambda (these changes).
  2. We introduce a flag in the list query for bimodal behavior.
  3. We introduce a separate GraphQL query for outputting messages in this structure.

Of those, 1. feels like the right choice for separation of responsibilities and maintenance.

Comment on lines +60 to +63
/**
* @deprecated This field is going to be removed in upcoming releases.
*/
messages?: Array<ConversationMessage>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not remove it right now since this is in preview anyways?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going back and forth on this one and ultimately decided to keep backwards compatibility.

Factors that influenced this direction:

  1. This is a change in protocol that requires upstream sender to adapt as well. It would disturb "custom handler scenario" in preview until upstream ships.
  2. The alternative was to keep these changes on feature branch. But this seems to be the only change for GA that disturbs the protocol, and maintaining compatibility seems to be less effort than long running branch.
  3. It's a drill of how we'd do this post GA.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But. thanks to preview we'll be able to take this out the moment upstream ships.

Comment on lines +64 to +70
messageHistoryQuery: {
getQueryName: string;
getQueryInputTypeName: string;
listQueryName: string;
listQueryInputTypeName: string;
listQueryLimit?: number;
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for my understanding, these values are populated by AppSync, and are these constants or dynamic?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these values are populated by AppSync

Yes.

are these constants or dynamic?

These are dynamic. There might be multiple "conversational routes" and associated models in AppSync. AppSync provides these names so that we do query correct route.

@sobolk sobolk merged commit d0a90b1 into main Sep 26, 2024
60 of 62 checks passed
@sobolk sobolk deleted the conversation-history branch September 26, 2024 20:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
run-e2e Label that will include e2e tests in PR checks workflow
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants