Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(conversation): per message items and lambda history retrieval pattern #2914

Merged
merged 11 commits into from
Oct 8, 2024

Conversation

atierian
Copy link
Member

@atierian atierian commented Sep 27, 2024

Problem Statement

Within conversation routes, the AppSync resolver function invokes the Lambda asynchronously with a payload including the entire conversation history.

The maximum event payload for an asynchronously invoked lambda is 256KB (reference).

This limit can be hit rather quickly in either or both of these conditions:

  1. Conversation is long enough.
  2. Conversation contains media (base64 encoded images).

Description of changes

1. Lambda History Retrieval

AppSync resolver function invokes Lambda function asynchronously with a payload containing only the information necessary (no conversation history) for the Lambda function to retrieve the conversation history from AppSync in a subsequent request.

Current Sequence Diagram

conversation_sequence_diagram

New Sequence Diagram

conversation_sequence_diagram_lambda_history_retrieval

Changes to Lambda Function Payload

{
  // ...
- messages: Array<ConversationMessage>;
+ messageHistoryQuery: { 
+  getQueryName: string;
+  getQueryInputTypeName: string;
+  listQueryName: string;
+  listQueryInputTypeName: string;
+  listQueryLimit?: number;
+ };
  // ...
};

A Note on Eventual Consistency / Read After Write

Reference: DynamoDB read consistency

The listConversationMessage<RouteName> query resolver uses a global secondary index (PK: conversationId, SK: createdAt) to retrieve messages in descending sorted order. GSIs are eventually consistent. This can, in theory, be problematic with this new read-after-write pattern introduced here. The fact that there are a few network hops (1. AppSync -> Lambda 2. Lambda <-> AppSync <-> DDB) between write and read reduce the likelihood of this being an issue. However, it's a real possibility and the end-user experience of missing the current user message is unacceptable.

The Lambda function has logic to handle this case — if the currentMessageId is not included in the list query response, the Lambda function subsequently invokes a getConversationMessage<RouteName> query with the id (ConversationMessages<RouteName> table partition key), which we make a strongly consistent read using the ConsistentRead parameter.

2. Item Level Messages

Current Situation

Conversation routes currently treat messages as user:assistant pairs internally (single DDB item), while exposing them through APIs (e.g. list query) as individual messages. This is bad for several reasons -- the following are the key motivators:

  • when we write an assistant response to DDB, we mutate an existing item.
  • it's already leading to gnarly resolver code to map the paired object into the DTO used by the GraphQL API. This is only going to get worse over time.
  • DynamoDB has an item size limit of 400 KB (reference). We're much less likely to hit that with individual messages than with user:assistant tuples.

The motivation for this approach was to create a GSI that accommodates querying based on the needs of the system ([user, assistant, user, assistant, ...] descending createdAt ordering, discarding user messages without an assistant response).

Changes

1 message (user or assistant) --> 1 item in ConversationMessage<RouteName> DynamoDB table.

This is accomplished by:

  1. ConversationMessage model now includes optional associatedUserMessageId field. This is only populated for assistant messages and follows the paradigm set by hasOne / belongsTo references based relationships.
  2. The assistant response mutation no longer mutates an existing item. Instead it writes a new item to the ConversationMessage<RouteName> table. Because we can no longer use a conditional UpdateItem here, we have to use the same technique for owner auth used by the send message mutation. See the sequence diagrams below for more details:

EXISTING
OLD_chat_assistant_mutation_tuple_message_owner_auth

NEW
chat_assistant_mutation_item_level_message_owner_auth

CDK / CloudFormation Parameters Changed

None

Issue #, if available

Related amplify-backend PR

Description of how you validated changes

Checklist

  • PR description included
  • yarn test passes
  • E2E test run linked
  • Tests are changed or added
  • Relevant documentation is changed or added (and PR referenced)
  • New AWS SDK calls or CloudFormation actions have been added to relevant test and service IAM policies
  • Any CDK or CloudFormation parameter changes are called out explicitly

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@atierian atierian force-pushed the ai.conversation-lambda-retrieval branch from 28c5947 to 019e01f Compare October 1, 2024 17:31
@atierian atierian marked this pull request as ready for review October 2, 2024 15:35
@atierian atierian requested a review from a team as a code owner October 2, 2024 15:35
@atierian atierian force-pushed the ai.conversation-lambda-retrieval branch from 019e01f to a70357d Compare October 2, 2024 17:49
Comment on lines +29 to +26
// TODO: Create and add these values to `ConversationDirectiveConfiguration` in an earlier step and
// access them here.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Including this in a follow-up PR that consolidates the pipeline resolver generation logic.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this means that you want to derive the values closer to the ConversationDirective implementation, not that you're going to allow customers to configure these values?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correct

sundersc
sundersc previously approved these changes Oct 7, 2024
@@ -227,12 +227,12 @@ const constructConversationMessageModel = (
const content = makeField('content', [], makeListType(makeNamedType('ContentBlock')));
const context = makeField('aiContext', [], makeNamedType('AWSJSON'));
const uiComponents = makeField('toolConfiguration', [], makeNamedType('ToolConfiguration'));
const assistantContent = makeField('assistantContent', [], makeListType(makeNamedType('ContentBlock')));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Preemptively answering the question "isn't this breaking?"

It's not because this field was never exposed in the MIS. It was strictly an implementation detail used between resolver code and the conversation messages DynamoDB table.

The associatedUserMessageId will be exposed in the MIS (codegen PR pending) as a way to represent user - assistant relationships to consumers.

CONVERSATION_MESSAGE_TYPE_NAME: `ConversationMessage${fieldName}`,
};
Object.entries(substitutions).forEach(([key, value]) => {
const replaced = resolver.replace(new RegExp(`\\[\\[${key}\\]\\]`, 'g'), value);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I'm calling it: we need a central method for template substitution. Please refactor.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's in progress in the ai.conversation-consolidate-resolver-generation branch.

I needed prioritize these changes (this and the other PRs from this/last week) and am allergic to mixing functional and (non-trivial) non-functional changes in PRs. I also wanted to make sure whatever abstraction I use here will cover these changes. Thanks for your patience, I know I've tested it with these last few PRs 😅

Comment on lines +29 to +26
// TODO: Create and add these values to `ConversationDirectiveConfiguration` in an earlier step and
// access them here.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this means that you want to derive the values closer to the ConversationDirective implementation, not that you're going to allow customers to configure these values?

@@ -0,0 +1,8 @@
export function request(ctx) {
ctx.stash.metadata.index = 'gsi-ConversationMessage.conversationId.createdAt';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we care that the GSI name doesn't include the base type?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The generated type ConversationMessage<RouteName>? Nope, or at least I don't. It's fine that the GSIs across multiple routes (DDB tables) have the same name.

@atierian atierian enabled auto-merge (squash) October 8, 2024 22:31
@atierian atierian merged commit 874a30a into main Oct 8, 2024
7 checks passed
@atierian atierian deleted the ai.conversation-lambda-retrieval branch October 8, 2024 22:43
palpatim added a commit that referenced this pull request Oct 10, 2024
* chore: update .jsii assembly

* chore: update .jsii assembly

* chore: migrate pg array objects e2e test in gen2 cdk (#2906)

* chore: graphql prep for test migration

* refactor: generic graphql field selection string with fieldmap

* feat: add postgres array objects e2e test

* test: remove bootstrap in test code

* chore: schema cleanup

* chore: final cleanup

* chore: add explanation on FieldMap ans examples

* chore: remove dup test

---------

Signed-off-by: Kevin Shan <[email protected]>
Co-authored-by: Tim Schmelter <[email protected]>

* fix(model-transformer) IndexName -> index in query list resolver (#2912)

* chore: upgrade cdk library dependency to 2.158.0 (#2876)

* chore: upgrade cdk dependency to 2.158.0

* chore: install and use nvm

* chore: use full version for nvm

* chore: testing linux build with nvm

* chore: fix version in cdk tests

* chore: update jsii files

* update: increase memory size

* add: debug statement

* update: mem size back to 8096, use ps1 file for shell script

* fix: path to Setup-NodeVersion.ps1

* fix: path to codebuild_specs/Setup-NodeVersion.ps1

* add: set runtime version

* update: image

* add: debug statement

* update: use earlier code

* add: debug statements

* update: clean up code

* update: use the correct image

* add: list installed node versions and used nodejs.install

* restart: install nvm using choco

* add: back mem size variable

* add: nvm install and use 18.20.4

* add: env var NVM_HOME and NVM_SYMLINK

* add: spawn powershell as admin

* update: remove all other builds

* add: debug statement

* add: env var path

* update: print env var

* add: commands

* update: env var set up

* add: refresh env var

* update: more debug statement

* update

* revamp: find nvm.exe

* update: install nvm windows directly

* update: launch new shell if current shell does not recognize nvm

* update: install node in buildspec

* add: install and use node in build spec

* update: use single quote to prevent interpreting \

* add: 2 scripts, one for installing nvm, another for using nvm

* fix: path error

* test: which way set env var

* update: set up env var in pre_build

* update: use choco in pre-build

* fix: syntax error

* update: build_windows working, running all tests

* test: remove bootstrap in test code

* debug: _runGqlE2ETests

* update: debug_workflow

* update: debug_workflow

* update: debug_workflow

* update: debug_workflow

* add: debug statement

* add: debug test

* add: debug

* update: use uuid for bucket name

* remove: use of uuid

* add: debug statement

* update: use differrent bucket name

* add: mili second timestamp

* add: debug statement

* remove: debug statement

* remove: redundant code

---------

Co-authored-by: Bobby Yu <[email protected]>
Co-authored-by: Tim Schmelter <[email protected]>

* test: fix gen 1 init (#2924)

* fix(conversation): allow changes to systemPrompt, inferenceConfig, aiModel to be hotswapped (#2923)

* feat: auto increment support (#2883)

* chore(graphql-default-value-transformer): tidy tests

* test(graphql-default-value-transformer): add unit tests for auto increment support

* feat: 🎸 utils to detect Postgres datasource

* feat: 🎸 support auto increment

Implements support for auto increment (serial) fields from Postgres
datasources. Such fields are denoted by an empty `@default` applied to
an `Int` field.

* test(graphql-default-value-transformer): pk can be auto increment

* test(graphql-default-value-transformer): auto-increment crud e2e

* chore: describe test purpose

* chore: removing logging

* chore: describe why invalid cases are invalid

* chore: remove unecessary e2e test case

* chore: test messaging clarity

* chore: type safety

* chore: alphabetize list

* chore: type of return value asserts against string

Co-authored-by: Tim Schmelter <[email protected]>

* chore: test ensures customers can insert to serial fields with custom values

* chore: verify that @default(value) works on mysql

* chore: remove unecessary ssm test case

* chore: update branch from main

* test: value cannot be null on ddb

---------

Co-authored-by: Tim Schmelter <[email protected]>

* fix(conversation): use functionMap for custom handler IFunction reference (#2922)

* fix(generation): gracefully handle stringified tool_use responses (#2919)

* feat(conversation): per message items and lambda history retrieval pattern (#2914)

* fix: sql default value e2e failures (#2932)

* fix(generation): remove trailing comma in inferenceConfig resolver code (#2933)

* fix: add aws_iam to custom operations when enableIamAuthorization is enabled; fix graphql type utils (#2921)

- test: Add additional tests to fix coverage metrics for unchanged files
- test: Add implicit IAM auth support tests
  - Added a skipped test for custom type support, to be re-enabled once we
    figure out the right strategy for this.

* fix: appsync ttl correct duration time unit in ms (#2928)

Signed-off-by: Kevin Shan <[email protected]>

---------

Signed-off-by: Kevin Shan <[email protected]>
Co-authored-by: amplify-data-ci <[email protected]>
Co-authored-by: Kevin Shan <[email protected]>
Co-authored-by: Ian Saultz <[email protected]>
Co-authored-by: Phani Srikar Edupuganti <[email protected]>
Co-authored-by: Bobby Yu <[email protected]>
Co-authored-by: Dane Pilcher <[email protected]>
Co-authored-by: Peter V. <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants