Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support list in response body #2811

Merged

Conversation

mingshl
Copy link
Collaborator

@mingshl mingshl commented Aug 7, 2024

Description

When an array as a string in parameter, the string in list are not escaped.

This PR will escape the string when it's a json array.

for example, before this change:

POST /_plugins/_ml/connectors/_create
{
  "name": "Amazon Bedrock Connector: Claude Instant V1",
  "version": "1",
  "description": "The connector to bedrock Claude model",
  "protocol": "aws_sigv4",
  "parameters": {
    "max_tokens_to_sample": "8000",
    "service_name": "bedrock",
    "temperature": "1.0E-4",
    "response_filter": "$.completion",
    "region": "us-west-2",
    "anthropic_version": "bedrock-2023-05-31",
    "inputs":"please summerize the documents"
  },
  "credential": {
    "access_key": " ",
        "secret_key": " ",
        "session_token": "" },
  "actions": [
    {
      "action_type": "PREDICT",
      "method": "POST",
      "url": "https://bedrock-runtime.us-west-2.amazonaws.com/model/anthropic.claude-instant-v1/invoke",
      "headers": {
        "x-amz-content-sha256": "required",
        "content-type": "application/json"
      },
      "request_body":  "{\"prompt\":\"${parameters.prompt}\",\"max_tokens_to_sample\":300,\"temperature\":0.5,\"top_k\":250,\"top_p\":1,\"stop_sequences\":[\"\\n\\nHuman:\"]}"
    }
  ]
}


POST /_plugins/_ml/models/_register
{
  "name": "Bedrock agent model with prompt",
  "function_name": "remote",
  "description": "test model",
  "connector_id": "0CwoD5EB6KAJXDLxYDvd"}
  
POST /_plugins/_ml/models/2SwoD5EB6KAJXDLxezto/_deploy

## success with string
POST /_plugins/_ml/models/2SwoD5EB6KAJXDLxezto/_predict  
{
  "parameters": {
    "prompt":"\n\nHuman: You are a professional data analysist. You will always answer question based on the given context first. If the answer is not directly shown in the context, you will analyze the data and find the answer. If you don't know the answer, just say I don't know. Context: ${parameters.context}. \n\n Human: please summarize the documents \n\n Assistant:",
    "context":"Dr. Eric Goldberg is a fantastic doctor who has correctly diagnosed every issue that my wife and I have had. Unlike many of my past doctors, Dr. Goldberg is very accessible and we have been able to schedule appointments with him and his staff very quickly. We are happy to have him in the neighborhood and look forward to being his patients for many years to come."
  }
}


##failed with list 

POST /_plugins/_ml/models/2SwoD5EB6KAJXDLxezto/_predict  
{
  "parameters": {
    "prompt":"\n\nHuman: You are a professional data analysist. You will always answer question based on the given context first. If the answer is not directly shown in the context, you will analyze the data and find the answer. If you don't know the answer, just say I don't know. Context: ${parameters.context}. \n\n Human: please summarize the documents \n\n Assistant:",
    "context":["Dr. Eric Goldberg is a fantastic doctor who has correctly diagnosed every issue that my wife and I have had. Unlike many of my past doctors, Dr. Goldberg is very accessible and we have been able to schedule appointments with him and his staff very quickly. We are happy to have him in the neighborhood and look forward to being his patients for many years to come."]
  }
}

Related Issues

#2839

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@dhrubo-os
Copy link
Collaborator

I assume failing tests are related to this change?

@mingshl
Copy link
Collaborator Author

mingshl commented Aug 8, 2024

match by regex expression is not a good way and it only solves the type casting from jsonstring(list) into a jsonstring(escape list). I will rethink about a better way to do this

@mingshl mingshl force-pushed the main-support-list-in-response-body branch from 86d4598 to 3860c47 Compare August 14, 2024 17:43
@mingshl mingshl force-pushed the main-support-list-in-response-body branch from 3860c47 to fd25bc3 Compare August 14, 2024 17:51
@mingshl
Copy link
Collaborator Author

mingshl commented Aug 14, 2024

came up with a better solution to try fix the invalid payload by adding escape,

in this fix, we don't worry if the parameter is a list or a map, we try help to fix the invalid payload by adding escape,

if adding escape helps, return the payload after adding escape,
else, throw the same illegal argument exception, and indicate the payload is still invalid.

please help review @Zhangxunmt @dhrubo-os

@dhrubo-os
Copy link
Collaborator

Test is failing

Signed-off-by: Mingshi Liu <[email protected]>

try help escape when the payload is not valid

Signed-off-by: Mingshi Liu <[email protected]>
@mingshl mingshl force-pushed the main-support-list-in-response-body branch from fd25bc3 to 83572b5 Compare August 14, 2024 19:47
@mingshl mingshl temporarily deployed to ml-commons-cicd-env August 14, 2024 19:48 — with GitHub Actions Inactive
@mingshl
Copy link
Collaborator Author

mingshl commented Aug 14, 2024

Details

  • What went wrong:
    Execution failed for task ':opensearch-ml-plugin:integTest'.

Could not resolve all files for configuration ':opensearch-ml-plugin:opensearch_distro_extracted_testclusters-opensearch-ml-plugin-integTest-0-3.0.0-SNAPSHOT-'.

Could not resolve opensearch-distribution-snapshot:opensearch:3.0.0-SNAPSHOT.

@dhrubo-os any ideas for the linux test failure? I saw the window tests passed all the way through. This failure is not related to the code changes in this PR.

@mingshl mingshl force-pushed the main-support-list-in-response-body branch from 99d3796 to b465bed Compare August 14, 2024 21:57
@mingshl mingshl temporarily deployed to ml-commons-cicd-env August 14, 2024 21:57 — with GitHub Actions Inactive
@mingshl mingshl temporarily deployed to ml-commons-cicd-env August 14, 2024 21:58 — with GitHub Actions Inactive
@mingshl
Copy link
Collaborator Author

mingshl commented Aug 15, 2024

all tests passed, merging now.

@mingshl mingshl merged commit f64e3f3 into opensearch-project:main Aug 15, 2024
6 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Aug 15, 2024
}

private static String escapeValue(String value) {
return value.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n").replace("\r", "\\r").replace("\t", "\\t");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use existing escape method? This escape method will replace "\n" in "This is a \new book" , which is not necessary. It may confuse user why same sentence will generate different embeddings after this PR

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it won't be a break change, because before this change, if it's an invalid payload, it will throw exception directly, it won't generate embedding at all.

if (!isJson(payload)) {
throw new IllegalArgumentException("Invalid payload: " + payload);
String payloadAfterEscape = connectorAction.get().getRequestBody();
Map<String, String> escapedParameters = escapeMapValues(parameters);
Copy link
Collaborator

@ylwu-amzn ylwu-amzn Aug 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If payload is wrong will escape all parameters again. Escape good parameters again may generate wrong result. For example, the original prompt is "please replace all \n of the context with --", if we escape again , it becomes "please replace all \\n of the context with --"

Copy link
Collaborator Author

@mingshl mingshl Aug 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if a string is already partial escaped, meaning the string escaped for the next line symbol but doesn't escape for double quote, this change will still not fix the payload. Please let me know if there is better way to identify partial escape cases.

mingshl added a commit that referenced this pull request Aug 19, 2024
(cherry picked from commit f64e3f3)

Co-authored-by: Mingshi Liu <[email protected]>
mingshl added a commit to mingshl/ml-commons that referenced this pull request Aug 31, 2024
mingshl added a commit to mingshl/ml-commons that referenced this pull request Aug 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants