Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding User Behavior Insights functionality. #13546

Closed
wants to merge 3 commits into from

Conversation

jzonthemtn
Copy link

Description

Adds the User Behavior Insights (UBI) functionality described in #12084.

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • Public documentation issue/PR created

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@jzonthemtn jzonthemtn changed the title #13545 Adding User Behavior Insights functionality. Adding User Behavior Insights functionality. May 4, 2024
Copy link
Contributor

github-actions bot commented May 5, 2024

❌ Gradle check result for bea92e0: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@jzonthemtn
Copy link
Author

❌ Gradle check result for bea92e0: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

I think it is a flaky test outside of UBI changes:

[Test Result](https://build.ci.opensearch.org/job/gradle-check/38340/testReport/) (1 failure / -1)
[org.opensearch.common.cache.stats.DefaultCacheStatsHolderTests.testConcurrentRemoval](https://build.ci.opensearch.org/job/gradle-check/38340/testReport/junit/org.opensearch.common.cache.stats/DefaultCacheStatsHolderTests/testConcurrentRemoval/)

And described by #13220.

Signed-off-by: jzonthemtn <[email protected]>
Copy link
Contributor

github-actions bot commented May 5, 2024

❌ Gradle check result for 2fb835f: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@jzonthemtn
Copy link
Author

❌ Gradle check result for 2fb835f: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Also appears to be a flaky test outside of the code changes here. #1006

@reta
Copy link
Collaborator

reta commented May 6, 2024

@jzonthemtn I haven't look into implementation yet (sorry about that), but it looks to me the plugin should not be part of the core but a separate repository (like most of the non essential plugins out there).


## Indexing Queries

For UBI to index a query, add a `ubi` block to the `ext` in the search request containing a `query_id`:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like if you do not have a query_id, one will be provided for you.

Is the presence of an empty ubi block sufficient to get the logging?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had it generate a query_id if none is provided but I should not have yet. In this first version, a query_id is required. This is because the search response is not yet being modified. In a later revision, query_id will be optional and generated if not provided and returned in the search response's ext. I will remove that code in getQueryId() to make a random UUID if it's null.

An empty block was sufficient, but I will change it to require that the ubi block contains a query_id. If no ubi block, or an empty ubi block, the rest of the code in the UbiActionFilter will be skipped.

client.admin().indices().exists(indicesExistsRequest, new ActionListener<>() {

@Override
public void onResponse(IndicesExistsResponse indicesExistsResponse) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see a couple of risks with this:

  1. It doesn't seem to be checking indicesExistsResponse. If the indices do exist, will it issue the CreateIndexRequests again?
  2. The exists() call asynchronously calls the exists API, and then the listener for that call creates the indices. Meanwhile the thread that called indexUbiQuery falls through to line 180 and tries to write to the queries index (possibly before it's been created).

I could be mistaken, but I think the sequence diagram is something like:

sequenceDiagram
    CallingThread ->> OpenSearch: Exists?
    CallingThread ->> OpenSearch: Index query
    OpenSearch ->> ExistsListener: IndicesExistsReponse
    ExistsListener ->> OpenSearch: CreateIndex(UBI_EVENTS_INDEX)
    ExistsListener ->> OpenSearch: CreateIndex(UBI_QUERIES_INDEX)
Loading

(Incidentally, this is my first time using Mermaid to generate a diagram in a GitHub comment -- that's pretty awesome.)

I think you may need to chain the callbacks together, so it's something like:

  1. Calling thread calls indexExists
  2. IndexExists listener:
    1. If both indices exist, call client.index()
    2. Else, create one or both indices, with a callback (can use a shared callback that counts down the number of indices left to create).
      1. In the CreateIndex listener, if decrementing the counter gives you 0, then call client.index().

Copy link
Author

@jzonthemtn jzonthemtn May 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are right. Will update! Thanks for the implementation suggestion.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@msfroh I thought through this some more and I want to minimize the number of calls the client makes. There's one call to determine if the indexes (ubi_queries and ubi_events) exist. If false, there are two calls to create those indexes. If one of the indexes already exists the creation has no effect. Alternatively, if the indexes don't exist, the client needs to check which index(es) don't exist and that adds two extra calls. What do you think?


@Override
public void writeTo(StreamOutput out) throws IOException {
out.writeString(queryId);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be getQueryId() so that we always materialize the query ID before serializing?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that is a good idea, especially for when the query_id is optional. Will update.

Copy link
Contributor

github-actions bot commented May 6, 2024

✅ Gradle check result for 00c7c85: SUCCESS

Copy link

codecov bot commented May 6, 2024

Codecov Report

Attention: Patch coverage is 47.08995% with 100 lines in your changes are missing coverage. Please review.

Project coverage is 71.53%. Comparing base (b15cb0c) to head (333e45e).
Report is 271 commits behind head on main.

Files Patch % Lines
.../main/java/org/opensearch/ubi/UbiActionFilter.java 16.00% 60 Missing and 3 partials ⚠️
...src/main/java/org/opensearch/ubi/QueryRequest.java 0.00% 12 Missing ⚠️
...ain/java/org/opensearch/ubi/ext/UbiParameters.java 83.58% 3 Missing and 8 partials ⚠️
...rc/main/java/org/opensearch/ubi/QueryResponse.java 0.00% 8 Missing ⚠️
...rg/opensearch/ubi/ext/UbiParametersExtBuilder.java 75.00% 3 Missing and 2 partials ⚠️
.../main/java/org/opensearch/ubi/UbiModulePlugin.java 85.71% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #13546      +/-   ##
============================================
+ Coverage     71.42%   71.53%   +0.11%     
- Complexity    59978    61077    +1099     
============================================
  Files          4985     5058      +73     
  Lines        282275   287313    +5038     
  Branches      40946    41617     +671     
============================================
+ Hits         201603   205518    +3915     
- Misses        63999    64748     +749     
- Partials      16673    17047     +374     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

github-actions bot commented May 7, 2024

❌ Gradle check result for be5b3a0: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

github-actions bot commented May 7, 2024

❌ Gradle check result for 1d5d871: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

github-actions bot commented May 7, 2024

❕ Gradle check result for 333e45e: UNSTABLE

  • TEST FAILURES:
      1 org.opensearch.cluster.coordination.AwarenessAttributeDecommissionIT.testConcurrentDecommissionAction

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

@andrross
Copy link
Member

andrross commented May 8, 2024

I haven't look into implementation yet (sorry about that), but it looks to me the plugin should not be part of the core but a separate repository (like most of the non essential plugins out there).

This is super important and something we should figure out ASAP as it will determine where this code will live. Have we discussed this yet in any other issue? My inclination is that this feature might benefit from being a separate repository where it can iterate much more quickly and not be so tightly coupled to the OpenSearch repository.

/cc @reta @msfroh

@jzonthemtn
Copy link
Author

I haven't look into implementation yet (sorry about that), but it looks to me the plugin should not be part of the core but a separate repository (like most of the non essential plugins out there).

This is super important and something we should figure out ASAP as it will determine where this code will live. Have we discussed this yet in any other issue? My inclination is that this feature might benefit from being a separate repository where it can iterate much more quickly and not be so tightly coupled to the OpenSearch repository.

/cc @reta @msfroh

Appreciate the nudge on this point. Just some history since not everyone is familiar with UBI, this project started as an external plugin and was then moved to a module (this PR).

@jzonthemtn
Copy link
Author

Apologies for missing the search triage meeting, but I heard it was decided that UBI will be moved to an external plugin project repository so I am going to close this pull request. Thanks to everyone for the comments!

@jzonthemtn jzonthemtn closed this May 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants