Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add API for clearing file cache on search nodes #7323

Conversation

kotwanikunal
Copy link
Member

Description

  • Introduces a new API to clear the file cache on search capable nodes

Sample Output - POST /_filecache/clear

{
    "_nodes": {
        "total": 1,
        "successful": 1,
        "failed": 0
    },
    "cluster_name": "runTask",
    "nodes": {
        "cp-GjpOoRqedOuLwQCW6AQ": {
            "name": "runTask-0",
            "roles": [
                "cluster_manager",
                "data",
                "search"
            ],
            "cleared": true,
            "item_count": 6227
        }
    }
}

Issues Resolved

Resolves #6030

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@codecov-commenter
Copy link

Codecov Report

Merging #7323 (b0e8077) into main (d984f50) will decrease coverage by 0.10%.
The diff coverage is 100.00%.

❗ Current head b0e8077 differs from pull request most recent head 0664b6b. Consider uploading reports for the commit 0664b6b to get more accurate results

📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more

@@             Coverage Diff              @@
##               main    #7323      +/-   ##
============================================
- Coverage     70.71%   70.61%   -0.10%     
+ Complexity    59497    59456      -41     
============================================
  Files          4859     4859              
  Lines        285339   285369      +30     
  Branches      41133    41135       +2     
============================================
- Hits         201772   201517     -255     
- Misses        66965    67303     +338     
+ Partials      16602    16549      -53     
Impacted Files Coverage Δ
.../main/java/org/opensearch/action/ActionModule.java 95.95% <100.00%> (ø)
.../org/opensearch/client/support/AbstractClient.java 31.33% <100.00%> (ø)

... and 460 files with indirect coverage changes

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@kotwanikunal
Copy link
Member Author

1: Task failed with an exception.
-----------
* What went wrong:
Execution failed for task ':modules:lang-mustache:test'.
> Process 'Gradle Test Executor 608' finished with non-zero exit value 1
  This problem might be caused by incorrect test process configuration.
  Please refer to the test execution section in the User Manual at https://docs.gradle.org/8.1.1/userguide/java_testing.html#sec:test_execution

* Try:
> Run with --stacktrace option to get the stack trace.
> Run with --info or --debug option to get more log output.
==============================================================================

2: Task failed with an exception.
-----------
* What went wrong:
Execution failed for task ':modules:ingest-common:test'.
> Process 'Gradle Test Executor 611' finished with non-zero exit value 1
  This problem might be caused by incorrect test process configuration.
  Please refer to the test execution section in the User Manual at https://docs.gradle.org/8.1.1/userguide/java_testing.html#sec:test_execution

* Try:
> Run with --stacktrace option to get the stack trace.
> Run with --info or --debug option to get more log output.
==============================================================================

Seems to be unrelated.

@kotwanikunal
Copy link
Member Author

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@kotwanikunal
Copy link
Member Author

kotwanikunal commented Apr 28, 2023

URL: https://build.ci.opensearch.org/job/gradle-check/14643/
CommitID: 8bd6288

Unrelated failure:

REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.action.admin.cluster.stats.ClusterStatsIT.testNodeRolesWithDataNodeLegacySettings" -Dtests.seed=5568C86430BA738A -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=ar-LY -Dtests.timezone=Atlantic/Bermuda -Druntime.java=19

org.opensearch.action.admin.cluster.stats.ClusterStatsIT > testNodeRolesWithDataNodeLegacySettings FAILED
    java.lang.IllegalArgumentException: duplicate element: [cluster_manager]
        at __randomizedtesting.SeedInfo.seed([5568C86430BA738A:7C15068B9D7AB34C]:0)
        at java.****/java.util.ImmutableCollections$Set12.<init>(ImmutableCollections.java:791)
        at java.****/java.util.Set.of(Set.java:487)
        at org.opensearch.action.admin.cluster.stats.ClusterStatsIT.testNodeRolesWithDataNodeLegacySettings(ClusterStatsIT.java:415)

#7327

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@andrross
Copy link
Member

Why not add this as a new case to the existing API? e.g. POST _cache/clear?file=true

https://opensearch.org/docs/latest/api-reference/index-apis/clear-index-cache/

@kotwanikunal
Copy link
Member Author

Why not add this as a new case to the existing API? e.g. POST _cache/clear?file=true

https://opensearch.org/docs/latest/api-reference/index-apis/clear-index-cache/

I thought of that. The caveat there is that it operates on a shard level, and we want a node level request.
Also, the existing request checks for write blocks (Ref: Here) which holds true for snapshot backed indices and will need a mutual exclusion from other flags being set to make the request go through, which is also what got me to a new API.

@reta
Copy link
Collaborator

reta commented May 1, 2023

Why not add this as a new case to the existing API? e.g. POST _cache/clear?file=true
https://opensearch.org/docs/latest/api-reference/index-apis/clear-index-cache/

I thought of that. The caveat there is that it operates on a shard level, and we want a node level request. Also, the existing request checks for write blocks (Ref: Here) which holds true for snapshot backed indices and will need a mutual exclusion from other flags being set to make the request go through, which is also what got me to a new API.

+1 to @andrross point, I understand the current limitation (which we could work through) but the new API introduces the inconsistencies with existing _cache APIs (which actually are opaque to the level they operate on, shard/node/cluster/...). Moreover, I believe the files in cache are always associated with the backed index, having the POST /<target>/_cache/clear form would be even more useful.

@kotwanikunal
Copy link
Member Author

kotwanikunal commented May 1, 2023

Why not add this as a new case to the existing API? e.g. POST _cache/clear?file=true
https://opensearch.org/docs/latest/api-reference/index-apis/clear-index-cache/

I thought of that. The caveat there is that it operates on a shard level, and we want a node level request. Also, the existing request checks for write blocks (Ref: Here) which holds true for snapshot backed indices and will need a mutual exclusion from other flags being set to make the request go through, which is also what got me to a new API.

+1 to @andrross point, I understand the current limitation (which we could work through) but the new API introduces the inconsistencies with existing _cache APIs (which actually are opaque to the level they operate on, shard/node/cluster/...). Moreover, I believe the files in cache are always associated with the backed index, having the POST /<target>/_cache/clear form would be even more useful.

I do believe that cache and filecache are separate entities, which is also what prompted me to add in a new API along with the other reasons I mentioned before.
The ideal course of action here, in case we want to reuse the existing endpoint, would be to change the cache APIs to operate at index or shard level to actually represent the action that the end user wants to perform, instead of pruning the whole cache, which is how it is implemented now. (Apart from the changes that we would need to make for the write block checks).

@reta Do you think filecache and cache should still have the same _cache endpoints given the fundamental differences in what they represent and the level they operate on?

@reta
Copy link
Collaborator

reta commented May 2, 2023

@reta Do you think filecache and cache should still have the same _cache endpoints given the fundamental differences in what they represent and the level they operate on?

To me, the _cache API looks universal, it operates over absolutely unrelated caches (or entities) - query cache, fields data cache, file cache ... It does not mean that is the only way - but this is what we have and its important to keep API consistent.

@kotwanikunal
Copy link
Member Author

@reta Do you think filecache and cache should still have the same _cache endpoints given the fundamental differences in what they represent and the level they operate on?

To me, the _cache API looks universal, it operates over absolutely unrelated caches (or entities) - query cache, fields data cache, file cache ... It does not mean that is the only way - but this is what we have and its important to keep API consistent.

I think that clears it up for me. Thanks @reta and @andrross!
I'll update the logic to re-use the existing endpoint.

@kotwanikunal
Copy link
Member Author

Closing in favor of #7498

@kotwanikunal kotwanikunal deleted the file-cache-clear-api branch June 12, 2023 19:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Searchable Remote Index] Add a REST API for clearing file cache
4 participants