feat: Enable MongoDB collection sharding #760

Merged 1 commit on Oct 1, 2024

tschneider-aneo (Contributor) commented Sep 19, 2024

Motivation

Following aneoconsulting/ArmoniK.Infra#168, this PR aims to enable the deployment of a sharded MongoDB as ArmoniK's database.

A more complete motivation is provided in AEP 004 (aneoconsulting/ArmoniK.Community#48). This PR addresses a foreseeable bottleneck on ArmoniK's database by allowing it to use a sharded architecture.

For now, when a sharded database is specified, the following collections are sharded:

  • Result
  • TaskData
  • SessionData

Description

This PR adds:

  • A method ShardCollectionAsync on the IMongoDataModelMapping interface.
  • Two new MongoOptions: a string AuthSource and a boolean Sharding.
  • A new class ShardingExt containing an extension method shardCollection for the IClientSessionHandle interface.

When the MongoOption Sharding is true, the MongoCollectionProvider calls ShardCollectionAsync. The implementation then depends on whether the collection is meant to be sharded:

  • If the collection must be sharded, the shardCollection extension method is called.
  • If the collection does not need to be sharded, the method immediately returns a completed Task.
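
The control flow described above can be sketched as follows. This is only an illustrative model written in Python, not the C# code of this PR: the class and function names, the database name `armonik`, and the extra collection `Partition` are all hypothetical, and the shardCollection call is replaced by a list that records which namespaces would be sharded.

```python
import asyncio

# Collections listed in the Motivation section as sharded.
SHARDED_COLLECTIONS = {"Result", "TaskData", "SessionData"}


class CollectionMapping:
    """Hypothetical stand-in for one IMongoDataModelMapping implementation."""

    def __init__(self, name, issued_commands):
        self.name = name
        # Records namespaces instead of contacting a MongoDB server.
        self._issued = issued_commands

    async def shard_collection(self, database):
        if self.name not in SHARDED_COLLECTIONS:
            # Nothing to do: plays the role of returning a completed Task.
            return
        # Stand-in for the shardCollection extension method.
        self._issued.append(f"{database}.{self.name}")


async def init_collections(names, database, sharding):
    """Hypothetical stand-in for the provider: shards only when the option is on."""
    issued = []
    for name in names:
        mapping = CollectionMapping(name, issued)
        if sharding:
            await mapping.shard_collection(database)
    return issued


# "Partition" and "armonik" are made-up names used only for this illustration.
names = ["Result", "TaskData", "SessionData", "Partition"]
with_sharding = asyncio.run(init_collections(names, "armonik", sharding=True))
without_sharding = asyncio.run(init_collections(names, "armonik", sharding=False))
```

With the option enabled, only the three listed collections produce a shard request; with it disabled, none do.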

The new MongoOption AuthSource is required because the MongoClient has to authenticate as an administrator to be able to shard a collection.
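
To make the role of AuthSource more concrete, here is a minimal Python sketch of the two pieces involved: a connection string carrying the `authSource` option, and the document of the `shardCollection` database command, which MongoDB expects to be run against the `admin` database. The helper names, host, credentials, and database name are hypothetical, and the hashed `_id` shard key is an assumption; the key actually chosen by this PR is not stated here.

```python
from urllib.parse import quote_plus


def build_uri(user, password, host, port, database, auth_source):
    """Build a MongoDB connection string that authenticates against auth_source."""
    return (
        f"mongodb://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}?authSource={quote_plus(auth_source)}"
    )


def shard_collection_command(database, collection, key_field="_id"):
    """Document for the shardCollection command (to be run on the admin database)."""
    return {
        "shardCollection": f"{database}.{collection}",
        # Assumption: a hashed shard key on key_field, chosen only for illustration.
        "key": {key_field: "hashed"},
    }


# All values below are placeholders.
uri = build_uri("admin", "p@ss", "mongo", 27017, "database", "admin")
command = shard_collection_command("database", "TaskData")
```

A driver would connect with `uri` and send `command` to the `admin` database; both steps are left out so the sketch stays free of external dependencies.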

Testing

This PR has been tested manually. Because these changes are tightly coupled with ArmoniK's database, unit tests are difficult to write. It is still possible to write automated integration tests that first verify that a deployment of ArmoniK using these changes succeeds, and then verify that sharding is actually enabled on the right collections. Workflows that test a deployment of ArmoniK with a sharded MongoDB are currently being studied.
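
The second check mentioned above could look like the following Python sketch. It assumes the test has already obtained the list of sharded namespaces from the deployed cluster (how that list is fetched is left out), and the function name and the database name `armonik` are hypothetical.

```python
# Collections this PR is expected to shard.
EXPECTED = ("Result", "TaskData", "SessionData")


def missing_shards(database, sharded_namespaces, expected=EXPECTED):
    """Return the expected collections that are not reported as sharded."""
    present = set(sharded_namespaces)
    return [name for name in expected if f"{database}.{name}" not in present]


# Example input: a cluster where SessionData has not been sharded.
reported = ["armonik.Result", "armonik.TaskData"]
missing = missing_shards("armonik", reported)
```

An integration test would fail whenever the returned list is not empty.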

Impact

This change is expected to improve ArmoniK's strong scalability.

Additional Information

To better understand how MongoDB sharding works, see MongoDB's documentation.

To understand why the new AuthSource MongoOption is required, see MongoDB's documentation on the shardCollection database command and on the connection string's authSource option.

Checklist

  • My code adheres to the coding and style guidelines of the project.
  • I have performed a self-review of my code.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have made corresponding changes to the documentation.
  • I have thoroughly tested my modifications and added tests when necessary.
  • Tests pass locally and in the CI.
  • I have assessed the performance impact of my modifications.

CLAassistant commented Sep 19, 2024

CLA assistant check
All committers have signed the CLA.

aneojgurhem (Contributor) left a comment

LGTM
Need to use the new template for PR now

aneojgurhem merged commit 2cb6681 into main on Oct 1, 2024
99 checks passed
aneojgurhem deleted the ts/shard-mongodb-collections branch on October 1, 2024 07:50
3 participants