Add s3 benchmark and deploy pipeline (#66)
* rename remove_blobfs_files role
- remove both blobfs and kvfs files

* Rename remove_blobfs role and task name

* rename format_blobfs role

* rename format_blobfs subtasks

* add log dir var

* Refactor for KVFS
- put the format script in conf dir, not bin dir
- put the log file in log dir, not bin dir
- use abs paths for everything

* KVFS enhancements
- stat for dss_formatter and set var
- rename subtasks

* rename template and refactor for kvfs

* handle both blobfs and kvfs

* handle legacy and new conf file names

* Only parse log for errors if blobfs

* save dss_target.out to log dir, not bin dir

* update path of dss_target.py.out

* update dss_target.py.out path

* remove reference to unused gcc var for datamover

* don't deploy GCC when deploying client

* remove gcc from datamover

* Don't deploy GCC

* remove gcc defaults

* Stat dss_formatter and install gcc if not present

* Optionally call gcc setenv only if needed

* revise start target
- use abs path of target script
- put target script in conf dir, not bin dir

* fix gcc condition with dss_formatter
- incorrectly used inverse logic

* put minio scripts in conf dir, not bin dir

* add format disks err file

* don't run compaction if dss_formatter is present

* Revise format logic
- blobfs logfile now created from template script
- kvfs format logfile created after format task
- if kvfs format fails - no logfile
- missing logfile used for criteria for future format decision
- this accommodates dss_formatter returning non-0 on a failed format

* Cleanup start_dss_host
- put script in /etc/dss not bin dir
- output logs to /var/log/dss
- don't needlessly loop through dss_host.py tasks

* Output logs to /var/log/dss

* execute spdk setup script by abs path
- don't output log if empty

* remove correct paths during format

* execute spdk script directly, not dss_target.py reset

* don't run compaction if dss_formatter

* remove chdir args

* allow dss_target_config.sh script to be called anywhere
- no chdir needed

* set compaction default to 'no'
- revert compaction string on false condition

* remove target conf dir on uninstall

* Only check blobfs format log if the format file output exists

* Don't check if the file exists; we must assume it exists if we are running this check

* Add deploy stage to dss-ansible
velomatt authored Mar 25, 2024
1 parent a22703c commit 847688c
Showing 69 changed files with 1,080 additions and 355 deletions.
7 changes: 6 additions & 1 deletion .gitignore
@@ -1,3 +1,8 @@
# Internal testing inventory files
inv_*
.vscode/**

# VS Code config
.vscode

# JUNIT XML Test Results
*.xml
24 changes: 6 additions & 18 deletions .gitlab-ci.yml
@@ -1,21 +1,9 @@
variables:
BRANCH_NAME: $CI_COMMIT_BRANCH
SONAR_BRANCH: -Dsonar.branch.name=$CI_COMMIT_BRANCH

image:
name: dss-build_$BRANCH_NAME

workflow:
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event"
variables:
BRANCH_NAME: $CI_MERGE_REQUEST_TARGET_BRANCH_NAME
- if: $CI_COMMIT_BRANCH == "master" && $CI_PIPELINE_SOURCE == "push"
- if: $CI_COMMIT_BRANCH =~ /^(stable|feature)\/.*/ && $CI_PIPELINE_SOURCE == "push"
include:
- .gitlab/defaults.yml

stages:
- build
- lint

ansible-lint:
stage: lint
script: ansible-lint *
- deploy
- test
- sync
46 changes: 46 additions & 0 deletions .gitlab/ansible.yml
@@ -0,0 +1,46 @@
deploy DSS:
stage: deploy
image:
name: $CI_REGISTRY/$ANSIBLE_PROJECT_PATH/$BRANCH_NAME:$DOCKERFILE_NAME
pull_policy: always
environment:
name: $ANSIBLE_INVENTORY
url: $CI_SERVER_URL/dfs/dss/dss-ansible/-/blob/inventory/$ANSIBLE_INVENTORY
before_script:
# Clone ansible repo
- git config --global http.sslVerify false
- git config --global user.name "$CI_USERNAME"
- git config --global user.email "$CI_EMAIL"
- git clone https://$CI_USERNAME:$CI_TOKEN@$CI_SERVER_HOST/$ANSIBLE_PROJECT_PATH.git --branch $ANSIBLE_BRANCH ../dss-ansible
- cd ../dss-ansible
# Get inventory file
- git fetch origin inventory
- git restore --source origin/inventory -- $ANSIBLE_INVENTORY
# Hack to disregard task output from JUNIT callback module
- sed -i -E "s/dump =.+/dump = ''/g" /usr/local/lib/python3.11/site-packages/ansible/plugins/callback/junit.py
script:
- |
ansible-playbook -i $ANSIBLE_INVENTORY playbooks/download_artifacts.yml \
-e "download_artifacts=true" \
-e "artifacts_url=$MINIO_HOST_URL/dss-artifacts" \
-e "artifacts_branch=$BRANCH_NAME"
- ansible-playbook -i $ANSIBLE_INVENTORY playbooks/remove_dss_software.yml
- ansible-playbook -i $ANSIBLE_INVENTORY playbooks/deploy_dss_software.yml
artifacts:
when: always
reports:
junit: "*.xml"
variables:
ANSIBLE_PROJECT_PATH: dfs/dss/dss-ansible
ANSIBLE_BRANCH: MIN-2148-add-deploy-stage
GIT_STRATEGY: none
DOCKERFILE_NAME: rocky8
ANSIBLE_CONFIG: ../dss-ansible/ansible.cfg
ANSIBLE_INVENTORY: inv_$CI_PROJECT_NAME.ini
ANSIBLE_FORCE_COLOR: "true"
JUNIT_OUTPUT_DIR: $CI_PROJECT_DIR
JUNIT_TASK_CLASS: "yes"
JUNIT_INCLUDE_SETUP_TASKS_IN_REPORT: "no"
ANSIBLE_CALLBACK_WHITELIST: junit
rules:
- !reference [.default_rules, merge_and_push]
47 changes: 47 additions & 0 deletions .gitlab/build.yml
@@ -0,0 +1,47 @@
build docker:
stage: build
image: docker:25.0.3-git
variables:
ANSIBLE_PROJECT_PATH: dfs/dss/dss-ansible
ANSIBLE_BRANCH: MIN-2148-add-deploy-stage
GIT_STRATEGY: none
DOCKERFILE_NAME: rocky8
DOCKERFILE_PATH: scripts/docker/$DOCKERFILE_NAME.DOCKERFILE
# IMAGE_TAG: $CI_REGISTRY_IMAGE/$BRANCH_NAME:$DOCKERFILE_NAME
IMAGE_TAG: $CI_REGISTRY/$ANSIBLE_PROJECT_PATH/$BRANCH_NAME:$DOCKERFILE_NAME
CACHE_TAG: ${IMAGE_TAG}-cache
before_script:
# Clone dss-ansible repo
- git config --global http.sslVerify false
- git config --global user.name "$CI_USERNAME"
- git config --global user.email "$CI_EMAIL"
- git clone https://$CI_USERNAME:$CI_TOKEN@$CI_SERVER_HOST/$ANSIBLE_PROJECT_PATH.git --branch $ANSIBLE_BRANCH .
# Install certs so buildkit can access Gitlab container registry
- echo "$SSI_ROOTCA_CERT" > /usr/local/share/ca-certificates/SSI-RootCA.crt
- echo "$SSI_ISSUINGCA_CERT" > /usr/local/share/ca-certificates/SSI-ISSUINGCA.crt
- echo "$MSL_ETX_CERT" > /usr/local/share/ca-certificates/msl-etx.samsung.com.crt
- update-ca-certificates --fresh > /dev/null
# Configure buildkitd.toml to use newly-installed certs
- |
cat <<EOF > /buildkitd.toml
[registry."$CI_REGISTRY"]
ca=["/etc/ssl/certs/ca-certificates.crt"]
EOF
# Initialize buildkit with custom config
- docker buildx create --driver=docker-container --name=buildkit-builder --use --config /buildkitd.toml
# Login to Gitlab container registry
- docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
script:
- |
docker buildx build \
--cache-from type=registry,ref=$CACHE_TAG \
--cache-to type=registry,ref=$CACHE_TAG \
--push \
--tag $IMAGE_TAG \
--file $DOCKERFILE_PATH . \
--provenance false
rules:
- !reference [.default_rules, merge_and_push]
- if: '$CI_PIPELINE_SOURCE == "parent_pipeline"'
when: always

10 changes: 10 additions & 0 deletions .gitlab/defaults.yml
@@ -0,0 +1,10 @@
include:
- project: dfs/dss/dss
ref: master
file: .gitlab/defaults.yml
- project: dfs/dss/dss
ref: master
file: .gitlab/sync-github.yml
- .gitlab/lint.yml
- .gitlab/build.yml
- .gitlab/deploy.yml
27 changes: 27 additions & 0 deletions .gitlab/deploy.yml
@@ -0,0 +1,27 @@
include: .gitlab/ansible.yml

deploy DSS with upstream dss-sdk artifacts:
extends: deploy DSS
stage: deploy
script:
- |
ansible-playbook -i $ANSIBLE_INVENTORY playbooks/download_artifacts.yml \
-e "download_artifacts=true" \
-e "artifacts_url=$MINIO_HOST_URL/dss-artifacts" \
-e "artifacts_branch=$BRANCH_NAME"
- rm -f artifacts/nkv-target*
- rm -f artifacts/nkv-sdk-bin*
- cp $CI_PROJECT_DIR/df_out/nkv-target-*.tgz artifacts/
- cp $CI_PROJECT_DIR/host_out/nkv-sdk-bin-*.tgz artifacts/
- ansible-playbook -i $ANSIBLE_INVENTORY playbooks/remove_dss_software.yml
- ansible-playbook -i $ANSIBLE_INVENTORY playbooks/deploy_dss_software.yml
- ansible-playbook -i $ANSIBLE_INVENTORY playbooks/test_nkv_test_cli.yml -e nkv_test_cli_test=suite -e nkv_test_cli_suite=suite003
- ansible-playbook -i $ANSIBLE_INVENTORY playbooks/test_s3_benchmark.yml
needs:
- build docker
- project: dfs/dss/dss-sdk
job: build dss-sdk
ref: $UPSTREAM_REF
artifacts: true
rules:
- if: $CI_PIPELINE_SOURCE == "parent_pipeline" && $CI_MERGE_REQUEST_SOURCE_PROJECT_PATH == "dfs/dss/dss-sdk"
8 changes: 8 additions & 0 deletions .gitlab/lint.yml
@@ -0,0 +1,8 @@
ansible-lint:
stage: lint
script: ansible-lint *
needs: []
rules:
- !reference [.default_rules, merge_and_push]
- if: '$CI_PIPELINE_SOURCE == "parent_pipeline"'
when: never
37 changes: 24 additions & 13 deletions README.md
@@ -506,7 +506,7 @@ This playbook will perform the following actions:
* Search for errors in all MinIO logs across all [hosts] / [servers]
* Search for errors in all target logs across all [targets] / [servers]

#### playbooks/deploy_datamover.yml
#### playbooks/deploy_client.yml

Execute this playbook to deploy DSS Client, including datamover, client library, and their dependencies.
Artifacts are deployed to hosts under the [clients] group.
@@ -539,7 +539,7 @@ This path can be changed by setting the `coredump_dir` var. see: /group_vars/all
Execute this playbook to download artifacts from the dss-artifacts S3 bucket.

By default, this playbook will download artifacts from the public AWS S3 dss-artifacts bucket (public HTTP URL).
The bucket URL can be overridden with the public URL of any S3-compatible bucket (eg: MinIO, DSS).
The bucket can be overridden with the public URL of any S3-compatible bucket (eg: MinIO, DSS)
Additionally, the branch name can also be overridden.
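
As a sketch, these overrides can be supplied as extra vars on the command line (the inventory file name and MinIO URL below are placeholders; the CI job in `.gitlab/ansible.yml` passes `$MINIO_HOST_URL` the same way):

```shell
# Download artifacts from a private S3-compatible bucket instead of
# the public AWS S3 dss-artifacts bucket, for a specific branch
ansible-playbook -i inv_example.ini playbooks/download_artifacts.yml \
  -e "download_artifacts=true" \
  -e "artifacts_url=http://minio.example.com:9000/dss-artifacts" \
  -e "artifacts_branch=master"
```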

#### playbooks/format_redeploy_dss_software.yml
@@ -656,7 +656,6 @@ Additional datamover vars:
* datamover_client_lib - Datamover client library
* datamover_logging_path - Path of datamover logs
* datamover_logging_level - Datamover logging level
* datamover_gcc_version - Datamover GCC version
* datamover_index_data_queue_size - Number of entries in datamover index queue
* datamover_awslib_log_debug - Enable or disable AWS lib debugging

@@ -776,7 +775,7 @@ iperf can be tuned by configuring the following vars (default values shown):

#### playbooks/test_nkv_test_cli.yml

Perform a basic nkv_test_cli test and report observed throughput.
By default, perform a basic nkv_test_cli test and report observed throughput.
This playbook will execute a suite of nkv_test_cli tests in order:

1. Put
@@ -787,17 +786,29 @@ This playbook will execute a suite of nkv_test_cli tests in order:

Upon test completion, throughput is reported for PUT and GET.

Optionally, this playbook can be used to execute a suite of regression test cases.
This can be done by changing the value of `nkv_test_cli_test` from `smoke` to `suite`.
The default set of test cases can be found in `roles/test_nkv_test_cli/vars/suite001.yml`.
You can create additional test suites using this file as a template.
You can specify your custom test suite by setting `nkv_test_cli_suite` to `your-test` (default: `suite001`).
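
A minimal invocation sketch (the inventory file name here is a placeholder; the deploy pipeline runs the same playbook with `nkv_test_cli_suite=suite003`):

```shell
# Run the regression test suite instead of the default smoke test
ansible-playbook -i inv_example.ini playbooks/test_nkv_test_cli.yml \
  -e nkv_test_cli_test=suite \
  -e nkv_test_cli_suite=suite001
```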

nkv_test_cli can be tuned by configuring the following vars (default values shown):

| Var name | Default | Description |
| ------------------------------ | ------- | --------------------------------------------------------------------------- |
| nkv_test_cli_keysize | 60 | Key size in bytes. Max size = 255 |
| nkv_test_cli_valsize | 1048576 | Value size in bytes. Max size = 1048576 |
| nkv_test_cli_threads | 128 | Number of threads |
| nkv_test_cli_objects | 2000 | Number of objects for each thread (total objects = objects x threads) |
| nkv_test_cli_vm_objects | 100 | Number of objects if host is a VM (default reduced due to lower throughput) |
| nkv_test_cli_async_timeout | 600 | Async timeout in seconds (increase for larger dataset, or slow throughput) |
| nkv_test_cli_async_retry_delay | 5 | Async retry delay in seconds |
| Var name | Default | Description |
| ------------------------------ | ------------ | ------------------------------------------------------------------------------------- |
| nkv_test_cli_port | 1030 | Port used by nkv_test_cli to communicate with subsystem |
| nkv_test_cli_prefix            | meta/ansible | KV prefix used to write objects. Must begin with `meta/`                              |
| nkv_test_cli_keysize | 60 | Key size in bytes. Max size = 255 |
| nkv_test_cli_valsize | 1048576 | Value size in bytes. Max size = 1048576 |
| nkv_test_cli_threads | 128 | Number of threads |
| nkv_test_cli_objects | 2000 | Number of objects for each thread (total objects = objects x threads) |
| nkv_test_cli_async_timeout | 600 | Async timeout in seconds (increase for larger dataset, or slow throughput) |
| nkv_test_cli_async_retry_delay | 5 | Async retry delay in seconds |
| nkv_test_cli_test | smoke | Run standard "smoke" test. Change to "suite" to run regression test suite |
| nkv_test_cli_suite | suite001 | Name of test suite to run. Corresponds to suite vars in roles/test_nkv_test_cli/vars/ |
| nkv_test_cli_integrity | false | Run nkv_test_cli in data integrity mode |
| nkv_test_cli_mixed_io | false | Run nkv_test_cli with "small meta io before doing a big io" |
| nkv_test_cli_simulate_minio | false | Run nkv_test_cli with "IO pattern similar to MinIO" |

#### playbooks/test_ping.yml

13 changes: 9 additions & 4 deletions group_vars/all.yml
@@ -159,7 +159,6 @@
# datamover_operation: DEL
# datamover_operation: TEST
# datamover_dryrun: false
# datamover_compaction: true
# datamover_prefix: ''
# datamover_debug: false
# datamover_data_integrity: true
@@ -190,10 +189,9 @@
# datamover_client_lib: dss_client
# datamover_logging_path: /var/log/dss
# datamover_logging_level: INFO
# datamover_gcc_version: 5.1
# datamover_index_data_queue_size: 50000
# datamover_awslib_log_debug: 0
# datamover_compaction: "yes"
# datamover_compaction: "no"

### NTP defaults
# ntp_enabled: true
@@ -225,13 +223,20 @@
# iperf_duration: 10

### nkv_test_cli defaults
# nkv_test_cli_port: 1030
# nkv_test_cli_prefix: meta/ansible
# nkv_test_cli_keysize: 60
# nkv_test_cli_valsize: 1048576
# nkv_test_cli_threads: 128
# nkv_test_cli_objects: 2000
# nkv_test_cli_vm_objects: 100
# nkv_test_cli_async_timeout: 600
# nkv_test_cli_async_retry_delay: 5
# nkv_test_cli_test: smoke
# nkv_test_cli_test: suite
# nkv_test_cli_suite: suite001
# nkv_test_cli_integrity: false
# nkv_test_cli_mixed_io: false
# nkv_test_cli_simulate_minio: false

### EPEL defaults
# skip_epel: false
1 change: 0 additions & 1 deletion playbooks/deploy_client.yml
@@ -95,7 +95,6 @@
hosts: clients
gather_facts: false
roles:
- deploy_gcc
- deploy_aws_sdk_cpp
- deploy_dss_host
- deploy_client_library
1 change: 0 additions & 1 deletion playbooks/deploy_dss_software.yml
@@ -101,7 +101,6 @@
- targets
gather_facts: false
roles:
- deploy_gcc
- deploy_dss_target

# - name: Deploy etcd Gateway
1 change: 0 additions & 1 deletion playbooks/start_datamover.yml
@@ -91,7 +91,6 @@
# * datamover_client_lib - Datamover client library
# * datamover_logging_path - Path of datamover logs
# * datamover_logging_level - Datamover logging level
# * datamover_gcc_version - Datamover GCC version
# * datamover_index_data_queue_size - Number of entries in datamover index queue
# * datamover_awslib_log_debug - Enable or disable AWS lib debugging
#
32 changes: 22 additions & 10 deletions playbooks/test_nkv_test_cli.yml
@@ -33,7 +33,7 @@
#
# #### playbooks/test_nkv_test_cli.yml
#
# Perform a basic nkv_test_cli test and report observed throughput.
# By default, perform a basic nkv_test_cli test and report observed throughput.
# This playbook will execute a suite of nkv_test_cli tests in order:
#
# 1. Put
@@ -44,17 +44,29 @@
#
# Upon test completion, throughput is reported for PUT and GET.
#
# Optionally, this playbook can be used to execute a suite of regression test cases.
# This can be done by changing the value of `nkv_test_cli_test` from `smoke` to `suite`.
# The default set of test cases can be found in `roles/test_nkv_test_cli/vars/suite001.yml`.
# You can create additional test suites using this file as a template.
# You can specify your custom test suite by setting `nkv_test_cli_suite` to `your-test` (default: `suite001`).
#
# nkv_test_cli can be tuned by configuring the following vars (default values shown):
#
# | Var name | Default | Description |
# | ------------------------------ | ------- | --------------------------------------------------------------------------- |
# | nkv_test_cli_keysize | 60 | Key size in bytes. Max size = 255 |
# | nkv_test_cli_valsize | 1048576 | Value size in bytes. Max size = 1048576 |
# | nkv_test_cli_threads | 128 | Number of threads |
# | nkv_test_cli_objects | 2000 | Number of objects for each thread (total objects = objects x threads) |
# | nkv_test_cli_vm_objects | 100 | Number of objects if host is a VM (default reduced due to lower throughput) |
# | nkv_test_cli_async_timeout | 600 | Async timeout in seconds (increase for larger dataset, or slow throughput) |
# | nkv_test_cli_async_retry_delay | 5 | Async retry delay in seconds |
# | Var name | Default | Description |
# | ------------------------------ | ------------ | ------------------------------------------------------------------------------------- |
# | nkv_test_cli_port | 1030 | Port used by nkv_test_cli to communicate with subsystem |
# | nkv_test_cli_prefix            | meta/ansible | KV prefix used to write objects. Must begin with `meta/`                              |
# | nkv_test_cli_keysize | 60 | Key size in bytes. Max size = 255 |
# | nkv_test_cli_valsize | 1048576 | Value size in bytes. Max size = 1048576 |
# | nkv_test_cli_threads | 128 | Number of threads |
# | nkv_test_cli_objects | 2000 | Number of objects for each thread (total objects = objects x threads) |
# | nkv_test_cli_async_timeout | 600 | Async timeout in seconds (increase for larger dataset, or slow throughput) |
# | nkv_test_cli_async_retry_delay | 5 | Async retry delay in seconds |
# | nkv_test_cli_test | smoke | Run standard "smoke" test. Change to "suite" to run regression test suite |
# | nkv_test_cli_suite | suite001 | Name of test suite to run. Corresponds to suite vars in roles/test_nkv_test_cli/vars/ |
# | nkv_test_cli_integrity | false | Run nkv_test_cli in data integrity mode |
# | nkv_test_cli_mixed_io | false | Run nkv_test_cli with "small meta io before doing a big io" |
# | nkv_test_cli_simulate_minio | false | Run nkv_test_cli with "IO pattern similar to MinIO" |

- name: Validate ansible versions and dependencies
hosts: localhost
1 change: 1 addition & 0 deletions roles/cleanup_dss_minio/defaults/main.yml
@@ -31,6 +31,7 @@

### Path defaults
dss_dir: /usr/dss
dss_log_dir: /var/log/dss
nkv_sdk_dir: "{{ dss_dir }}/nkv-sdk"
nkv_sdk_bin_dir: "{{ nkv_sdk_dir }}/bin"
nkv_sdk_conf_dir: "{{ nkv_sdk_dir }}/conf"