[BENCHMARK] Onboarding OSB to Continuous Builds for Dockerhub Staging and TestPyPi + Automatic Copy-Over #4723

Open
IanHoang opened this issue May 23, 2024 · 17 comments
Labels
cicd enhancement New Enhancement

Comments

@IanHoang

IanHoang commented May 23, 2024

Is your feature request related to a problem? Please describe

This issue addresses two points: continuous builds on commits in the OpenSearch Benchmark (OSB) repository and automating copy-overs.

Continuous Builds for OSB Commits: OSB users have expressed interest in "staging" environments where they can run OSB with the latest commits without waiting for the next official release. Currently, users can git clone the OSB repository and run python3 -m pip install -e . to test the latest commits and changes, but this is tedious.

Copy-overs: OSB currently has a GitHub Actions workflow that is triggered when a maintainer pushes a tag. That workflow triggers a Jenkins workflow to release OSB to PyPi and to OpenSearch's Dockerhub Staging account. For each release, OSB maintainers have to engage the OpenSearch-Build on-call to copy the staging account's image over to OpenSearch's Dockerhub Production account and ECR account.

Describe the solution you'd like

Each time a commit is merged into the OSB main branch or the latest minor version branch, the OSB Dockerhub Staging image for that branch can be overwritten to include the latest changes.

OSB Jenkins file should also be updated to include a one-click copy-over so that OpenSearch Build on-call does not need to be engaged to copy over Dockerhub staging images to Dockerhub production.

We are open to other ideas and solutions!

Describe alternatives you've considered

No response

Additional context

No response

@IanHoang IanHoang added enhancement New Enhancement untriaged Issues that have not yet been triaged labels May 23, 2024
@peterzhuamazon
Member

peterzhuamazon commented May 23, 2024

As discussed in the meeting, there are multiple approaches. I will list a few here:

1. Move to GitHub Actions:

  • Move both the pypi build and the docker image build steps to GitHub Actions
  • Use AWS Secrets Manager to hold the credentials, and retrieve them only when the workflow runs
  • Each commit will trigger a GitHub Action to build the pypi package, build the docker image, and publish it to the staging registry
  • When a GitHub repo tag is cut for an rc, pypi will be published and the docker image will be built with the published pypi binary and pushed to staging, followed by an automatic copy of the image from staging to production

2. Use Jenkins for all:

  • As of now, both the pypi build/publish and the docker image build steps use existing Jenkins pipelines
  • Instead of using GitHub Actions, just change the webhook to trigger Jenkins on every commit push/merge
  • Tweak the workflow so that pypi is only built, not published, and the built package is then used to build the docker staging image and push it to the staging registry
  • When a GitHub repo tag is cut for an rc, pypi will be published and the docker image will be built with the published pypi binary and pushed to staging, followed by an automatic copy of the image from staging to production
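
Both approaches share the same trigger logic: a commit on a tracked branch only builds and pushes to staging, while a tag also publishes and copies over. A minimal sketch of that decision, assuming Git refs in the usual refs/heads/... and refs/tags/... form; the step names and the plan_steps function are hypothetical placeholders, not existing jobs:

```python
import re

# Hypothetical step names; the real jobs would live in GitHub Actions or Jenkins.
COMMIT_STEPS = ["build_wheel", "build_docker_image", "push_to_staging"]
TAG_STEPS = [
    "build_wheel",
    "publish_to_pypi",
    "build_docker_image_from_pypi",
    "push_to_staging",
    "copy_staging_to_production",
]

TAG_REF = re.compile(r"^refs/tags/(?P<tag>.+)$")
BRANCH_REF = re.compile(r"^refs/heads/(?P<branch>.+)$")


def plan_steps(git_ref, tracked_branches=("main",)):
    """Return the ordered steps to run for a pushed Git ref."""
    if TAG_REF.match(git_ref):
        # A tag means a release: publish, then build from the published package.
        return list(TAG_STEPS)
    branch = BRANCH_REF.match(git_ref)
    if branch and branch.group("branch") in tracked_branches:
        # A merge into a tracked branch only refreshes the staging image.
        return list(COMMIT_STEPS)
    # Any other ref (feature branches, for example) triggers nothing.
    return []


if __name__ == "__main__":
    for ref in ("refs/heads/main", "refs/tags/1.6.0", "refs/heads/feature-x"):
        print(ref, "->", plan_steps(ref))
```

The latest minor version branch could be added through tracked_branches; its name is left to the caller because the thread does not state the branch naming scheme.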

Thanks.

@prudhvigodithi
Collaborator

Regarding,

OSB Jenkins file should also be updated to include a one-click copy-over so that OpenSearch Build on-call does not need to be engaged to copy over Dockerhub staging images to Dockerhub production.

We can leverage something like https://github.com/opensearch-project/opensearch-k8s-operator/blob/main/jenkins/release.jenkinsfile#L85-L127. The operator docker images are released this way today; please take a look.
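
For illustration only, a copy-over reduces to pulling the staging image, retagging it, and pushing it to production. The sketch below is not the operator's Jenkinsfile logic; the copy_over_commands function and the registry names in the usage example are hypothetical, and only the standard docker pull, docker tag, and docker push commands are assumed:

```python
import shlex


def copy_over_commands(image, tag, staging_registry, production_registry):
    """Return the docker commands that copy one image tag from staging to production."""
    source = f"{staging_registry}/{image}:{tag}"
    target = f"{production_registry}/{image}:{tag}"
    return [
        ["docker", "pull", source],         # fetch the staging image
        ["docker", "tag", source, target],  # give it the production name
        ["docker", "push", target],         # publish it to production
    ]


if __name__ == "__main__":
    # Registry names below are placeholders, not the real OpenSearch accounts.
    for cmd in copy_over_commands(
        "opensearch-benchmark", "1.6.0", "staging.example.org", "prod.example.org"
    ):
        print(shlex.join(cmd))
```

Note that pull, tag, and push move only the image for the host platform, so a multi-arch image would need a manifest-aware copy instead.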

@rishabh6788
Collaborator

I did a bit of analysis, and pushing to testpypi on every PR merge is not as straightforward as it seems: once a package has been published with a specific name and version, it cannot be overwritten; see https://peps.python.org/pep-0427/#file-name-convention.

So if we want to publish on every push, we will have to add another identifier, such as a build ID, in the setup.py file so that different builds of the same release version can be published.
One solution is to add an epoch timestamp to the version string, but we don't want to do that when publishing to prod.
Open to suggestions. @IanHoang @gaiksaya @peterzhuamazon
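
One possible shape for this, as a sketch rather than a decided approach: append a PEP 440 developmental-release segment (.devN) only for staging builds and leave the version untouched for prod. The build_version function and its parameters are hypothetical; OSB's setup.py may compute its version differently.

```python
import re
import time

# Accepts plain release versions such as "1.6.0" (digits separated by dots).
RELEASE_VERSION = re.compile(r"^\d+(\.\d+)*$")


def build_version(base_version, build_id=None):
    """Return base_version for prod, or base_version + '.dev<build_id>' for staging."""
    if not RELEASE_VERSION.match(base_version):
        raise ValueError(f"not a plain release version: {base_version!r}")
    if build_id is None:
        # Production publish: keep the version exactly as released.
        return base_version
    if build_id < 0:
        raise ValueError("build_id must be a non-negative integer")
    # PEP 440 developmental release; it sorts before the final release.
    return f"{base_version}.dev{build_id}"


if __name__ == "__main__":
    print(build_version("1.6.0"))                    # prod publish
    print(build_version("1.6.0", int(time.time())))  # staging publish with an epoch timestamp
```

Because a .devN version sorts before the final release under PEP 440, pip would only select such a build when pre-releases are allowed or the exact version is pinned.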

@gaiksaya
Member

How about on-demand publish to testpypi? Does it have to be per PR merge?
Something like RC1, RC2, etc., when we have a set of changes in and need to test them? Something similar to what we follow in distribution releases?
Docker can be per PR merge. So if the user cannot wait till RC1/RC2, they can still use docker images with recent changes?
WDYT @IanHoang?

@rishabh6788
Collaborator

The current docker file does a pip install opensearch-benchmark, so even if we change it to pip install --index-url https://test.pypi.org/simple/ opensearch-benchmark, it wouldn't work on each push, as pypi will only hold the last pushed artifact.
For the staging dockerfile, we can build it as we used to, by copying the code inside the dockerfile and then doing a build and install there.

@IanHoang
Author

IanHoang commented Jul 1, 2024

@rishabh6788 @gaiksaya @peterzhuamazon Sorry for the late reply. If docker images are already being continuously built, there might not be a need to have continuous builds to TestPyPi, especially if we're already doing an official minor release on PyPi each month.

We can just focus on implementing the continuous builds (new build on every PR merged into main) for OSB's docker staging account. That way, if users can't wait until the next monthly release on PyPi, they can resort to using OSB's Docker Staging images to test out recent changes like Sayali suggested. Let me know your thoughts?

@gaiksaya
Member

gaiksaya commented Jul 5, 2024

Hi @IanHoang ,

I believe @rishabh6788 had pointed out to me earlier (please correct if I am wrong) that even docker images are using pypi artifacts to install opensearch-benchmark on the container. Can you point us to the docker images build code base?
Thanks!

@IanHoang
Author

IanHoang commented Jul 9, 2024

@gaiksaya here is the docker image from the OSB code base. We could circumvent this by adding a line that git clones the repository from GitHub and runs pip3 install -e ., which installs OSB in development mode. We could add a parameter that, when provided, uses the development version of OSB rather than the PyPi version. Let me know your thoughts?

@gaiksaya
Member

gaiksaya commented Jul 9, 2024

That sounds great! Pypi would not be required then. We need to make the docker image flexible enough to switch behavior between development and prod, but it looks doable.
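
A rough sketch of the switch being discussed, written in Python for illustration. In practice the parameter would more likely be a Docker build argument; the install_commands function and the checkout_dir default are hypothetical names:

```python
OSB_REPO_URL = "https://github.com/opensearch-project/opensearch-benchmark.git"


def install_commands(use_dev_build, checkout_dir="opensearch-benchmark"):
    """Return the commands that install OSB inside the image."""
    if use_dev_build:
        # Development image: clone the repository and install it in editable mode.
        return [
            ["git", "clone", OSB_REPO_URL, checkout_dir],
            ["pip3", "install", "-e", checkout_dir],
        ]
    # Production image: install the released package from PyPi.
    return [["pip3", "install", "opensearch-benchmark"]]


if __name__ == "__main__":
    for mode in (False, True):
        print("dev" if mode else "prod", install_commands(mode))
```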

@peterzhuamazon peterzhuamazon self-assigned this Jul 25, 2024
@peterzhuamazon
Member

Hi,

After a quick discussion with @gkamat and @IanHoang, we will try to add these steps to the GitHub Actions workflow:

  1. Build opensearch-benchmark offline
  2. Use the offline-built binary to build the docker image
  3. Run the tests either directly as part of docker buildx, or afterwards with the unit tests and integration tests (integTest) of opensearch-benchmark

We can start with the x64 image now; once it works, we can expand to multi-arch.
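
The three steps could be sketched as the command sequence below. This assumes the build package (python3 -m build) produces the wheel, that the Dockerfile accepts a build argument pointing at the wheel directory (WHEEL_DIR is a hypothetical name), and that the image allows an arbitrary test command to be passed to docker run; none of these are confirmed in this thread.

```python
import shlex


def ci_commands(image_tag, test_command, dist_dir="dist"):
    """Return the build, image, and test commands in the order they would run."""
    return [
        # 1. Build the wheel locally, without publishing it anywhere.
        ["python3", "-m", "build", "--wheel", "--outdir", dist_dir],
        # 2. Build the image from the local wheel (WHEEL_DIR is a hypothetical build arg).
        ["docker", "build", "--build-arg", f"WHEEL_DIR={dist_dir}", "--tag", image_tag, "."],
        # 3. Run the tests in a throwaway container started from that image.
        ["docker", "run", "--rm", image_tag, *test_command],
    ]


if __name__ == "__main__":
    # The tag and test command are placeholders for illustration.
    for cmd in ci_commands("opensearch-benchmark:staging", ["python3", "-m", "pytest"]):
        print(shlex.join(cmd))
```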

Thanks.

@peterzhuamazon
Member

It seems a docker build is already set up in the benchmark repo:
https://github.com/opensearch-project/opensearch-benchmark/blob/main/.github/workflows/docker.yml

@peterzhuamazon
Member

According to Ian, it is outdated, so I will remove it and add the updated ones using the build repo.

@peterzhuamazon
Member

peterzhuamazon commented Jul 31, 2024

Since it is not possible to run the unit tests within docker, we need to start the docker container directly at the host level.
docker-build-test.log

GitHub Actions will support new arm64 Linux runners for open source repositories by the end of the year:
https://github.blog/news-insights/product-news/arm64-on-github-actions-powering-faster-more-efficient-build-systems/

We will add the test with the x64 runner first.
Thanks.

@peterzhuamazon
Member

I talked with @gkamat, and he will take over the next step of adding integration tests (IT).

@peterzhuamazon peterzhuamazon removed their assignment Aug 12, 2024
@zelinh
Member

zelinh commented Aug 16, 2024

We could publish directly to TestPypi through GHA once we have added an authorized publisher on TestPypi. We don't have to use Jenkins for building or releasing.
e.g. https://github.com/opensearch-project/opensearch-py/blob/6382c1570cea3045b37e77ffca955b62f377cc7c/.github/workflows/release-drafter.yml#L60-L61
It seems benchmark has not been onboarded to the GHA pypi release yet, though; I remember we have this issue: opensearch-project/opensearch-benchmark#451

@peterzhuamazon
Member

We could directly publish to TestPypi through GHA once we added authorized publisher on TestPypi. We don't have to use Jenkins for building or releasing purpose. e.g. https://github.com/opensearch-project/opensearch-py/blob/6382c1570cea3045b37e77ffca955b62f377cc7c/.github/workflows/release-drafter.yml#L60-L61 Seems like for benchmark we haven't onboarded to GHA pypi release though, I remember we have this issue opensearch-project/opensearch-benchmark#451

We are no longer publishing to test pypi, as a locally built wheel is good enough for testing.
Also, pushing to pypi requires changing the version every time.

Thanks.
