-
-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce CI usage #83
Reduce CI usage #83
Conversation
Thanks, I wish I understood the tradeoffs here better. For scriv it won't matter much, because there aren't version-specific concerns, but it's easier for me to use similar patterns across projects. I definitely would like to know more about how to speed up CI runs in general. I'll have to study what you did here to understand. |
Sure thing! My hope is that the individual commits help isolate the changes so they're easier to interpret and reproduce. My goal was simply "reduce total CI usage, and try to keep the workflow duration roughly equivalent if possible". If you'd like to hop on a conference call to discuss the approach and techniques here, just shoot me an email and we can coordinate a good time. 👍 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ned, I added commentary to many of the changes here. I hope this is helpful. Let me know if you have questions or suggestions!
.github/workflows/tests.yml
Outdated
python-version: | ||
- "3.7" | ||
- "3.8" | ||
- "3.9" | ||
- "3.10" | ||
- "3.11" | ||
- "pypy-3.7" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This section of the matrix is removed because the setup-python
action can now install all of these versions simultaneously, and Tox can take detect and run all of these in one go.
This reduces the matrix down from a cross-product of operating systems and Python versions (18x runs) to merely a list of three operating systems (3x runs).
The advantage here is that slow, duplicate operations that must happen just to get the operating system environment to a functional state now happen once per operating system (3x) instead of once per OS per Python version (18x).
For example, looking at a recent run of the tests against the main
branch, these are steps that had to happen multiple times just to get to a usable state where Tox could run:
- Checkout the repo (13s)
- Set up Python (15s)
- Install dependencies (18s)
- Install pandoc (31s)
Moving these six Python versions from the matrix to the setup-python
action eliminates this time multiplier.
- name: "Restore cache" | ||
id: "restore-cache" | ||
uses: "actions/cache@v3" | ||
with: | ||
path: | | ||
.tox/ | ||
.venv/ | ||
key: "cache-python-${{ steps.setup-python.outputs.python-version }}-os-${{ runner.os }}-hash-${{ hashFiles('tox.ini', 'requirements/*.txt') }}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By caching the .tox/
and .venv/
directories, the virtual environment and the Tox environments can be restored as filesystem operations, rather than as an installation of Python packages.
The cache key includes a hash of tox.ini
and all of the requirements/*.txt
files changes. This ensures that a cache miss occurs if any of those files changes.
Note that it may be helpful to modify the cache key in the future. In particular, if the last-listed Python version above ("3.11" currently) doesn't change versions but any of the other versions updates (say, from "3.10.x" to "3.10.y"), then:
- This would result in a cache hit.
- However, Tox would detect that the 3.10 version had changed and would rebuild the
py310
test environment. - The cache would not be updated with this new environment, resulting in an increase in test times.
This situation can be improved by adding a step that runs each Python version and writes its exact version to a file. Then, this file could be included in the list of files passed to hashFiles()
.
I haven't learned how to do this yet, so this improvement isn't included. The only thing that will happen in this situation is that Tox has to discard the Python 3.10.x test environment and rebuild it with 3.10.y, so test times will increase until the cache is invalidated by some other means, like when the requirements/*.txt
files are updated.
- name: "Identify venv path" | ||
shell: "bash" | ||
run: "echo 'venv-path=${{ runner.os == 'Windows' && '.venv/Scripts' || '.venv/bin' }}' >> $GITHUB_ENV" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This step establishes the relative path to the Python executable and writes it to $GITHUB_ENV
, which expects INI-like lines that establish an environment variable name and its value. Subsequent steps will have access to a venv-path
environment variable.
Although the default shell is specified above to be bash
, note that the forward slashes would also work in Windows' default shell, Powershell, so no changes are needed here.
However, I explicitly included the shell: "bash"
configuration because the test suite failed for me locally with "path not found" errors while running in Powershell, so I'm anticipating the possibility that using bash as the default shell even when running on Windows is masking a failure. Investigating that was not in-scope for this work, but I tried to anticipate a future change to running Powershell on Windows in CI.
- name: "Install dependencies" | ||
if: "steps.restore-cache.outputs.cache-hit == false" | ||
run: | | ||
python -m pip install -U setuptools | ||
python -m pip install -r requirements/tox.txt | ||
python -m venv .venv | ||
${{ env.venv-path }}/python -m pip install -U setuptools | ||
${{ env.venv-path }}/python -m pip install -r requirements/tox.txt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This step is expensive, so it only happens when there's a cache miss. A virtual environment is used because it's far easier to cache a virtual environment at a path of our choosing than to try caching items in an OS-specific location.
run: | | ||
python -m tox | ||
${{ env.venv-path }}/python -m tox -m ci-tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Tox invocation changes from a bare run of Tox (which was intercepted by tox-gh and transformed to a single run of the default test environment) to use Tox 4's new labels. This allows the tox.ini
file to control which environments the ci-tests
label is associated with.
tox.ini
Outdated
labels = | ||
ci-tests = py{37,38,39,310,311,py3} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When the list of supported Python versions changes in envlist
, above, the ci-tests
label should also be updated.
This uses Tox's generative environment syntax.
tox.ini
Outdated
package = wheel | ||
wheel_build_env = build_wheel |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By default, Tox builds a .tar.gz
file once...but when pip installs that file into the virtual environment Tox creates for testing, pip will first convert the file to a wheel. (This can be reviewed by running Tox with -vvv
.)
The package
directive tells Tox to create a wheel instead of a .tar.gz
file, and the wheel_build_env
directive tells Tox that each environment should use a hard-coded build environment name (its default is to use a name that depends on the Python version, iirc).
("build_wheel"
could be any static name, it just happens that I chose "build_wheel"
.)
tox.ini
Outdated
|
||
[gh] | ||
# https://pypi.org/project/tox-gh/ | ||
python = | ||
3.7 = py37 | ||
3.8 = py38 | ||
3.9 = py39 | ||
3.10 = py310 | ||
3.11 = py311 | ||
pypy3 = pypy3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As tox-gh is no longer needed for associating Python versions with Tox environment names, this bit of config is ripped out.
Hi Ned! It looks like this PR has gotten into a stuck state. If you're leaning towards rejecting it, it's fine to close. If you're still interested in learning about what the PR does, let me know; I can be available for a conference call to discuss these changes, and the techniques it's using. And finally, if you'd like me to break this up into some smaller chunks that are easier to digest, let me know. For example, caching virtual environments will be an easy win. |
@kurtmckee I'm coming back to this. I think I will mostly take what you have here, as a learning experience and to use a different pattern than my other projects. You've taught me about tox labels if nothing else! :) |
I rebased onto main in https:/nedbat/scriv/tree/nedbat/rebase-83. One small change: the tox docs suggested using |
Nice! I didn't realize there was a recommendation for the Do you want me to rebase this work on |
I can merge my branch if you like instead of asking you to redo the rebasing. BTW: thanks for the changelog entries, but in my opinion, these sorts of internal developer-oriented details don't need a mention. |
I can rebase, and simultaneously excise the changelog fragments and migrate to |
b69e71c
to
1086eee
Compare
@nedbat I think this addresses the issues you identified. I'll double-check the results of the CI run in the morning. |
Looks good, thanks! |
I was just sitting down to review, but you beat me to it! 😀 |
This PR reduces the total usage in CI (that is, the sum of time across all jobs that run).
Currently, CI usage is ~30 minutes.
Without a cache hit, that number is dropped to ~15 minutes.
With a cache hit, that number drops to ~10 minutes.
The CI duration (the amount of time for the entire workflow to run while you're staring at a wall clock) stays roughly the same (~5 minutes).
This is achieved by reducing the number of test runners to just one of each OS testing all Python versions simultaneously. (The
setup-python
action now supports setting up multiple Python versions!) This reduces the duplication of activities that are unrelated to the actual testing. For example, checking out the repo and installing pandoc.In addition, time is saved by building a wheel once, rather than building a
.tar.gz
that each Python version-specificpip
invocation quietly converts to a wheel in the background.Finally, a virtual environment is created and cached, and the tox environments are cached as well. This reduces the setup time required to actually launch the tests.
Note: The cache of the tox environment may get rebuilt by tox when a Python version has a hotfix bump, like 3.10.x to 3.10.y. A possible improvement to this would be to write the Python versions to a file and include that in the call to
hashFiles()
.