WIP: [ci] use XCode 14.1 on macOS-latest gcc jobs (fixes #5589) #5588

Closed
wants to merge 19 commits into from


jameslamb
Collaborator

@jameslamb jameslamb commented Nov 15, 2022

Fixes #5589.

How this helps with #5587

Installs https://www.html-tidy.org/ in the (debian, r-devel, clang) CI job, to resolve an R CMD check note. I think this also makes that job more closely match the environment CRAN will use for that check, so I've proposed a similar change upstream: r-hub/rhub-linux-builders#64.

If that r-hub PR is merged at some point, we can revert this one. But for now, this PR unblocks LightGBM's CI.

UPDATE: after this PR was opened, the R-devel debian issue was fixed by an upstream change: #5587 (comment)

How this helps with #5589

Manually switches from XCode 14.0 (the default) to XCode 14.1 on macOS-latest GitHub Actions jobs using gcc, as suggested in https://github.com/orgs/Homebrew/discussions/3659#discussioncomment-3936743.

That can be reverted in the future whenever XCode 14.1+ becomes the default on macOS-latest GitHub Actions jobs.
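
For illustration, here is a minimal sketch of what that switch amounts to. It assumes each Xcode version on the runner is installed under `/Applications/Xcode_<version>.app`; that path convention and the helper names `xcode_select_command` and `switch_xcode` are assumptions for this example, not code from this PR.

```python
import os
import subprocess


def xcode_select_command(version, apps_dir="/Applications"):
    """Build the command that makes one Xcode install the active one.

    The 'Xcode_<version>.app' naming is an assumption about the runner image.
    """
    developer_dir = f"{apps_dir}/Xcode_{version}.app/Contents/Developer"
    return ["sudo", "xcode-select", "-s", developer_dir]


def switch_xcode(version, apps_dir="/Applications"):
    """Switch to the requested Xcode if it is installed.

    Returns True if the switch was run, False if that version is not present.
    """
    cmd = xcode_select_command(version, apps_dir)
    if not os.path.isdir(cmd[-1]):
        # Nothing to switch to (e.g. not on macOS, or version not installed).
        return False
    subprocess.run(cmd, check=True)
    return True
```

Calling `switch_xcode("14.1")` before the gcc build step would be the equivalent of the manual switch described above; on a machine without that install it does nothing and returns `False`.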

@jameslamb jameslamb changed the title WIP: [ci] install 'tidy' in debian r-devel clang job (fixes #5587) WIP: [ci] install 'tidy' in debian r-devel clang job (fixes #5587, fixes #5589) Nov 15, 2022
@jameslamb jameslamb changed the title WIP: [ci] install 'tidy' in debian r-devel clang job (fixes #5587, fixes #5589) [ci] install 'tidy' in debian r-devel clang job (fixes #5587, fixes #5589) Nov 16, 2022
@jameslamb jameslamb marked this pull request as ready for review November 16, 2022 05:10
@jameslamb
Collaborator Author

Ok I think this is ready for review and will unblock LightGBM's CI.

@jameslamb jameslamb changed the title [ci] install 'tidy' in debian r-devel clang job (fixes #5587, fixes #5589) [ci] us XCode 14.1 on macOS-latest gcc jobs (fixes #5589) Nov 16, 2022
@jameslamb jameslamb changed the title [ci] us XCode 14.1 on macOS-latest gcc jobs (fixes #5589) [ci] use XCode 14.1 on macOS-latest gcc jobs (fixes #5589) Nov 16, 2022
@jameslamb
Collaborator Author

A few of the Linux jobs on Azure DevOps (Ubuntu 14.04) are failing with segfaults.

(build link)

../tests/c_api_test/test_.py ..                                          [  0%]
../tests/python_package_test/test_basic.py ............................. [  4%]
.............................................                            [ 10%]
../tests/python_package_test/test_callback.py ............               [ 11%]
../tests/python_package_test/test_consistency.py ......                  [ 12%]
../tests/python_package_test/test_dask.py Fatal Python error: Bus error

...

/__w/1/s/.ci/test.sh: line 177:  1664 Bus error               (core dumped) pytest $BUILD_DIRECTORY/tests
Full logs:
../tests/c_api_test/test_.py ..                                          [  0%]
../tests/python_package_test/test_basic.py ............................. [  4%]
.............................................                            [ 10%]
../tests/python_package_test/test_callback.py ............               [ 11%]
../tests/python_package_test/test_consistency.py ......                  [ 12%]
../tests/python_package_test/test_dask.py Fatal Python error: Bus error

Thread 0x00007f4a52ffd700 (most recent call first):
  File "/opt/miniforge/envs/test-env/lib/python3.8/multiprocessing/popen_fork.py", line 27 in poll
  File "/opt/miniforge/envs/test-env/lib/python3.8/multiprocessing/popen_fork.py", line 47 in wait
  File "/opt/miniforge/envs/test-env/lib/python3.8/multiprocessing/process.py", line 149 in join
  File "/opt/miniforge/envs/test-env/lib/python3.8/site-packages/distributed/process.py", line 234 in _watch_process
  File "/opt/miniforge/envs/test-env/lib/python3.8/threading.py", line 870 in run
  File "/opt/miniforge/envs/test-env/lib/python3.8/threading.py", line 932 in _bootstrap_inner
  File "/opt/miniforge/envs/test-env/lib/python3.8/threading.py", line 890 in _bootstrap

Thread 0x00007f4a537fe700 (most recent call first):
  File "/opt/miniforge/envs/test-env/lib/python3.8/threading.py", line 302 in wait
  File "/opt/miniforge/envs/test-env/lib/python3.8/queue.py", line 170 in get
  File "/opt/miniforge/envs/test-env/lib/python3.8/site-packages/distributed/process.py", line 214 in _watch_message_queue
  File "/opt/miniforge/envs/test-env/lib/python3.8/threading.py", line 870 in run
  File "/opt/miniforge/envs/test-env/lib/python3.8/threading.py", line 932 in _bootstrap_inner
  File "/opt/miniforge/envs/test-env/lib/python3.8/threading.py", line 890 in _bootstrap

Thread 0x00007f4a53fff700 (most recent call first):
  File "/opt/miniforge/envs/test-env/lib/python3.8/multiprocessing/popen_fork.py", line 27 in poll
  File "/opt/miniforge/envs/test-env/lib/python3.8/multiprocessing/popen_fork.py", line 47 in wait
  File "/opt/miniforge/envs/test-env/lib/python3.8/multiprocessing/process.py", line 149 in join
  File "/opt/miniforge/envs/test-env/lib/python3.8/site-packages/distributed/process.py", line 234 in _watch_process
  File "/opt/miniforge/envs/test-env/lib/python3.8/threading.py", line 870 in run

  File "/opt/miniforge/envs/test-env/lib/python3.8/inspect.py", line 997 in getsource
  File "/opt/miniforge/envs/test-env/lib/python3.8/site-packages/distributed/client.py", line 2830 in _get_computation_code
  File "/opt/miniforge/envs/test-env/lib/python3.8/site-packages/distributed/client.py", line 2902 in _graph_to_futures
  File "/opt/miniforge/envs/test-env/lib/python3.8/site-packages/distributed/client.py", line 3193 in compute
  File "/home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/lightgbm/dask.py", line 667 in _train
  File "/home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/lightgbm/dask.py", line 1051 in _lgb_dask_fit
  File "/home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/lightgbm/dask.py", line 1171 in fit
  File "/__w/1/s/tests/python_package_test/test_dask.py", line 278 in test_classifier
  File "/opt/miniforge/envs/test-env/lib/python3.8/site-packages/_pytest/python.py", line 195 in pytest_pyfunc_call
  File "/opt/miniforge/envs/test-env/lib/python3.8/site-packages/pluggy/_callers.py", line 39 in _multicall
  File "/opt/miniforge/envs/test-env/lib/python3.8/site-packages/pluggy/_manager.py", line 80 in _hookexec
  File "/opt/miniforge/envs/test-env/lib/python3.8/site-packages/pluggy/_hooks.py", line 265 in __call__
  File "/opt/miniforge/envs/test-env/lib/python3.8/site-packages/_pytest/python.py", line 1789 in runtest
  File "/opt/miniforge/envs/test-env/lib/python3.8/site-packages/_pytest/runner.py", line 167 in pytest_runtest_call
  File "/opt/miniforge/envs/test-env/lib/python3.8/site-packages/pluggy/_callers.py", line 39 in _multicall
  File "/opt/miniforge/envs/test-env/lib/python3.8/site-packages/pluggy/_manager.py", line 80 in _hookexec
  File "/opt/miniforge/envs/test-env/lib/python3.8/site-packages/pluggy/_hooks.py", line 265 in __call__
  File "/opt/miniforge/envs/test-env/lib/python3.8/site-packages/_pytest/runner.py", line 260 in <lambda>
  File "/opt/miniforge/envs/test-env/lib/python3.8/site-packages/_pytest/runner.py", line 339 in from_call
  File "/opt/miniforge/envs/test-env/lib/python3.8/site-packages/_pytest/runner.py", line 259 in call_runtest_hook
  File "/opt/miniforge/envs/test-env/lib/python3.8/site-packages/_pytest/runner.py", line 220 in call_and_report
  File "/opt/miniforge/envs/test-env/lib/python3.8/site-packages/_pytest/runner.py", line 131 in runtestprotocol
  File "/opt/miniforge/envs/test-env/lib/python3.8/site-packages/_pytest/runner.py", line 112 in pytest_runtest_protocol
  File "/opt/miniforge/envs/test-env/lib/python3.8/site-packages/pluggy/_callers.py", line 39 in _multicall
  File "/opt/miniforge/envs/test-env/lib/python3.8/site-packages/pluggy/_manager.py", line 80 in _hookexec
  File "/opt/miniforge/envs/test-env/lib/python3.8/site-packages/pluggy/_hooks.py", line 265 in __call__
  File "/opt/miniforge/envs/test-env/lib/python3.8/site-packages/_pytest/main.py", line 349 in pytest_runtestloop
  File "/opt/miniforge/envs/test-env/lib/python3.8/site-packages/pluggy/_callers.py", line 39 in _multicall
  File "/opt/miniforge/envs/test-env/lib/python3.8/site-packages/pluggy/_manager.py", line 80 in _hookexec
  File "/opt/miniforge/envs/test-env/lib/python3.8/site-packages/pluggy/_hooks.py", line 265 in __call__
  File "/opt/miniforge/envs/test-env/lib/python3.8/site-packages/_pytest/main.py", line 324 in _main
  File "/opt/miniforge/envs/test-env/lib/python3.8/site-packages/_pytest/main.py", line 270 in wrap_session
  File "/opt/miniforge/envs/test-env/lib/python3.8/site-packages/_pytest/main.py", line 317 in pytest_cmdline_main
  File "/opt/miniforge/envs/test-env/lib/python3.8/site-packages/pluggy/_callers.py", line 39 in _multicall
  File "/opt/miniforge/envs/test-env/lib/python3.8/site-packages/pluggy/_manager.py", line 80 in _hookexec
  File "/opt/miniforge/envs/test-env/lib/python3.8/site-packages/pluggy/_hooks.py", line 265 in __call__
  File "/opt/miniforge/envs/test-env/lib/python3.8/site-packages/_pytest/config/__init__.py", line 167 in main
  File "/opt/miniforge/envs/test-env/lib/python3.8/site-packages/_pytest/config/__init__.py", line 190 in console_main
  File "/opt/miniforge/envs/test-env/bin/pytest", line 10 in <module>
/__w/1/s/.ci/test.sh: line 177:  1664 Bus error               (core dumped) pytest $BUILD_DIRECTORY/tests

I suspect that it's some combination of:

  1. conda is pulling in libgomp=12.2.0
    • libgomp conda-forge/linux-64::libgomp-12.2.0-h65d4601_19 None

  2. LightGBM is linking to it because conda puts its own libraries on the search path first
  3. LightGBM is incompatible with OpenMP 12 and 13
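
One way to check the first two suspicions is to look at which `libgomp` file the test process has actually mapped. The sketch below parses `/proc/<pid>/maps`-style text (Linux only); the helper names are made up for this example and are not part of LightGBM or its CI scripts.

```python
import os


def loaded_libraries(maps_text, name_fragment):
    """Return the distinct mapped file paths whose basename contains name_fragment.

    maps_text is expected in the format of /proc/<pid>/maps:
    address perms offset dev inode pathname
    """
    paths = set()
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 6:
            # Anonymous mapping: no pathname column.
            continue
        path = fields[5].strip()
        if name_fragment in os.path.basename(path):
            paths.add(path)
    return sorted(paths)


def loaded_in_current_process(name_fragment):
    """Same check for the running process; returns [] where /proc is unavailable."""
    try:
        with open("/proc/self/maps") as f:
            return loaded_libraries(f.read(), name_fragment)
    except OSError:
        return []
```

Running `loaded_in_current_process("libgomp")` right after `import lightgbm` would show whether the copy under the conda environment's `lib/` directory is the one that got picked up.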

I'll keep experimenting on this branch 😫

@jameslamb
Collaborator Author

Maybe it's significant too that it's only Python 3.8 and 3.9 jobs that are failing, not 3.7 or 3.10.

  • regular (Python 3.9)
  • sdist (Python 3.7)
  • bdist (Python 3.8)
  • inference (Python 3.10)
  • mpi_source (Python 3.8)
  • gpu_source (Python 3.10)
  • swig (Python 3.10)

@jameslamb
Collaborator Author

Moving this back to draft while I try to figure out these segfaults in the Dask tests 😫

@jameslamb jameslamb marked this pull request as draft November 17, 2022 04:32
@jameslamb jameslamb changed the title [ci] use XCode 14.1 on macOS-latest gcc jobs (fixes #5589) WIP: [ci] use XCode 14.1 on macOS-latest gcc jobs (fixes #5589) Nov 17, 2022
@jameslamb
Collaborator Author

I was able to reproduce the test failures in Docker.

It's our old friend,

OSError: dlopen: cannot load any more object with static TLS

I think we should just stop running the Dask tests on the Linux_* jobs (currently Ubuntu 14.04) until we upgrade to a manylinux image with newer GLIBC in #5580.

I'll push a commit here doing that.
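
As a rough sketch of one way such a skip could be decided (not necessarily what the commit does), the check below keys off the glibc version reported by `platform.libc_ver()`. The function names and the `min_glibc=(2, 20)` threshold are hypothetical; the threshold is a placeholder chosen only so that Ubuntu 14.04's glibc 2.19 falls below it.

```python
import platform


def parse_version(text):
    """Turn a dotted version string like '2.19' into a tuple of ints."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def should_skip_dask_tests(libc_name, libc_version, min_glibc=(2, 20)):
    """Return True when running on a glibc older than min_glibc.

    min_glibc is a placeholder threshold for illustration only.
    """
    if libc_name != "glibc":
        return False
    version = parse_version(libc_version)
    if not version:
        # Unknown version: do not skip.
        return False
    return version < min_glibc


# Usage: pass the values reported for the current interpreter.
# name, version = platform.libc_ver()
# skip = should_skip_dask_tests(name, version)
```

In a test module, the resulting boolean could feed a module-level skip so the Dask tests never import on the old image.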

@jameslamb jameslamb mentioned this pull request Nov 18, 2022
60 tasks
@jameslamb
Collaborator Author

Closing this in favor of #5580.

The segfaults on the Ubuntu 14.04 jobs meant that the CI image for the Linux_* jobs and the macOS jobs had to be fixed together.

@jameslamb jameslamb closed this Nov 21, 2022
@jameslamb jameslamb deleted the ci/resolve-tidy-note branch November 21, 2022 03:41
@github-actions

This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot removed the blocking label Aug 19, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 19, 2023
Successfully merging this pull request may close these issues.

[ci] compilation failing on macOS-latest GitHub Actions jobs using gcc