triple_nested_parallelism fails with KOKKOS_DEBUG and CUDA #1513
Labels: Bug (Broken / incorrect code; it could be Kokkos' responsibility, or others’ (e.g., Trilinos))
ibaned added the Bug and Blocks Promotion (overview issue for release-blocking bugs) labels on Apr 2, 2018
This is the same unit test that fails with |
ibaned changed the title from "triple_nested_parallelism unit test" to "triple_nested_parallelism fails with KOKKOS_DEBUG and CUDA" on Apr 2, 2018
ibaned added a commit that referenced this issue on Apr 2, 2018
ibaned added a commit to trilinos/Trilinos that referenced this issue on Apr 2, 2018
This test requests a hardcoded number of 32 CUDA threads per warp, but with debugging enabled the CUDA kernel uses too many registers and can only run on 16 threads per warp max. [kokkos/kokkos#1514, kokkos/kokkos#1513, #2471]
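The commit message above describes a hardcoded request (32) that exceeds what the debug build of the kernel can support (16). One common way to avoid that kind of failure is to cap the request at a limit queried at run time. The sketch below is plain C++ and only illustrates the idea: `clamp_thread_count` is a hypothetical helper, not a Kokkos function, and this is not necessarily what the referenced commits do. In a real Kokkos test the `max_supported` value would presumably come from the team policy's maximum-size query rather than a literal.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper (not part of Kokkos): cap a requested thread count
// at the limit that the compiled kernel can actually support.
// `max_supported` is assumed to be obtained at run time, since it can
// shrink when a debug build makes the kernel use more registers.
int clamp_thread_count(int requested, int max_supported) {
    assert(requested > 0 && max_supported > 0);
    return std::min(requested, max_supported);
}
```

With the numbers from the commit message, `clamp_thread_count(32, 16)` yields 16, while a non-debug build that supports the full request, `clamp_thread_count(32, 32)`, still yields 32.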
The following failure mode has been observed on K80 GPUs with KOKKOS_DEBUG enabled: