Fix requests for large team scratch sizes #4728
I think this fix will work fine. The unit test, however, might not pass on every testing platform: it requests a very large amount of shared memory, which can be amplified by n_teams, so a platform without enough memory may fail. Since the test only checks that a large number stays a large number internally in Kokkos (rather than turning into an insane number through overflow), the test can be limited for
Retest this please.
Relies on #4798 for OpenMPTarget.
@@ -895,7 +895,7 @@ class ParallelFor<FunctorType, Kokkos::TeamPolicy<Properties...>,

     const size_t pool_reduce_size = 0;  // Never shrinks
     const size_t team_reduce_size = TEAM_REDUCE_SIZE * m_policy.team_size();
-    const size_t team_shared_size = m_shmem_size + m_policy.scratch_size(1);
+    const size_t team_shared_size = m_shmem_size;
Why is the scratch_size(1) removed here?
m_shmem_size already contains m_policy.scratch_size(1), see the constructor:
inline ParallelFor(const FunctorType& arg_functor, const Policy& arg_policy)
: m_instance(t_openmp_instance),
m_functor(arg_functor),
m_policy(arg_policy),
m_shmem_size(arg_policy.scratch_size(0) + arg_policy.scratch_size(1) +
FunctorTeamShmemSize<FunctorType>::value(
arg_functor, arg_policy.team_size())) {}
};
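To make the double counting concrete, here is a minimal sketch of the arithmetic, not the Kokkos implementation. `MockPolicy`, `shmem_size`, `team_shared_size_old`, and `team_shared_size_new` are hypothetical names introduced only for illustration; the formulas mirror the constructor quoted above and the two versions of the changed line.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a team policy: only stores the two scratch levels.
struct MockPolicy {
  std::size_t level0_bytes;
  std::size_t level1_bytes;
  std::size_t scratch_size(int level) const {
    return level == 0 ? level0_bytes : level1_bytes;
  }
};

// Mirrors the constructor above: level 0 + level 1 + functor shared memory.
std::size_t shmem_size(const MockPolicy& p, std::size_t functor_shmem) {
  return p.scratch_size(0) + p.scratch_size(1) + functor_shmem;
}

// Old line: adds level 1 scratch again on top of a sum that already has it.
std::size_t team_shared_size_old(const MockPolicy& p, std::size_t functor_shmem) {
  return shmem_size(p, functor_shmem) + p.scratch_size(1);
}

// New line: uses the sum as is.
std::size_t team_shared_size_new(const MockPolicy& p, std::size_t functor_shmem) {
  return shmem_size(p, functor_shmem);
}
```

With 1024 bytes of level 0 scratch, 4096 bytes of level 1 scratch, and 256 bytes of functor shared memory, the old expression yields 9472 bytes and the new one 5376 bytes; the difference is exactly one extra copy of the level 1 request.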
This is probably the culprit for #5025. My guess is that some bug in the OpenMP scratch memory management does require the larger scratch size after all.
Fixes #4715 by converting a bunch of places to take size_t as the type for allocation sizes.
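As an illustration of why the type matters, the sketch below compares a total scratch size held in a 32-bit type with the same product held in `size_t`. Both helper functions are hypothetical and are not part of Kokkos; the example also assumes a platform where `size_t` is 64 bits wide.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not a Kokkos function): total scratch bytes stored in
// a 32-bit type. The product is formed in 64 bits and then narrowed, so the
// result wraps modulo 2^32 in a well-defined way.
std::uint32_t total_scratch_32bit(std::uint32_t bytes_per_team,
                                  std::uint32_t n_teams) {
  return static_cast<std::uint32_t>(
      static_cast<std::uint64_t>(bytes_per_team) * n_teams);
}

// Same computation carried out entirely in size_t.
// Assumption: size_t is 64 bits, so the large product is representable.
std::size_t total_scratch_size_t(std::size_t bytes_per_team,
                                 std::size_t n_teams) {
  return bytes_per_team * n_teams;
}
```

For a request of 1 GiB per team across 8 teams, the 32-bit version wraps to 0, while the `size_t` version returns 8589934592 on a 64-bit platform. Small requests, such as 1024 bytes across 8 teams, give the same answer in both.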