Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

StokesFO Kokkos Refactor and Boundary Layout Update #654

Merged
merged 25 commits into from
Feb 25, 2021

Conversation

mcarlson801
Copy link
Collaborator

This PR contains updates to the boundary conditions of the StokesFO problem. All sideset evaluators used in StokesFO have been updated to use the new collapsed sidesets layout while still allowing other problems to use these evaluators with the old sideset layouts. Additionally, the remaining performance intensive evaluators used in StokesFO have been ported to Kokkos. A summary of the changes in this PR is as follows:

  • StokesFO sideset evaluators have been updated to allow new or old sideset layouts
  • StokesFO evaluators have been fully ported to Kokkos and the total fill time for GPU builds has been reduced significantly
  • Added a 'localDOFview' structure to PHAL_Workset that is initialized in STKDiscretization.cpp::computeSideSets(). This structure is a kokkos view that is used in GatherVerticallyContractedSolution to index into the underlying Thyra vector in a coalesced fashion on GPU.
  • Added control flow to StokesFOBase so that other problems that extend StokesFOBase use the old sideset layout until they can be updated.
  • Views have been added to DiscretizationUtils so that computeBasisFunctionsSide has access to the appropriate data when running on GPU.

…onCoefficient and GatherVerticallyContractedSolution
…me cleaning and to be extended to Jacobian and HessianVec. Also contains a GPU fix for previous commit
…adWeights initialization to PostRegistrationSetup, replaced quadWeights vector with Kokkos view
…ation back out of postRegistractionSetup. Would need to pass in numLayers during registration but it isn't a performance concern so it is fine as is
…or each workset even if they are empty and unused
… indexing bug in GatherVerticallyContractedSolution. Reverted changes to collapsed sideset layout constructor therefore all sidesets are now of size worksetSize*numSides in the sideSet_idx dimension
Copy link
Collaborator

@jewatkins jewatkins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks for working on this. I think mostly minor changes are needed. If you could send me StokesFO performance numbers on GPU that would be good to look at before pushing. We will need to discuss the left over evaluators which do not have kokkos kernels to see if we should add them. Could you post a list here and ping Mauro to see what he thinks? Also, it would be good to have that list/discussion for future porting.

…ther Kokkos kernels to ComputeBasisFunctionsSide
…on into postRegistrationSetup in GatherVerticallyContractedSolution, disentangled collapsed and non-collapsed implementations in response evaluators and FieldFrobeniusNorm
Copy link
Collaborator

@bartgol bartgol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Monstruous PR! Great work. I had some comments, but I believe they are just thoughts, for future development and/or to spark discussions.

@mperego
Copy link
Collaborator

mperego commented Feb 12, 2021

@mcarlson801 @jewatkins I'm going to start working on some evaluators modified by this PR. Do you suggest that I base my work on this branch?

@jewatkins
Copy link
Collaborator

@mcarlson801 @jewatkins I'm going to start working on some evaluators modified by this PR. Do you suggest that I base my work on this branch?

If they're sideset evaluators, I would say yes. Although if you need things in master soon maybe not. I think there's still a few things Max needs to fix and it would be good to have the GPU performance tests up before merging this. See #661 and trilinos/Trilinos#8727.

@mcarlson801
Copy link
Collaborator Author

@mcarlson801 when I run the testsuit using this branch, a lot of tests fail with this error:

3: p=3: *** Caught standard std::exception of type 'std::logic_error' :
3: 
3:  /ascldap/users/mperego/Workspace/albany/sources/albany-src/src/disc/stk/Albany_STKDiscretization.hpp:210:
3:  
3:  Throw number = 1
3:  
3:  Throw test that evaluated to true: true
3:  
3:  Local DOF view was requested but it has not been created.

Have you seen that?

@mperego It looks like this change wasn't enough to pass unit tests. I'm working on a more comprehensive fix and I'll let you know as soon as this branch is testing clean. Sorry for the inconvenience, I'll try to get this working ASAP.

@mperego
Copy link
Collaborator

mperego commented Feb 16, 2021

@mcarlson801 Thanks. Do you mind if I cherry-pick your last commit onto my version where I just merged the master and force-push? Otherwise I need to do the merging again

@mcarlson801
Copy link
Collaborator Author

@mcarlson801 Thanks. Do you mind if I cherry-pick your last commit onto my version where I just merged the master and force-push? Otherwise I need to do the merging again

@mperego Yeah, if that works best for you, go for it. I'll work around whatever you decide to do, shouldn't be a problem.

@mperego
Copy link
Collaborator

mperego commented Feb 16, 2021

@mcarlson801 It's probably best if I hold off while you are fixing things. I won't mess with this branch

@mperego
Copy link
Collaborator

mperego commented Feb 18, 2021

@mcarlson801 I notice that the StokesFOThermoCoupled problem is still using the non collapsed sidesets. Are you planning to change that in this PR or is it separated by this?
Sorry for keeping pinging you. As I mentioned I'm working at different evaluators that are based on sideset and I'd rather do that only once using the new layout and Kokkos implementation.

@mcarlson801
Copy link
Collaborator Author

@mcarlson801 I notice that the StokesFOThermoCoupled problem is still using the non collapsed sidesets. Are you planning to change that in this PR or is it separated by this?
Sorry for keeping pinging you. As I mentioned I'm working at different evaluators that are based on sideset and I'd rather do that only once using the new layout and Kokkos implementation.

@mperego The only problem that will be updated in this PR is StokesFO. The other problems that inherit from StokesFOBase will be changed in a later PR.

Once I have tested this next (and hopefully final) commit for this PR, I will take a look at the StokesFOThermoCoupled problem to see what is needed to get it working with the collapsed sidesets. It may be that all of the evaluators for StokesFOThermoCoupled are ready for the new layout and in that case I could just switch a flag and it would use the new layouts. I'll update you when I know more.

@mperego
Copy link
Collaborator

mperego commented Feb 18, 2021

OK, thanks! I'll start working on that. For sure w_Resid evaluator needs to be converted to have StokesFOThermo coupled to work.

@mperego
Copy link
Collaborator

mperego commented Feb 18, 2021

@mcarlson801 B.t.w. I confirm that w_Resid is the only evaluator that needs to be changed in order to use the collapsed layout in StokesFOThermoCopled.

@mperego
Copy link
Collaborator

mperego commented Feb 21, 2021

@mcarlson801 thanks for all this work. I've merged master into the branch. @mcarlson801 @jewatkins is anything preventing this from being merged? More testing?

@jewatkins
Copy link
Collaborator

@mcarlson801 thanks for all this work. I've merged master into the branch. @mcarlson801 @jewatkins is anything preventing this from being merged? More testing?

I think there's still some lingering issues that Max was going to address today.

@mcarlson801
Copy link
Collaborator Author

@mcarlson801 thanks for all this work. I've merged master into the branch. @mcarlson801 @jewatkins is anything preventing this from being merged? More testing?

I think there's still some lingering issues that Max was going to address today.

The PR now has all the changes I was planning to make and is currently passing all unit tests, the only thing left is testing for GPU builds and then I think Jerry wants to look over it once more for a final review.

@jewatkins
Copy link
Collaborator

I think the trilinos gpu fix is in develop so I'll go ahead and do a final review/test and push if everything looks good.

Copy link
Collaborator

@jewatkins jewatkins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more things. I think we're close. All performance tests pass except one (I think it's a known stability issue which I need to look into).

@mperego
Copy link
Collaborator

mperego commented Feb 23, 2021

@mcarlson801 what's the ETC for this PR? We have enother PR that is ready to be merged in. My understanding is that they are complitely orthogonal. Should we merge the other PR first?

@jewatkins
Copy link
Collaborator

@mcarlson801 what's the ETC for this PR? We have enother PR that is ready to be merged in. My understanding is that they are complitely orthogonal. Should we merge the other PR first?

There's a few minor edits and Max is seeing nondeterministic behavior in a performance test that needs to be tracked down. I don't see an issue with merging the other PR first.

@mperego
Copy link
Collaborator

mperego commented Feb 23, 2021

@mcarlson801 what's the ETC for this PR? We have enother PR that is ready to be merged in. My understanding is that they are complitely orthogonal. Should we merge the other PR first?

There's a few minor edits and Max is seeing nondeterministic behavior in a performance test that needs to be tracked down. I don't see an issue with merging the other PR first.

@jewatkins OK! I merged the other PR. Sorry for the nondeterministic behavior.. those are painful to fix.

…on, cleaned up STKDiscretization, added comments for future work in CellToSide and ComputeBasisFunctionsSide
@mcarlson801 mcarlson801 merged commit f751dc6 into master Feb 25, 2021
@mcarlson801 mcarlson801 deleted the ali-velocity-kokkos-refactor branch February 25, 2021 22:32
@mperego
Copy link
Collaborator

mperego commented Feb 25, 2021

@mcarlson801 thanks!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants