Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Get CMake build system to take advantage of kokkos/Makefile.kokkos in setting compilers, compiler options etc. #1400

Closed
bartlettroscoe opened this issue Jun 6, 2017 · 80 comments
Assignees
Labels
pkg: Kokkos stage: in progress Work on the issue has started

Comments

@bartlettroscoe
Copy link
Member

bartlettroscoe commented Jun 6, 2017

CC: @kruger, @crtrott, @nmhamster, @dsunder, @ibaned

Next Action Status:

A new plan was discussed in detail on 8/1/2017. Next:@kruger will write up the detailed plan in kokkos/kokkos#878 and then track work there ...

Description:

The native Kokkos Makefile system takes advantage of fairly sophisticated makefile logic to define compilers, compiler options, and libraries given the input variables:

  • KOKKOS_ARCH
  • KOKKOS_DEBUG
  • KOKKS_CXX_STANDARD
  • etc. (could be more)

All of this logic is contain in the file kokkos/Makefile.kokkos shown at:

and this is in the Trilinos snapshot of kokkos shown here:

The goal is to leverage kokkos/Makefile.kokkos as much as possible in the CMake build system for Kokkos under Trilinos. We want the Trilinos CMake build of kokkos to produce, as much as possible, to create identical build lines to what the native Kokkos makefile system does.

The process might be to to:

  1. Define CMake cache Trilinos_ARCH (and other already-defined Trilinos CMake vars) and then set the Kokkos input vars shown above.

  2. Execute a dummy makefile pulling in Makefile.kokkos and extracting the out KOKKOS_CXXFLAGS, KOKKOS_LDFLAGS, etc., and the generated file KokkosCore_config.h file (and use it for the Kokkos_config.h file).

  3. Set CMAKE_CXX_FLAGS, etc. using these outputs.

I think this needs to be done at the global level during the env probing (so it might need a new TriBITS callback function). But we will have to see.

Repos, Issues, PRs, and Branches:

The ATDM Tools & Dev Env issue tracking this work is:

The three GitHub repos, Issues, PRs, and branches involved are:

To setup the repos and branches from scratch, do:

$ git clone [email protected]:trilinos/Trilinos.git
$ cd Trilinos/
$ git remote add tech-xcorp [email protected]:Tech-XCorp/Trilinos.git
$ git fetch tech-xcorp
$ git checkout --track tech-xcorp/trilinos-kokkos-1400

$ git clone [email protected]:TriBITSPub/TriBITS.git
$ cd TriBITS/
$ git checkout --track origin/kokkos-config-207
$ cd ..

$ git clone [email protected]:kokkos/kokkos.git
$ cd kokkos/
$ git remote add tech-xcorp [email protected]:Tech-XCorp/kokkos.git
$ git fetch tech-xcorp
$ git checkout --track tech-xcorp/issue-878 
$ cd ..

With that setup, you should see:

$ ./cmake/tribits/python_utils/gitdist --dist-repos=.,TriBITS,kokkos dist-repo-status
---------------------------------------------------------------------------------------------
| ID | Repo Dir        | Branch               | Tracking Branch                 | C | M | ? |
|----|-----------------|----------------------|---------------------------------|---|---|---|
|  0 | Trilinos (Base) | trilinos-kokkos-1400 | tech-xcorp/trilinos-kokkos-1400 |   |   |   |
|  1 | TriBITS         | kokkos-config-207    | orign/kokkos-config-207         |   |   |   |
|  2 | kokkos          | issue-878            | tech-xcorp/issue-878            |   |   |   |
---------------------------------------------------------------------------------------------

With that, one can configure and bulid Trilinos using the version of TriBITS and kokkos in those branches by adding the cmake configure options:

$ cmake \
  -DTrilinos_TRIBITS_DIR:STRING=TriBITS/tribits \
  -DKokkos_SOURCE_DIR_OVERRIDE:STRING=kokkos \
  [option options] \
  ${TRILINOS_SRC_DIR}
@bartlettroscoe
Copy link
Member Author

@nmhamster, I think this is the feature that you have been requesting for some time (we talked about this at the ECP meeting this last winter). Once @kruger has time available (after dealing with #1208 some more), then perhaps we can set up a short meeting between the four of us (you, @crtrott, @kruger, and myself) to discuss this?

@bartlettroscoe bartlettroscoe added the stage: ready The issue is ready to be worked in a Kanban-like process label Jun 6, 2017
@ibaned
Copy link
Contributor

ibaned commented Jun 8, 2017

I would think the way to implement this is to copy some of that logic into the Kokkos TriBITS-based build system. Since that involves changes to the Kokkos directory, I've created kokkos/kokkos#878 to mirror this issue.

@kruger
Copy link
Contributor

kruger commented Jun 28, 2017

Could we have a call either tomorrow (Thursday, June 29) or Friday (June 30)?

@crtrott
Copy link
Member

crtrott commented Jul 11, 2017

OK so the plan could be: split the kokkos.cmake files into three things:

  • Compiler Detection
  • Architecture Settings
  • Kokkos Specific Things

The first two files could than be used directly in Trilinos. For example our integration process could take care of copying them from a kokkos subdirectory into a top level cmake directory inside of Trilinos. That way we would keep up with adding the right options, and Trilinos just inherits everything we support inside of Kokkos. Putting the stuff into a top level cmake directory would allow Trilinos to use this even for projects which do not depend on Kokkos. If that is not necessary, than maybe the whole stuff could stay inside of the Kokkos directory.

@crtrott
Copy link
Member

crtrott commented Jul 11, 2017

Opened issue kokkos/kokkos#947

@bartlettroscoe
Copy link
Member Author

The first two files could than be used directly in Trilinos.

We should not need to copy any files. We can just point the Trilinos CMake build files to pull in the files sitting in the packages/kokkos/ subdir.

That is the simplest thing to do and simplifies integration. The justification for having Trilinos directly point to CMake configure files in the kokkos/ directory is that Kokkos is serving as an abstraction for on-node parallelism, like MPI serves as an abstraction layer for intra-node parallelism. MPI implementations provide compiler wrappers and special considerations for MPI are made during configure time. Likewise, we can use Kokkos in special configure logic for Trilinos to help handle configuring for on-node parallelism. Seems like a natural comparison to me.

@kruger, please do the following:

  1. Please converse with @crtrott in this issue and CMake: split kokkos.cmake into compiler, architecture and kokkos specific files kokkos/kokkos#947 about how to split up kokkos.cmake and reuse the pieces. And understand how these pieces work in the current native Kokkos CMake build (especially the Compiler Detection logic).

  2. Ask me questions about how to inject Compiler Detection, Architecture Settings, and Kokkos Specific things into the Trilinos TriBITS CMake build. (This might require a new extension point for TriBITS but we will see.)

Thanks for working on this!

@kruger
Copy link
Contributor

kruger commented Jul 11, 2017

@bartlettroscoe, I am following your train of logic and will have start my review of current TriBITS design to ask intelligent questions. My understanding of kokkos.cmake is sufficient to start breaking that up (I believe).

@kruger
Copy link
Contributor

kruger commented Jul 12, 2017

I think I have a good handle on kokkos.cmake.

I am going through the TriBITS documentation carefully (see pull request on the TriBITS repo).

For the TriBITS mods, it seems that the main issue is going to be the order. Looking at:
**Full Processing of TriBITS Project Files:**
in the TribitsDeveleopersGuide.rst. I'll be working in step 10, which comes after step 9 so that I have the needed info (*_ENABLE_KOKKOS where I assume * should be TRIBITS rather than Trilinos?).

Then should KOKKOS be considered one of the standard TPLs? (TribitsStandardTPLsList.txt) If so, should the MockTrilinos example be the place to test the TriBITS development?

There are lots of details here that I still need to go through, but I'm getting a sense of how it all fits together.

@ibaned
Copy link
Contributor

ibaned commented Jul 12, 2017

The Kokkos team talked about this briefly today and we were under the impression that these files would remain in the Kokkos repository and would be included by Trilinos CMake files, but not TriBITS itself. We probably need to get on the same page about this design.

@bartlettroscoe
Copy link
Member Author

The Kokkos team talked about this briefly today and we were under the
impression that these files would remain in the Kokkos repository and
would be included by Trilinos CMake files, but not TriBITS itself. We
probably need to get on the same page about this design.

TriBITS would not have explicit knowledge of kokkos. Instead, inclusions of Kokkos files would be done in files that live in the Trilinos git repo that are processed by TriBITS. It is just that TriBITS may need new points of specialization to allow taking advantage of compiler detection and architecture settings. That is all.

Does that make sense or is this just more confusing?

@kruger
Copy link
Contributor

kruger commented Jul 12, 2017

They are. I should have something for the Kokkos team to review by the end of the week. I'll be working on a TechX fork of Kokkos itself for that.

In my previous comment, I am jumping ahead a bit to Ross's other issues which has to do with having information from a package propagate up to a fairly high level (compiler specification) in the Trilinos build. I'd like to understand the basics of that before refactoring kokkos.cmake.

@ibaned
Copy link
Contributor

ibaned commented Jul 12, 2017

I think I get it, there is a need for new interfaces in TriBITS for Trilinos to relay the results of the Kokkos detection code. That seems consistent.

@nmhamster
Copy link
Contributor

@ibaned @kruger - Kokkos doesn't actually detect the ARCH or the backends, the user picks these. So we need a TriBITS option around which of these are being selected. Backends is kind of done, but ARCH is not and this implies compile and link flags.

@bartlettroscoe
Copy link
Member Author

@kruger,

There are lots of details here that I still need to go through, but I'm getting a sense of how it all fits together.

Sounds great. I will look over TriBITSPub/TriBITS#206 and then we can set up a time to talk about what is needed to get this working.

@bartlettroscoe
Copy link
Member Author

We will converse about this in TriBITSPub/TriBITS#207.

@bartlettroscoe
Copy link
Member Author

@kruger,

FYI: @hcedwar just informed the SART group that @dsunder has been refactoring the kokkos.cmake and related build files in Kokkos to remove duplication. Please coordinate with @dsunder on syncing up your refactoring and his refactoring (or someone will have a messing merge conflict to address).

@kruger
Copy link
Contributor

kruger commented Jul 26, 2017

@dsunder Can we have a phone call to discuss this?

@bartlettroscoe
Copy link
Member Author

Just had a meeting with @dsunder, @ibaned, @kruger and myself to discuss this. We discussed a plan to basically go back and utilize the logic in kokkos/Makefile.kokkos to go from the compiler selected and the ARCH option and spit out a *.cmake fragment file that sets KOKKOS_CXX_FLAGS, KOKKOS_LDFLAGS, etc. and then a CMake build will just use those.

@kruger will write up the more detailed plan in kokkos/kokkos#878.

@bartlettroscoe
Copy link
Member Author

Update:

I meet with @kruger yesterday and he about has things worked out for the needed changes to Kokkos on the branch:

He just needs to work on the automated tests for this some (e.g. change from full diffs to targeted greps of generated files for CMake and Make build systems) and then the branch should be ready for the Kokkos developers to consider merging into Kokkos 'develop'. However, I mentioned to @kruger that before Kokkos merges this branch with Kokkos 'develop', we first want to finish the updates to TriBITS and Trilinos as per the plan in TriBITSPub/TriBITS#207 and prototyped on the branch:

Then, once we have branches for Kokkos, TriBITS and Trilinos ready to fully test, then we can go to a couple of ATDM platforms and try out setting -DTrilinos_ARCH=<arch> and -DTrilinos_PLATFORM=<platform> (or whatever they will be called) and see if things work the way they should. @kruger is going to contact @nmhamster and @crtrott to ask what ATDM machines would be good test cases for this.

@bartlettroscoe
Copy link
Member Author

@kruger,

Here is how you can develop on the Trilinos, TriBITS, and kokkos branches and test with Trilinos.

First, clone (or move) the TriBITS and kokkos GitHub repos under Trilinos:

$ cd Trilinos/
$ git clone [email protected]:TriBITSPub/TriBITS.git
$ git clone [email protected]:kokkos/kokkos.git

(or create symlinks to repos already there).

Then checkout your branches:

$ cd Trilinos/
$ git checkout <the-trilinos-branch>
$ cd TriBITS/
$ git checkout <the-tribits-branch>
$ cd ..
$ cd kokkos/
$ git checkout <the-kokkos-branch>

Then just configure Trilinos as normal except add the cache vars:

$ cmake \
  -DTrilinos_TRIBITS_DIR:STRING=TriBITS/tribits \
  -DKokkos_SOURCE_DIR_OVERRIDE:STRING=kokkos \
  [other options]

bartlettroscoe added a commit to bartlettroscoe/kokkos that referenced this issue Oct 13, 2017
)

This is what the kokkos/core/src/CMakeLists.txt file would look like if we
added TRIBITS_INSTALL_HEADERS() (see TriBITS backlog Google Doc).  That would
not be hard to implement on the TriBITS side and wouild avoid installing
header files when the user does not want to (i.e. when configuring with
-D<Project>_INSTALL_LIBRARIES_AND_HEADERS=OFF.)
@bartlettroscoe
Copy link
Member Author

@kruger and I meet again today to work out details of how the section of source files would be handled with the TriBITS build system for Kokkos. It turns out that, currently, the TriBITS and raw CMake build system for Kokkos just glob all of the *.hpp and *.cpp files and then ifdefs based on bool vars in the configured KokkosCore_config.h file determine if the file is empty or not.

One thing that we noticed would improve the TriBITS build system Kokkos is to add a TriBITS function TRIBITS_INSTALL_HEADERS() that would glob all the headers and then install them in the same relative directory paths. This is specked out in:

This would be easy to add to TriBITS and would correctly skip the install of the headers when the user did not want them as described in:

which is used in projects like CASL VERA.

bartlettroscoe added a commit that referenced this issue Dec 5, 2017
…kos-promotion (#1400, #2051)

Merging the kokkos-promotion branch into the develop branch will close the PR
Basker.
bartlettroscoe added a commit that referenced this issue Dec 5, 2017
…elop (#2051)

This branch is also merged into the kokkos-promition branch (see #1400) but
that is harmless.  It is just that we should be testing with Scotch before we
merge in kokkos-promotion.
@bartlettroscoe
Copy link
Member Author

FYI: I merged the branch enable-scotch-2051 (see #2054) into the kokkos-promotion branch that enables the Scotch TPL the CI build of Trilinos. I am pushing the branch to develop as well right now (git handles such strategies just fine). After that pushes, I will try to reproduce this ShyLU Basker failure.

@bartlettroscoe
Copy link
Member Author

@srajama1,

Can scotch be added to SEMS modules ?

The updated Scotch TPL from SEMS is now being enabled in the CI build associated with the checkin-test-sems.sh script (see #2051). I added the issue #2065 requesting that the @trilinos/framework team add Scotch to the SEMS-based Trilinos builds that they run as well.

I will try to reproduce the ShyLU Basker failure described above.

@ndellingwood
Copy link
Contributor

@crtrott with these new changes should we continue to use the following lines in configure scripts for Cuda version >= 8 (discussed in issue #1682):

-D Trilinos_CXX11_FLAGS="--expt-extended-lambda -std=c++11" \
-D Kokkos_ENABLE_Cuda_Lambda:BOOL=ON \

@crtrott
Copy link
Member

crtrott commented Dec 12, 2017

Just as FYI: I am in the process of running integration testing now, so this might go in tomorrow (or if I am really really lucky today).

@ibaned
Copy link
Contributor

ibaned commented Dec 12, 2017

@ndellingwood I think both those lines can be omitted if using KOKKOS_ARCH.

@ndellingwood
Copy link
Contributor

ndellingwood commented Dec 12, 2017

@bartlettroscoe @crtrott I did a clean configure/build with the current Kokkos develop branch in the kokkos-promotion branch of Trilinos and I'm still seeing the same ShyLU Basker issue I mentioned above

/Users/ndellin/Research/trilinos-dev/Trilinos/packages/amesos2/src/Amesos2_Basker_FunctionMap.hpp:64:41: fatal error: shylubasker_trilinos_decl.hpp: No such file or directory
compilation terminated.

The CMAKE_CXX_FLAGS:STRING="-DSHYLU_NODEBASKER" is recognized but the path to packages/shylu/shylu_node/basker/src where the file above is located does not seem to be found. The attached configure script above worked (to configure/build/test) with Trilinos develop branch before the CMake changes.

@ibaned
Copy link
Contributor

ibaned commented Dec 12, 2017

@ndellingwood can you separate that out into a Kokkos or Trilinos issue? Theres too much going on here as it is.

@crtrott
Copy link
Member

crtrott commented Dec 12, 2017

No those flags will be uneccessary.

@ndellingwood
Copy link
Contributor

@ibaned I added issue #2075 summarizing all the details scattered in this issue about the build failure with ShyLU Basker enabled.

bartlettroscoe added a commit that referenced this issue Dec 12, 2017
This will catch the current problem of using a var that is no longer defined
which is part of #1400.
@bartlettroscoe
Copy link
Member Author

Okay, I think the issue with ShyLU_NodeBasker mentioned above is now resolved (see #2075). Is there anything else blocking the merge of kokkos-promotion into develop? Is it just the last set of integration tests?

@ibaned
Copy link
Contributor

ibaned commented Dec 13, 2017

I identified and fixed a fairly serious issue with Kokkos headers not being installed to the right place (kokkos/kokkos#1272). This does not technically affect integration tests, and it reinforces the idea that installation testing is pretty important.

@bartlettroscoe
Copy link
Member Author

@ibaned,

I identified and fixed a fairly serious issue with Kokkos headers not being installed to the right place (kokkos/kokkos#1272). This does not technically affect integration tests, and it reinforces the idea that installation testing is pretty important.

How did you catch this? Was this from installing and then building Albany against this?

@ibaned
Copy link
Contributor

ibaned commented Dec 13, 2017

Not Albany, but yes I installed the right Trilinos+Kokkos override combination and tried to link my own downstream library against it.

@crtrott
Copy link
Member

crtrott commented Dec 13, 2017

Dan would you be willing to put an installation testing thing together? We could do that as part of integration or even run a jenkins job.

@ibaned
Copy link
Contributor

ibaned commented Dec 13, 2017

kokkos/kokkos#1276

@crtrott crtrott self-assigned this Dec 15, 2017
bartlettroscoe added a commit to bartlettroscoe/Trilinos that referenced this issue Mar 9, 2018
Otherwise, you can't actaully see what this is set to (the Kokkos CMake code
does not tell you).
bartlettroscoe added a commit to bartlettroscoe/Trilinos that referenced this issue Mar 9, 2018
bartlettroscoe added a commit to bartlettroscoe/Trilinos that referenced this issue Mar 9, 2018
Otherwise, you can't actaully see what this is set to (the Kokkos CMake code
does not tell you).
lucbv pushed a commit to lucbv/Trilinos that referenced this issue Mar 9, 2018
Otherwise, you can't actaully see what this is set to (the Kokkos CMake code
does not tell you).
ndellingwood added a commit to ndellingwood/Trilinos that referenced this issue Mar 12, 2018
* develop: (261 commits)
  Replace PHX_EVALUATOR_CLASS Macros (trilinos#2322)
  Change time-limit on 'ride' from 4 to 12 hours (TRIL-171)
  Disable Panzer examles for CUDA builds on 'ride' (trilinos#2318, TRIL-171)
  Move ATDMDevEnvUtils.cmake to utils/ subdir (TRIL-171)
  Change from ATDM_CONFIG_USE_MAKEIFLES to ATDM_CONFIG_USE_NINJA (TRIL-171)
  Add ATDM_CONFIG_KNOWN_HOSTNAME var for CTEST_SITE (TRIL-171)
  Misc improvements to checkin-test-atdm.sh (TRIL-171)
  Print the return code from 'ctest -S' command (TRIL-171)
  Tpetra: Sort and merge specializations for Tpetra::CrsGraph (trilinos#2354)
  Remove '#line 1' as workaround for Ninja + CUDA rebuilds (trilinos#2359)
  Added function to return the full vector of data in the indexer rather than elem by elem (trilinos#2355)
  Panzer: Fixing rotation matrix calculation in integration values
  Tpetra: Adding an additive Schwarz test
  checkin-test driver scripts for local testing ATDM builds of Trilinos
  Add list of all supported builds on ride (TRIL-171)
  Add list of all supported builds shiller (TRIL-171)
  Fix location of nvcc_wrapper and other fixups (TRIL-171)
  Improve the 'date' output and start and end (TRIL-171)
  Change name ATDM_HOSTNAME to ATDM_SYSTEM_NAME (TRIL-171)
  Print KOKKOS_ARCH to STDOUT (trilinos#1400)
  ...

Conflicts:
	packages/shylu/shylu_node/tacho/src/TachoExp_NumericTools.hpp
	packages/shylu/shylu_node/tacho/unit-test/Tacho_TestCrsMatrixBase.hpp
@bartlettroscoe
Copy link
Member Author

This was completed a long time ago. Closing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pkg: Kokkos stage: in progress Work on the issue has started
Projects
None yet
Development

No branches or pull requests

8 participants