using the custom ninja build rebuilds fortran files every time #261

Closed
bathmatt opened this issue Apr 1, 2016 · 36 comments
Labels
impacting: configure or build The issue is primarily related to configuring or building MARKED_FOR_CLOSURE Issue or PR is marked for auto-closure by the GitHub Actions bot. stage: ready The issue is ready to be worked in a Kanban-like process

Comments

@bathmatt
Contributor

bathmatt commented Apr 1, 2016

@bartlettroscoe
Every time configure runs, all the Fortran files are rebuilt when using the custom ninja. If I edit the CMakeLists.txt in panzer, it recompiles all the Fortran files, 1300 or so of them.

@bathmatt bathmatt added the impacting: configure or build The issue is primarily related to configuring or building label Apr 1, 2016
@bradking
Contributor

bradking commented Apr 6, 2016

@bathmatt and I found that ninja -d explain shows many lines like:

ninja explain: output packages/epetra/src/CMakeFiles/epetra.dir/Epetra_dcrsmm.f-pp.f older than most recent input /.../Trilinos/builds/build_for_panzer/gcc/debug (...)

This is a dependency on a directory instead of a file and causes a rebuild whenever the directory modification time is updated. In the Fortran preprocessor output file

packages/epetra/src/CMakeFiles/epetra.dir/Epetra_dcrsmm.f-pp.f

we see the line:

# 1 "/.../Trilinos/builds/build_for_panzer/gcc/debug//"

This line tells CMake's dependency extraction that the preprocessor used the directory as input. This may be a bug in the compiler, but I've taught CMake to tolerate it in commit Kitware/CMake@f831d75 by filtering out dependencies that are not on actual files.
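
To illustrate the idea of filtering out dependencies that are not actual files, here is a minimal Python sketch. It is not CMake's implementation from the commit above; the function name `extract_file_deps` and the `is_file` parameter are hypothetical, and the regular expression assumes GNU-style `# <line> "<file>"` markers like the one quoted above.

```python
import os
import re

# Matches GNU-style line markers such as:  # 1 "/path/to/file.f" 2
LINE_MARKER = re.compile(r'^#\s*(?:line\s+)?\d+\s+"([^"]*)"')


def extract_file_deps(preprocessed_text, is_file=os.path.isfile):
    """Collect paths named in line markers, keeping only regular files.

    `is_file` is injectable so the filter can be tried without touching disk.
    """
    deps = []
    seen = set()
    for line in preprocessed_text.splitlines():
        match = LINE_MARKER.match(line)
        if not match:
            continue
        path = match.group(1)
        if path in seen:
            continue
        seen.add(path)
        # Directories, pseudo-names like <built-in>, and missing files are dropped.
        if is_file(path):
            deps.append(path)
    return deps
```

With a filter like this, a marker that names the build directory never becomes a dependency, so a change in the directory's modification time cannot trigger a rebuild.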

@bartlettroscoe
Member

-----Original Message-----
From: Bettencourt, Matthew
Sent: Wednesday, April 06, 2016 11:08 AM
To: Bartlett, Roscoe A; Brad King
Subject: Re: [EXTERNAL] Re: using the custom ninja build rebuilds fortran files
every time (#261)

I leave for travel so I won't be able to test this fix for quite some time.

I did try the fix slightly and it looked to make the problem worse, not better as
many more files were recompiling if I canceled a build mid way. I didn't have
time to compile the full code.

@bradking
Contributor

bradking commented Apr 6, 2016

it looked to make the problem worse, not better as many more files were recompiling

This is unlikely because all the change does is remove (incorrect) dependencies. The behavior Matthew observed is likely due to other environment changes (e.g. the location of cmake). We'll have to wait for Matthew to have time to do a fresh build, bring it up to date, and then try to reproduce the original problem. I cannot reproduce it because it requires a compiler whose preprocessor generates these line directives referencing directories.

@bartlettroscoe
Member

Depending on how long Matt is gone, I may be able to try this out before he gets back, as part of getting the CUDA builds to work in #172.

@bartlettroscoe bartlettroscoe added the stage: ready The issue is ready to be worked in a Kanban-like process label Apr 6, 2016
@bathmatt
Contributor Author

The patch that you suggested causes over 2000 files to be rebuilt every time, I reverted it,

Sorry this has taken so long to reply, but I was out of the country.

@bradking
Contributor

The patch that you suggested causes over 2000 files to be rebuilt every time, I reverted it,

That should not be possible. All the patch does is filter out some dependencies. I suspect some rebuilds may be caused by inconsistencies left due to the switch between versions. Please leave the patch applied, create a whole new build tree from scratch, allow a full build to complete, and then run ninja -d explain to rebuild with some details about why anything might rebuild.

@bathmatt
Contributor Author

There was a compilation error with about 60 files left to build. I ran the explain after that, and it went back to 2555 files to build. Output attached:
explain.txt

@bradking
Contributor

The explain.txt you provided contains only a partial build log:

[66/2555] Building CXX object packages/kokkos/core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_spinwait.cpp.o
[67/2555] Building CXX object packages/kokkos/core/src/CMakeFiles/kokkoscore.dir/Threads/Kokkos_ThreadsExec_base.cpp.o
[68/2555] Building CXX object packages/kokkos/core/src/CMakeFiles/kokkoscore.dir/Qthread/Kokkos_QthreadExec.cpp.o
[69/2555] Building CXX object packages/pamgen/src/CMakeFiles/pamgen.dir/calc_decomp_cuts.C.o
[70/2555] Building CXX object packages/kokkos/core/src/CMakeFiles/kokkoscore.dir/Cuda/Kokkos_Cuda_BasicAllocators.cpp.o
[71/2555] Building CXX object packages/kokkos/core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_AllocationTracker.cpp.o
[72/2555] Building CXX object packages/kokkos/core/src/CMakeFiles/kokkoscore.dir/Threads/Kokkos_ThreadsExec.cpp.o
[73/2555] Building CXX object packages/pamgen/src/CMakeFiles/pamgen.dir/__/rtcompiler/Bessel_I.C.o
[74/2555] Building CXX object packages/kokkos/core/src/CMakeFiles/kokkoscore.dir/Qthread/Kokkos_Qthread_TaskPolicy.cpp.o
[75/2555] Building CXX object packages/kokkos/core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_CPUDiscovery.cpp.o
[76/2555] Building CXX object packages/pamgen/src/CMakeFiles/pamgen.dir/__/mesh_spec_lt/pamgen_fudges.C.o
[77/2555] Building CXX object packages/pamgen/src/CMakeFiles/pamgen.dir/__/mesh_spec_lt/pamgen_element_dictionary.C.o

Running ninja -d explain should produce lines of the form:

ninja explain: some/output older than most recent input some/input
ninja explain: some/path is dirty

That is what explains why things are rebuilding.
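
For a large log, a small script can pull out just those explanation lines. The following is a rough Python sketch, assuming the log was captured as text and that the lines have exactly the form quoted in this thread; the names `summarize_explain`, `PREFIX`, and `MARKER` are made up for illustration.

```python
from collections import Counter

PREFIX = "ninja explain: "
MARKER = " older than most recent input "


def summarize_explain(log_text):
    """Return (explain reasons, count of each input reported as newer)."""
    reasons = []
    newer_inputs = Counter()
    for line in log_text.splitlines():
        if not line.startswith(PREFIX):
            continue  # skip "[N/M] Building ..." progress lines
        reason = line[len(PREFIX):]
        reasons.append(reason)
        if MARKER in reason:
            # Drop any trailing " (...)" detail after the input path.
            newer_input = reason.split(MARKER, 1)[1].split(" (", 1)[0]
            newer_inputs[newer_input] += 1
    return reasons, newer_inputs
```

If the returned list is empty, the log holds only build progress and no explanations, which is the situation described above.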

@bathmatt
Contributor Author

I've done an rm -rf on the build dir. There is an error in my build with about 60 files left. I can't find it in all the output, but when I try to build the rest I get the need to build 2555 files.

This is with the patch. It looks like something with ninja and nvcc_wrapper and the general Trilinos configure. Not sure what.

@bathmatt
Contributor Author

Also, this is only an issue when I use ninja, not when I build with make.

@bradking
Contributor

when I try to build the rest I get the need to build 2555 files

This is the part for which we need ninja -d explain output. It should print out an explanation right at the beginning for at least some of the files before it starts building them.

@bathmatt
Contributor Author

I attached that previously, and you mentioned it was for an incomplete
build, should I attach it again? I'm not sure what you are asking for
since I provided the file in the past.

@bathmatt
Contributor Author

I'm attaching a fresh copy of the explain output,
output.txt

@bradking
Contributor

I'm saying that the file does not have the output that it should. Please run this:

$ ninja -d explain >log 2>&1

and send it to me via private email.

@bathmatt
Contributor Author

Do you want me to wait for it to compile the files that it can? That is about 1-1.5 hours to go through the list. Or can I hit ^C once it starts compiling?

@bradking
Contributor

Thanks for sending the file. I see output like this in it:

ninja explain: output /tmp/tmpxft_00004c86_00000000-14_gtest-all.ii of phony edge with no inputs doesn't exist
ninja explain: /tmp/tmpxft_00004c86_00000000-14_gtest-all.ii is dirty

These appear to be temporary files used by the compiler wrapper to do preprocessing. This causes the preprocessor output to have # <line> <file> directives that report temporary files. CMake thinks these are real dependencies and tells Ninja about them. Of course they quickly go away, so on the next build Ninja thinks things are out of date. I will have to re-think how CMake extracts the preprocessing dependencies from Fortran sources.

@bradking
Contributor

The reason CMake tries to extract dependencies from the # <line> <file> lines in preprocessor output is because it needs to generate a "depfile" for Ninja to load. Normally for C++ compilation we ask the compiler to generate the depfile with the -MD flag. However, gfortran produces an incorrect file for our use case. One can see this by trying it with an empty file:

$ >foo.f
$ gfortran -cpp -MD -MT foo.f-pp.f -MF foo.f-pp.f.d -E foo.f >/dev/null
$ cat foo.f-pp.f.d
foo.o foo.f-pp.f: foo.f

The foo.o should not be there and Ninja rejects it. This is why CMake tries to generate the file itself.

Please try running the above example with your Fortran compiler substituted for gfortran (it may need an option different than -cpp to enable preprocessing). What is the output and what is the content of the generated depfile?
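
To check a generated depfile for extra targets such as the foo.o seen here, something like the following Python sketch could help. It is a simplified reader for one-rule Makefile-style depfiles, not the parser Ninja uses: it does not handle escaped spaces or paths containing colons, and `parse_depfile` and `extra_targets` are hypothetical helper names.

```python
def parse_depfile(depfile_text):
    """Parse 'targets: prerequisites' rules from Makefile-style text."""
    # Join backslash-continued lines into one logical line.
    joined = depfile_text.replace("\\\n", " ")
    rules = []
    for line in joined.splitlines():
        if ":" not in line:
            continue
        targets, _, prereqs = line.partition(":")
        rules.append((targets.split(), prereqs.split()))
    return rules


def extra_targets(rules, expected_target):
    """List every target in the rules other than the expected one."""
    return [t for targets, _ in rules for t in targets if t != expected_target]
```

For the depfile content shown in this comment, `extra_targets(parse_depfile(text), "foo.f-pp.f")` would report foo.o as the unwanted target.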

@bathmatt
Contributor Author

I'm using gfortran as well.

foo.o foo.f-pp.f: foo.f

@bradking
Contributor

I'm using gfortran as well.

Please send me your build.ninja and rules.ninja files via private email.

@bradking
Contributor

Looking again at the build log you sent I'll quote another line of context this time:

ninja explain: output /tmp/tmpxft_00004c86_00000000-14_gtest-all.ii of phony edge with no inputs doesn't exist
ninja explain: /tmp/tmpxft_00004c86_00000000-14_gtest-all.ii is dirty
ninja explain: commonTools/gtest/CMakeFiles/gtest.dir/gtest/gtest-all.cc.o is dirty

This seems to indicate that the bogus dependency is actually attached to a regular C++ source and has nothing to do with the Fortran support or the # <line> <file> extraction in CMake.

Please run

$ ninja -t deps > deps.log ; gzip deps.log

and send me the deps.log.gz file via private email.

@bradking
Contributor

From the ninja -t deps output you sent I see:

commonTools/gtest/CMakeFiles/gtest.dir/gtest/gtest-all.cc.o: #deps 1, deps mtime 1461868706 (STALE)
    /tmp/tmpxft_0000e6c3_00000000-14_gtest-all.ii

This indicates that Ninja has recorded a dependency on that temporary file. This is on a C++ build rule and has nothing to do with Fortran support. Here are the relevant pieces of the build.ninja and rules.ninja files:

build commonTools/gtest/CMakeFiles/gtest.dir/gtest/gtest-all.cc.o: CXX_COMPILER__gtest /.../commonTools/gtest/gtest/gtest-all.cc
  DEP_FILE = commonTools/gtest/CMakeFiles/gtest.dir/gtest/gtest-all.cc.o.d
  FLAGS = -std=c++11 -g -O0
  INCLUDES = -I. -I/.../commonTools/gtest
  OBJECT_DIR = commonTools/gtest/CMakeFiles/gtest.dir
  OBJECT_FILE_DIR = commonTools/gtest/CMakeFiles/gtest.dir/gtest
  TARGET_COMPILE_PDB = commonTools/gtest/CMakeFiles/gtest.dir/
  TARGET_PDB = commonTools/gtest/libgtest.pdb

rule CXX_COMPILER__gtest
  depfile = $DEP_FILE
  deps = gcc
  command = /projects/install/rhel6-x86_64/sems/compiler/gcc/4.9.2/openmpi/1.10.1/bin/mpicxx   $DEFINES $INCLUDES $FLAGS -MMD -MT $out -MF $DEP_FILE -o $out -c $in
  description = Building CXX object $out

Somehow the compiler is reporting a dependency on a temporary file to Ninja. I cannot reproduce this. I tried building Trilinos with Open MPI 1.10.2 wrapping around GNU 4.9.3 and got:

$ ninja -t deps commonTools/gtest/CMakeFiles/gtest.dir/gtest/gtest-all.cc.o
commonTools/gtest/CMakeFiles/gtest.dir/gtest/gtest-all.cc.o: #deps 2, deps mtime 1461874185 (VALID)
    /.../commonTools/gtest/gtest/gtest-all.cc
    /.../commonTools/gtest/gtest/gtest.h
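
To scan a full ninja -t deps dump for this kind of entry, a short script can flag recorded dependencies that live under a temporary directory. This is only a sketch in Python: it assumes the output layout shown in this comment (a header line containing ": #deps " followed by indented paths), and both the function name `find_temp_deps` and the default `/tmp/` prefix are assumptions.

```python
def find_temp_deps(deps_text, temp_prefix="/tmp/"):
    """Map each output to its recorded deps that start with temp_prefix."""
    suspicious = {}
    current = None
    for line in deps_text.splitlines():
        if not line.strip():
            current = None  # a blank line ends the current entry
            continue
        if line[0].isspace():
            # Indented lines are dependencies of the current output.
            dep = line.strip()
            if current is not None and dep.startswith(temp_prefix):
                suspicious.setdefault(current, []).append(dep)
        elif ": #deps " in line:
            current = line.split(": #deps ", 1)[0]
    return suspicious
```

Any output listed in the result depends on a file that is likely gone by the next build, so Ninja will consider it dirty every time.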

@bradking
Contributor

I think the next step for you is to try building with plain upstream CMake and plain upstream Ninja but not turning on Fortran support. Enable only the Gtest package:

... -DTrilinos_ENABLE_Fortran=OFF -DTrilinos_ENABLE_Gtest=ON ...

@bathmatt
Contributor Author

I can't turn off fortran with this because of some dependence issues. I get mapvarlib or something like that issue.

@bartlettroscoe
Member

@bradking, are there no machines at Kitware that have a GPU with CUDA installed?

@bradking
Contributor

I can't turn off fortran with this because of some dependence issues. I get mapvarlib or something like that issue.

Make sure no other packages besides Gtest are ON, e.g.:

... -DTrilinos_ENABLE_ALL_PACKAGES:BOOL=OFF -DTrilinos_ENABLE_Fortran=OFF -DTrilinos_ENABLE_Gtest=ON ...

@bradking
Contributor

are there no machines at Kitware that have a GPU with CUDA installed?

What does this have to do with CUDA? Matthew is using the MPI compilers.

@bartlettroscoe
Member

What does this have to do with CUDA? Matthew is using the MPI compilers.

I believe that Matt's setup is using OpenMPI which is using nvcc_wrapper (a wrapper for nvcc) as the native C++ compiler. Is that right @bathmatt?

@bathmatt
Contributor Author

correct.

@bradking
Contributor

using OpenMPI which is using nvcc_wrapper (a wrapper for nvcc) as the native C++ compiler.

What are your values for OMPI_{CC,CXX,FC}?

@bathmatt
Contributor Author

nvcc_wrapper
It's in packages/kokkos/config

@crtrott
Member

crtrott commented Apr 28, 2016

Matt, I hope you mean that OMPI_CXX=nvcc_wrapper. OMPI_CC and OMPI_FC should be empty.
Also, Brad, nvcc_wrapper is a script that lets us use NVCC more seamlessly as our main compiler. That is necessary because, with Kokkos, CUDA code gets dispersed throughout all of our code bases. It is not isolated to .cu files but pops up in any .cpp (or .C) file that depends on Kokkos.

@bathmatt
Contributor Author

@crtrott correct.

@bradking
Contributor

Okay, now that I understand what compilers are actually being run, I've reproduced this:

$ export PATH=/path/to/Trilinos/packages/kokkos/config:$PATH
$ export OMPI_CC=gcc-4.9
$ export OMPI_CXX=nvcc_wrapper
$ export OMPI_FC=gfortran-4.9
$ export NVCC_WRAPPER_DEFAULT_COMPILER=g++-4.9
$ export CC=mpicc CXX=mpicxx FC=mpifort
$ cmake ../Trilinos -GNinja -DTrilinos_ENABLE_Fortran=OFF -DTrilinos_ENABLE_Gtest:STRING=ON -DTPL_ENABLE_Matio=OFF

$ ninja commonTools/gtest/CMakeFiles/gtest.dir/gtest/gtest-all.cc.o
[1/1] Building CXX object commonTools/gtest/CMakeFiles/gtest.dir/gtest/gtest-all.cc.o

$ ninja -t deps commonTools/gtest/CMakeFiles/gtest.dir/gtest/gtest-all.cc.o
commonTools/gtest/CMakeFiles/gtest.dir/gtest/gtest-all.cc.o: #deps 1, deps mtime 1461877049 (VALID)
    /tmp/tmpxft_000037f8_00000000-14_gtest-all.ii

$ ninja -d explain commonTools/gtest/CMakeFiles/gtest.dir/gtest/gtest-all.cc.o
ninja explain: output /tmp/tmpxft_000037f8_00000000-14_gtest-all.ii of phony edge with no inputs doesn't exist
ninja explain: /tmp/tmpxft_000037f8_00000000-14_gtest-all.ii is dirty
[1/1] Building CXX object commonTools/gtest/CMakeFiles/gtest.dir/gtest/gtest-all.cc.o

This is with a CMake and Ninja that are not modified with my Fortran work. Further investigation will be needed to see how that dependency ends up there.

@bradking
Contributor

Since the discussion here got side-tracked by figuring out how to reproduce the issue, and ended up concluding it is not related to the custom CMake or Ninja versions, I've opened a new issue #321 to record the actual problem. I propose closing this issue in favor of that one, and moving discussion over there.

@bartlettroscoe
Member

FYI: As is being tracked in:

the updated nvcc_wrapper that @bradking provided, which was merged into kokkos/kokkos 'develop' and then snapshotted to Trilinos 'develop' some time ago, resolved many of the rebuild issues. However, some cases still cause unnecessary rebuilds. Kitware will look into this once they can get on hansen or shiller, where they can reproduce the behavior.

After that, the new automated ATDM build of CUDA submitting to the Trilinos CDash site will switch back over to use ninja and will help maintain this capability.

@github-actions

This issue has had no activity for 365 days and is marked for closure. It will be closed after an additional 30 days of inactivity.
If you would like to keep this issue open please add a comment and remove the MARKED_FOR_CLOSURE label.
If this issue should be kept open even with no activity beyond the time limits you can add the label DO_NOT_AUTOCLOSE.
