PanzerAdaptersIOSS_tIOSSConnManager tests failing in ATDM 'cee-rhel6' builds #3632
Comments
@gsjaardema, it looks like the behavior of SEACAS/Exodus changes when using the custom I talked with @rppawlo and he said that if this is not something that you can help fix, then he is okay with just disabling these tests in ATDM Trilinos builds. |
@rppawlo, just to confirm with you, since none of the ATDM APPs are using the code in But I am still concerned that the SPARC way of configuring Trilinos/SEACAS using the magic |
I will try to take a look on Monday. |
yes - fine to disable, though am hoping @gsjaardema can work this out today. |
Okay, let's wait to see if @gsjaardema can get to the bottom of this since I fear this might break EMPIRE. |
FYI: I passed info to @bathmatt to test out EMPIRE to see if it has any new failing tests related to Exodus with this different SEACAS/Exodus NetCDF configuration. |
Question for whoever knows: it looks like we are using a very old version of NetCDF here (4.4.0), even though SPARC has a newer version of NetCDF available (4.6.1). Is there a valid reason for using the old version? Many bugs have been fixed and enhancements added between 4.4.0 and 4.6.1. |
@bartlettroscoe what is meant by "magic FindNetCDF.cmake" ? What is the "non-magic" method and is one better than the other? |
@gsjaardema, the "non-magic" method is just a raw listing of header files and libraries as shown in: which is what the EMPIRE Trilinos configuration does. Note that if you switch to using that approach, these Panzer tests pass but some SPARC tests fail. As for the version of NetCDF, we need to consult with @micahahoward and @sebrowne. |
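As a rough illustration of what a "raw listing of header files and libraries" amounts to, here is a minimal sketch. It is not the actual EMPIRE configuration: the helper `explicit_tpl_args` and the install prefix are hypothetical, and the `TPL_<name>_INCLUDE_DIRS` / `TPL_<name>_LIBRARIES` cache variable names are assumed to follow the usual TriBITS convention.

```python
from typing import List


def explicit_tpl_args(tpl_name: str, include_dirs: List[str],
                      libraries: List[str]) -> List[str]:
    """Build CMake -D arguments that list a TPL's headers and libraries
    explicitly, instead of relying on a Find<Name>.cmake module.

    Assumption: the variable names follow the TriBITS convention
    TPL_<name>_INCLUDE_DIRS / TPL_<name>_LIBRARIES.
    """
    return [
        f"-DTPL_ENABLE_{tpl_name}=ON",
        # CMake lists are semicolon-separated.
        f"-DTPL_{tpl_name}_INCLUDE_DIRS={';'.join(include_dirs)}",
        f"-DTPL_{tpl_name}_LIBRARIES={';'.join(libraries)}",
    ]


# Hypothetical install prefix, for illustration only.
PREFIX = "/hypothetical/tpls/netcdf"

NETCDF_ARGS = explicit_tpl_args(
    "Netcdf",
    include_dirs=[PREFIX + "/include"],
    libraries=[PREFIX + "/lib/libnetcdf.a"],
)

if __name__ == "__main__":
    for arg in NETCDF_ARGS:
        print(arg)
```

With this style of configuration nothing is discovered at configure time, so any compile definitions or extra link libraries that a find module would otherwise set have to be supplied by hand. That difference is one plausible place for the two configurations to diverge.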
@rppawlo It looks like the failing tests are using a pamgen-generated mesh with no exodus input/output. If that truly is the case, then I am confused as to why a different NetCDF configuration process would affect the testing results since there should be no NetCDF functions being called at all during the testing. I have verified that Not sure what is the issue yet, but just making sure I was not missing something on the tests that were being run. |
@bartlettroscoe Question -- in the configuration section above, we use |
@gsjaardema, as explained at: it means to use the Kokkos serial threading model. |
@bartlettroscoe RE: non-magic building. How do I do a build on a cee-rhel6 machine using the "non-magic" build configuration? |
@gsjaardema - that's part of our confusion. A change to the detection of NetCDF should not change the numbering of this test. I suspect that the FindNetCDF module is defining a CMake flag that may change a define in how IOSS does numbering. Can you point me to where the FindNetCDF code exists? |
The FindNetcdf.cmake module is in
My hypothesis so far is that the differences have something to do with the |
@gsjaardema, you would just have to edit your local copy of Trilinos and change the file |
@bartlettroscoe I thought I could handle the non-magic build, but am unable to get it to pass the tests, so I must be doing it wrong. If you could create a topic branch with a cache var for me to use, that would be appreciated. |
@gsjaardema, okay, let me create the topic branch and test to make sure it is doing the right thing then I will push and point to it. |
This allows you to switch to the EMPIRE way of pulling in the HDF5 and Netcdf TPLs. This was added to aid in debugging the apparent change in behavior of SEACAS when using the SPARC way vs. the EMPIRE way of specifying the HDF5 and Netcdf TPLs (see trilinos#3632).
@gsjaardema, I created the PR #3632 that provides the toggle Interestingly, these tests failed with that configuration as well. That is not my memory but I may be mistaken. Looking at the history for the tests What is it about this SPARC env and TPLs that is causing these tests to fail? It seems it is not the SPARC way of using the custom |
…empire-netcdf-hdf5-config Automatically Merged using Trilinos Pull Request AutoTester PR Title: Add env var ATDM_CONFIG_USE_SPARC_TPL_FIND_SETTINGS (#3632) PR Author: bartlettroscoe
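The toggle named in the PR title can be pictured with the sketch below. Only the variable name `ATDM_CONFIG_USE_SPARC_TPL_FIND_SETTINGS` comes from the PR; the default value, the set of accepted spellings, and both helper functions are assumptions made for illustration, not the actual ATDM configuration logic.

```python
from typing import Mapping

# Name taken from the PR title; everything else here is hypothetical.
TOGGLE = "ATDM_CONFIG_USE_SPARC_TPL_FIND_SETTINGS"

# Assumed spellings for "on"; the real scripts may accept different values.
_TRUE_VALUES = {"1", "ON", "TRUE", "YES"}


def use_sparc_find_settings(env: Mapping[str, str], default: bool = True) -> bool:
    """Return True if the SPARC-style find settings should be used.

    Assumption: an unset or empty variable falls back to `default`.
    """
    raw = env.get(TOGGLE, "").strip()
    if not raw:
        return default
    return raw.upper() in _TRUE_VALUES


def describe_tpl_mode(env: Mapping[str, str]) -> str:
    """Name the TPL configuration style selected by the environment."""
    if use_sparc_find_settings(env):
        return "SPARC way: custom find modules for HDF5 and Netcdf"
    return "EMPIRE way: explicit header and library lists for HDF5 and Netcdf"


if __name__ == "__main__":
    import os
    print(describe_tpl_mode(os.environ))
```

Keeping the switch in one environment variable lets the same build script be run both ways on the same machine, which is what makes an A/B comparison of the two TPL configurations practical.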
@gsjaardema, this issue about the PanzerAdaptersIOSS tests may not be related to the |
Does using different versions of pnetcdf produce different mesh decompositions? If so, these tests should fail, since they are tied to a particular decomposition. |
@jbcarleton, the version of PnetCDF should have no effect on the decomposition. |
@gsjaardema, any idea what could be causing the different behavior of SEACAS with these TPLs? How can we go about debugging this? NOTE: We should hopefully find out if this will also impact EMPIRE in the next few days. |
@rppawlo It seems there is not enough momentum on this issue. Should we disable the tests on the 'cee-rhel6' builds? |
@mperego said
FYI: I have been waiting to run the EMPIRE builds against this 'cee-rhel6' configuration to see if the failure in this test might indicate a change in behavior of SEACAS that would break SPARC. |
It's fine to disable. |
FYI: As documented in TRIL-242, I verified that once the tweak to the 'cee-rhel6' SPARC ATDM Trilinos configuration in PR #4054 is merged to 'develop', EMPIRE builds and runs all of its tests just fine. Therefore, it seems that these failing |
With the merge of PR #4079 to 'develop' on 12/19/2018, these tests should now be disabled in the 'cee-rhel6' builds. In fact, we already can see that these tests are missing in some 'cee-rhel6' builds as shown, for example, in the build Trilinos-atdm-cee-rhel6-gnu-4.9.3-openmpi-1.10.2-serial-static-opt today. Unfortunately, due to the crashes of the Trilinos autotester, PR #4079 did not merge until after the first 'cee-rhel6' build ran so these tests still failed today as shown here. |
Looks like these have all been disabled, as shown in the table below (with data taken from CDash). Adding the "Disabled Tests" label to filter this issue out of our main queries. @rppawlo, do you want to keep this issue open with the "Disabled Tests" label or just close it? If there are no plans to try to fix this anytime soon, we might as well close it, in my opinion. We need to leave the "Disabled Tests" label on this so we can find it if we want to, but otherwise we could close it.
Tests with issue trackers Missing: twim=16 (on 2018-12-20) |
Fine with closing. Its priority was dropped and we will not address it anytime soon. |
Closing. Thanks! |
Don't know why the trigger of turning on extra stuff causes these tests to fail but it was determined that fixing these is not worth it so we disable them. See trilinos#3632.
…s:develop' (7db7806). * trilinos-develop: (23 commits) Fix cmake-file error in stk_balance that was making the m2n exe be a test. tpetra: minor fix; return the values Fix incorrect line length in copy_string change Automatic snapshot commit from seacas at f9bf59a SEACAS: cgns - support self-looping models Disable failing ROL test already known to fail in CUA builds (trilinos#3543) Disable known failing Panzer tests (trilinos#3632) Small formatting change to comment (trilinos#3939) Enable SPARC TPLs and packages on 'waterman' (ATDV-151) ShyLU/FROSch: Correct use of booleans for interface components Don't allow Trilinos-atdm-white-ride-cuda-9.2-gnu-7.2.0-rdc-release-debug-pt to run on 'ride7' (ATDV-155) tpetra: minor additional deprecations trilinos#4839 MiniEM: Fix discrete gradient tpetra: changes to address Mark's comments on trilinos#4839 Xpetra: MueLu: fix issue 4038 ShyLU/FROSch: Use insertGlobalValues instead of insertLocalValues for GlobalCoarseMatrix stokhos: fix compilation error due to tpetra deprecation changes Thyra: fixed compilation error due to deprecation changes tpetra: More deprecations of function arguments involving Node. create*MapWithNode generate_miniFM_* Tpetra: removing Node from argument lists of functions Completed MatrixMarket_Tpetra functions (readSparse, readDense, etc.) Also removed a few compiler warnings reported in clang ...
…s:develop' (7db7806). * trilinos-develop: (30 commits) Fix cmake-file error in stk_balance that was making the m2n exe be a test. Tpetra: Global Ordinal validation tpetra: minor fix; return the values Fix incorrect line length in copy_string change Tpetra: Moved GORDS logic to right file this time, really. Tpetra: GORDS Deprecation Cleanup Tpetra: Relocated # GORDS validation logic to packages/tpetra/core/CMakeLists.txt Tpetra: clean up deprecation WIP tags Tpetra: Add deprecations for global ordinal types Automatic snapshot commit from seacas at f9bf59a SEACAS: cgns - support self-looping models Disable failing ROL test already known to fail in CUA builds (trilinos#3543) Disable known failing Panzer tests (trilinos#3632) Small formatting change to comment (trilinos#3939) Enable SPARC TPLs and packages on 'waterman' (ATDV-151) ShyLU/FROSch: Correct use of booleans for interface components Don't allow Trilinos-atdm-white-ride-cuda-9.2-gnu-7.2.0-rdc-release-debug-pt to run on 'ride7' (ATDV-155) tpetra: minor additional deprecations trilinos#4839 Ifpack2 - fix issue 4858 MiniEM: Fix discrete gradient ...
CC: @trilinos/panzer , @mperego (Trilinos Discretizations Product Lead), @bartlettroscoe
Next Action Status
EMPIRE works just fine against these 'cee-rhel6' builds (see TRIL-242), so these failing tests are not indicative of any problems for EMPIRE. With the merge of PR #4079 to 'develop' on 12/19/2018, these tests are now disabled in the 'cee-rhel6' builds and were shown to be missing on 12/20/2018.
Description
As shown in this query the tests:
are failing in the builds:
Current Status on CDash
To see the current status of these tests on CDash, click on the below link:
NOTES:
Steps to Reproduce
One should be able to reproduce this failure on any CEE LAN RHEL6 SRN as described in:
More specifically, the commands given for the system CEE LAN RHEL6 SRN are provided at:
The exact commands to reproduce this issue should be: