Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ShyLU and Ifpack2 : Add FastILU and TriSolve to Trilinos #197

Closed
srajama1 opened this issue Mar 9, 2016 · 47 comments
Closed

ShyLU and Ifpack2 : Add FastILU and TriSolve to Trilinos #197

srajama1 opened this issue Mar 9, 2016 · 47 comments

Comments

@srajama1
Copy link
Contributor

srajama1 commented Mar 9, 2016

@srajama1 @trilinos/shylu @trilinos/ifpack2

@srajama1 srajama1 self-assigned this Mar 9, 2016
@srajama1 srajama1 added pkg: Ifpack2 pkg: ShyLU stage: ready The issue is ready to be worked in a Kanban-like process labels Mar 9, 2016
@brian-kelley brian-kelley self-assigned this Jul 7, 2017
@brian-kelley brian-kelley added the stage: in progress Work on the issue has started label Jul 7, 2017
@brian-kelley
Copy link
Contributor

@srajama1 @jhux2 Are FastILU, FastIC and FastLDL all supposed to be options for Ifpack2::RILUK and AdditiveSchwarz, or just FastILU?

@mhoemmen
Copy link
Contributor

@brian-kelley AdditiveSchwarz has a parameter which is a sublist, which it passes directly to the subdomain solver. AdditiveSchwarz should not take any subdomain solver parameters directly; they should all go into the sublist.

@srajama1
Copy link
Contributor Author

@brian-kelley : All three are useful preconditioner optiions. To your question on #1476 , please feel free to change the template parameters to to be consistent with other Ifpack2 preconditioners.

@egboman
Copy link
Contributor

egboman commented Jul 10, 2017

@srajama1 Shouldn't FastLDL be renamed FastILDL for consistency?

@brian-kelley
Copy link
Contributor

@egboman I can do that. The class is called LDL even though the shylu files are "shylu_fastildl", so I'll change the class name.

@egboman
Copy link
Contributor

egboman commented Jul 10, 2017

Thanks, @brian-kelley It is a small thing but a bit annoying.

@brian-kelley
Copy link
Contributor

@mhoemmen @srajama1 What is the idiomatic way to have a Scalar in a parameter list? All three preconditioners take "Scalar omega" and "Scalar shift". I was thinking of just printing to and parsing from a string, but I don't know if all scalars support operator<< and operator>> ...

@jhux2
Copy link
Member

jhux2 commented Jul 10, 2017

@mhoemmen @srajama1 What is the idiomatic way to have a Scalar in a parameter list? All three preconditioners take "Scalar omega" and "Scalar shift". I was thinking of just printing to and parsing from a string, but I don't know if all scalars support operator<< and operator>> ...

@brian-kelley Do the omega and shift need to be general? In other solvers, I've found such generality often isn't required. If this is the case here, then you can just hard code it to double.

@brian-kelley
Copy link
Contributor

@jhux2 Ah, I just saw that the existing fast * tests do read them from argv as double, so really the parameters shouldn't be Scalar at all.

@mhoemmen
Copy link
Contributor

@brian-kelley wrote:

What is the idiomatic way to have a Scalar in a parameter list? All three preconditioners take "Scalar omega" and "Scalar shift"

It depends on whether they could ever be complex valued. If so, then best practice would be to test first Scalar, then Teuchos::ScalarTraits<Scalar>::magnitudeType, then double. Not sure if the syntax is exactly right below, so treat it as pseudocode:

const std::string paramName ("whatever the name is");
if (params->isParameter<Scalar> (paramName)) {
  // ... get the parameter as a Scalar value ...
}
else if (params->isParameter<typename Teuchos::ScalarTraits<Scalar>::magnitudeType> (paramName)) {
  // ... get the parameter as a magnitudeType ...
}
else if (params->isParameter<double> (paramName)) {
  // ... get the parameter as a double ...
}
else {
  // ... whatever you might need to do if parameter was not provided, 
  // or if parameter was provided but has the wrong type ...
}

@srajama1
Copy link
Contributor Author

@brian-kelley : I need to think a little bit about this. You can hard code it to double for making progress, but note that we want to do Block version of FastILU soon, which might require block version (or we could use the same shift for an entire block). This needs some thinking.

BTW, I am not getting all updates from github. For example, I didn't get your update but got Jonathan's update.

@egboman
Copy link
Contributor

egboman commented Jul 10, 2017

@srajama1 : I think a global shift by shift*identity should be the default behavior, but doing some kind kind of block matrix shift might be useful, too.

@brian-kelley
Copy link
Contributor

@srajama1 If we need a shift vector instead of just a scalar, we could have another parameter of type "Array(double)" - that's a common pattern in param lists. Above I was concerned about was dealing with many different possible Scalar types, but for now I will just expect double.

@srajama1
Copy link
Contributor Author

@brian-kelley : Let us go with shift as double. We don't have strong use case for variable shifting within blocks. I am not able to find anything in the literature about its usefulness too. Let us just keep it as double for now.

@brian-kelley
Copy link
Contributor

@srajama1 Does "nTriSol" mean "number of triangular solves"? I would like a more verbose name to use as the Ifpack2 parameter name. I used "sweeps" for "nFact", to be consistent with other Ifpack2 classes.

@brian-kelley
Copy link
Contributor

@jhux2 What would a sensible default shift parameter be? The proceedings paper example shows values in the range 0 to about 0.2, and says that a higher value can help convergence issues, but also somewhat increases the number of CG iterations. I was thinking of just using 0.2.

@jhux2
Copy link
Member

jhux2 commented Jul 10, 2017

@srajama1 @egboman Can either of you two advise @brian-kelley on the values for the defaults?

@egboman
Copy link
Contributor

egboman commented Jul 10, 2017

@brian-kelley : Let us go with shift as double. We don't have strong use case for variable shifting within blocks. I am not able to find anything in the literature about its usefulness too. Let us just keep it as double for now.

Is the shift absolute or relative? In the latter case, it would automatically be scaled according to the diagonal entries but I guess both approaches are valid.

@egboman
Copy link
Contributor

egboman commented Jul 10, 2017

@jhux2 What would a sensible default shift parameter be? The proceedings paper example shows values in the range 0 to about 0.2, and says that a higher value can help convergence issues, but also somewhat increases the number of CG iterations. I was thinking of just using 0.2.

The default should be zero. The user needs to turn this on explicitly.

@egboman
Copy link
Contributor

egboman commented Jul 10, 2017

I see the shift is relative in the source, so no need for vector valued shifts.

@srajama1
Copy link
Contributor Author

@brian-kelley : The sweeps here are slightly different in their behavior than the traditional sense, but I am fine with calling it the sweeps. nTriSol is number of iteration in triangular solve.

@brian-kelley
Copy link
Contributor

@srajama1 @jhux2 I have everything set up in Ifpack2 except I can't get cmake to give me a "HAVE_IFPACK2_FASTILU" define. I have "ShyLUFastILU" as an optional dependency in ifpack2/cmake/Dependencies, and I added this to ifpack2/CMakesLists.txt:

TRIBITS_ADD_OPTION_AND_DEFINE(
  ${PACKAGE_NAME}_ENABLE_ShyLUFastILU
  HAVE_${PACKAGE_NAME}_FASTILU
  "Enable FastILU as Additive Schwarz inner preconditioner."
  ${${PROJECT_NAME}_ENABLE_ShyLUFastILU}
  )

which I think should define "HAVE_IFPACK2_FASTILU" whenever I configure with FastILU enabled. There's no configure error, I just don't get the define.

@srajama1
Copy link
Contributor Author

@brian-kelley : I sent you an e-mail. This is easier explained in person.

@jhux2
Copy link
Member

jhux2 commented Jul 12, 2017

@srajama1 @jhux2 I have everything set up in Ifpack2 except I can't get cmake to give me a "HAVE_IFPACK2_FASTILU" define.

@brian-kelley Could it be as simple as ${PACKAGE_NAME} is "Ifpack2", and that ${PACKAGE_NAME_UC} is "IFPACK2"?

@brian-kelley
Copy link
Contributor

@jhux2 I just talked to @srajama1 in person, I was missing the cmakedefine line in Ifpack2_config.h.in but everything works now. Thanks.

@brian-kelley
Copy link
Contributor

@jhux2 If the FastILU preconditioners are going to be used with AdditiveSchwarz, they need to take A as a Tpetra::RowMatrix instead of a Tpetra::CrsMatrix (the underlying matrix from AdditiveSchwarz could be a LocalFilter, an OverlappingRowMatrix, or something else).

If I changed Fic/Filu/Fildl to take RowMatrix, I would need to extract values/rowptrs/colinds one row at a time from the RowMatrix before passing them to the FastILU::Fast*Prec constructor (which just takes raw Kokkos Views). Does this sound like it would be too slow, and is there some better alternative?

@jhux2
Copy link
Member

jhux2 commented Jul 12, 2017

@brian-kelley One option is to dynamic_cast to see if the actual matrix type is one that supports getting raw pointers. But if it is a matrix that doesn't provides the pointers, I don't see another option other than using getrow().

@mhoemmen
Copy link
Contributor

@brian-kelley It could also be a LocalFilter of a CrsMatrix. You may want to test if that could actually occur in practice. If so, there's a faster way to get the entries out.

@brian-kelley
Copy link
Contributor

@mhoemmen Are you referring to LocalFilter::getUnderlyingMatrix()?

@srajama1
Copy link
Contributor Author

@brian-kelley : There are some advantages in keeping the implementation as CrsMatrix specific ones. I would rather not modify FastILU to support row matrix. Instead let the Ifpack2 wrapper use get row when getting the pointers is not possible. Can we also add a timer for this portion, so we understand the overhead in not having CrsMatrices ? Also, this needs to be careful separated between initialize and compute. All the allocations, pointer initialization, column index copies for the new arrays can happen in initialize and the compute is an embarrassingly parallel (row wise) copy of the values array (assuming getrow can be called inside a functor).

@jhux2
Copy link
Member

jhux2 commented Jul 12, 2017

There are some advantages in keeping the implementation as CrsMatrix specific ones. I would rather not modify FastILU to support row matrix. Instead let the Ifpack2 wrapper use get row when getting the pointers is not possible. Can we also add a timer for this portion, so we understand the overhead in not having CrsMatrices ? Also, this needs to be careful separated between initialize and compute. All the allocations, pointer initialization, column index copies for the new arrays can happen in initialize and the compute is an embarrassingly parallel (row wise) copy of the values array (assuming getrow can be called inside a functor).

@srajama1 So in other words, copy the matrix into a CRS format if necessary?

@brian-kelley
Copy link
Contributor

@jhux2 Yes, they will be copied to three (vals, rowptr, indices) locally owned, device space Kokkos Views which are passed directly to the ShyLU class that actually implements the solver. I'll do this in initialize() and it seems straightforward. Thanks!

@srajama1
Copy link
Contributor Author

srajama1 commented Jul 12, 2017

@jhux2: Yes, while taking into account symbolic and numeric steps.

@brian-kelley : The values array cannot be copied in initialize(). Users can call compute() if just the values change, but the structure remains change. We will copy values and call FastILU again in compute for that case.

@mhoemmen
Copy link
Contributor

mhoemmen commented Jul 12, 2017

The matrix that Ifpack2::AdditiveSchwarz passes to the subdomain solver is always an Ifpack2::LocalFilter of something. You may want to consider writing either a fast "copy the local matrix out of LocalFilter" function, or changing the factorization so it can take 4-array CSR (which is more or less how LocalFilter works). You could also try changing how AdditiveSchwarz works, if you don't like the previous options.

The LocalFilter design was inherited from Ifpack.

@brian-kelley
Copy link
Contributor

@mhoemmen I think the best way to do this is to write a function that deep copies the 3-array CRS out of a LocalFilter. What is the fast way to get data out of a LocalFilter that you were referring to? I would like to use getLocalRowView() (or something equivalent) inside a functor but it's not a KOKKOS_INLINE_FUNCTION.

@mhoemmen
Copy link
Contributor

@brian-kelley Best thing would be to imitate LocalFilter's implementation. It works more or less like 4-array CSR. You could exploit the "row view" idea that KokkosSparse::CrsMatrix has.

@brian-kelley
Copy link
Contributor

I just submitted the PR for this (pull #1509). @mhoemmen, would you like me to add you as a reviewer?

@mhoemmen
Copy link
Contributor

@brian-kelley Sure, please do!

@jhux2
Copy link
Member

jhux2 commented Aug 5, 2017

@brian-kelley Just curious what the status of this is, and PR #1509.

@brian-kelley
Copy link
Contributor

@jhux2 PR #1509 is ready to be merged. I made all the changes and squashed everything into 1 commit. @srajama1 said last week that he would give it one final look and then merge it.

@jhux2
Copy link
Member

jhux2 commented Aug 5, 2017

@brian-kelley Great, thanks for the update!

@srajama1
Copy link
Contributor Author

srajama1 commented Aug 5, 2017

@brian-kelley : I started reviewing it and got distracted. I will get to it.

@brian-kelley
Copy link
Contributor

@srajama1 I'm not in any rush because I'm optimizing and profiling matrix addition right now :)

@jhux2
Copy link
Member

jhux2 commented Sep 11, 2017

@srajama1 Would you have a chance to look at this soon?

@srajama1
Copy link
Contributor Author

@jhux2 : Done one more review. Looks ready to be pushed.

@srajama1
Copy link
Contributor Author

@brian-kelley : Can we close this ?

@brian-kelley
Copy link
Contributor

@srajama1 Yes, I've merged this with the checkin script, so I'm closing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants