Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adjoint sensitivities, fix #876 #917

Closed
wants to merge 20 commits into from

Conversation

paszkow
Copy link
Contributor

@paszkow paszkow commented Jan 21, 2020

Computing adjoint sensitivities by solving a linear system for the steady-state solution of the ode problem.

Łukasz Paszkowski added 6 commits January 19, 2020 14:12
Let both methods have all parameters required by model specified
implementation similarily as it is done for fJ and fJSparse methods.
The NewtonSolver contains a prepareLinearSystemB method that
retrieves and passes the Jacobian JB. This allows solving linear
systems JB*x = rhs.
When a linear system is prepared construct a Jacobian and use it
directly in matrix-vector multiplication instead of mutliple calls
to Model::fJv.
Avoid backwards integration to compute sensitivities when handled
with a steady-state solution. It is enough to solve a linear system.
@FFroehlich
Copy link
Member

This effectively takes away the possibility to compute steady-state adjoint sensitivities for systems with conservation laws, i.e., where the jacobian is singular. I would prefer if this was an optional feature.

@paszkow
Copy link
Contributor Author

paszkow commented Jan 22, 2020

Sure. I was thinking of adding an additional flag to the solver, the same way the preequilibration is set, with the feature being switched off by default. Do you have in mind a suitable name for it you would prefer?

@paulstapor
Copy link
Contributor

paulstapor commented Jan 22, 2020

This effectively takes away the possibility to compute steady-state adjoint sensitivities for systems with conservation laws, i.e., where the jacobian is singular. I would prefer if this was an optional feature.

One could, in a second step, add an exception-handling, like this is done in forward-steady-state computation:

  1. Try to factorize the Jacobian
  2. If not possible, integrate the adjoint state as linear ODE with fixed matrix until steady-state tolerance reached, while carrying out a quadrature over p.

Then, option 2 would just replace option 1 and still be working for singular Jacobians. Would that be a suitable next step? (Not for this PR, but for a later one)?
Right now, the flag is the best solution. Later, this flag can switch between option 1 and 2.
This way, Adjoint-steady-state computation may just circumvent the need of storing and interpolating the forward trajectory, which would still save some time...

I think we can save quite some time by this shortcut. We have some applications of this form now, mostly large-scale systems. And the steady-state case has simply quite some structure, which can still be exploited...

@paulstapor paulstapor self-requested a review January 24, 2020 11:50
Copy link
Contributor

@paulstapor paulstapor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall. Thanks for implementing, but some changes still have to be done.
The flag to switch on/off the new functionality would be nice... Something like NewtonSolverBackward -> true/false might do it...

solver->runB(model->t0());
solver->writeSolutionB(&t, xB, dxB, xQB, this->which);
solver->getDiagnosisB(0, rdata, this->which);
else
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this shouldn't be an else here: If we put an else, this means that we can have either steady-state or time-course data, but not both (which would ideally be possible).
Suggestion: keep the if above, and check for
(it > 0 && model->getTimepoint(it - 1) > model->t0())
instead of the else

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember we talked about it. I am not convinced you can do that or I am missing something. To integrate backwards from t_{N-1} point you need an initial condition and you do not know what the value of p at this point is. You only have its value at t_N.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The value of p at that point is zero before the update, and at the point it gets updated by dJydx, as usual.
Imagine we integrate not from steadystate but a "very late" timepoint: whatever p was there, the time interval from the "very late" to the prelast timepoint is "almost infinte". So enough time for p to equilibrate to its steady-state, which is (necessarily) at 0.

@@ -145,6 +155,52 @@ realtype BackwardProblem::getTnext(std::vector<realtype> const& troot,
return rdata->ts[it];
}

void BackwardProblem::computeIntegralForSteadyState()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd personally like to have this routine as part of steadystateproblem.cpp rather than backwardproblem.cpp,
maybe as workSteadyStateProblemBackward(). This way, we don't need to include newton_solver.h and keep everything which is not somehow concerned with usual ODE-integration out of backwardproblem.

Comment on lines 177 to 196
for (int ip=0; ip<model->nplist(); ++ip)
{
if (model->ndxdotdp_explicit > 0) {
auto col = model->dxdotdp_explicit.indexptrs();
auto row = model->dxdotdp_explicit.indexvals();
auto data_ptr = model->dxdotdp_explicit.data();
for (sunindextype iCol = col[model->plist(ip)];
iCol < col[model->plist(ip) + 1]; ++iCol)
xQB[ip] += xB[row[iCol]] * data_ptr[iCol];
}

if (model->ndxdotdp_implicit > 0) {
auto col = model->dxdotdp_implicit.indexptrs();
auto row = model->dxdotdp_implicit.indexvals();
auto data_ptr = model->dxdotdp_implicit.data();
for (sunindextype iCol = col[model->plist(ip)];
iCol < col[model->plist(ip) + 1]; ++iCol)
xQB[ip] += xB[row[iCol]] * data_ptr[iCol];
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be simplified.

        if (model->ndxdotdp_explicit > 0)
            model->dxdotdp_explicit.multiply(xQB, xB, plist_, False);
        if (model->ndxdotdp_implicit > 0)
            model->dxdotdp_implicit.multiply(xQB, xB, plist_, false);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely.

else
{
for (int ip=0; ip<model->nplist(); ++ip)
xQB[ip] = N_VDotProd(xB.getNVector(), model->dxdotdp.getNVector(ip));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

*
* Valid only if the computed solution is a steady-state.
*/
void computeIntegralForSteadyState();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ideally move to steadystateproblem.h

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will check it out. I need ExpData to generate the rhs of the problem or pass dJydy from the forward problem. That is why I place that in backwardproblem.h. The right-hand side of the problem is constructed for free in handleDataPointB.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good point!
Yet, philosophy-wise, I'd prefer to include ExpData into SteadyStateProblem rather than mixing the steadystate and the adjoint part...

* @param xBdot Vector with the adjoint right hand side (unused)
* @param JB dense matrix to which values of the jacobian will be written
*/
virtual void fJB(const realtype t, realtype cj, const AmiVector &x,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is what causes the CI errors...
Not yet sure what the problem is precisely. Will have a closer look tomorrow

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done @paszkow . If you accept my pull request into your fork, then the CI errors should be fixed now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure I will. I already have all your comments implemented. Need to double-check the correctness of the results I am getting before pushing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be great if you could add these tests as unittests for the proposed, if you need help with that, please let me know!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like a good idea. Will do.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure I will. I already have all your comments implemented. Need to double-check the correctness of the results I am getting before pushing.

At least for the small steady-state example problem, things looked correct when I tested them...

@codecov
Copy link

codecov bot commented Feb 13, 2020

Codecov Report

Merging #917 into develop will decrease coverage by 6.23%.
The diff coverage is n/a.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #917      +/-   ##
===========================================
- Coverage    71.85%   65.62%   -6.24%     
===========================================
  Files           52       17      -35     
  Lines         8431     2906    -5525     
===========================================
- Hits          6058     1907    -4151     
+ Misses        2373      999    -1374
Flag Coverage Δ
#cpp ?
#python 65.62% <ø> (-3.59%) ⬇️
Impacted Files Coverage Δ
python/amici/sbml_import.py 83.71% <0%> (-0.36%) ⬇️
python/amici/ode_export.py 93.57% <0%> (-0.27%) ⬇️
python/amici/petab_objective.py 0% <0%> (ø) ⬆️
python/amici/numpy.py 81.69% <0%> (ø) ⬆️
src/model.cpp
include/amici/solver_idas.h
src/vector.cpp
src/rdata.cpp
src/spline.cpp
include/amici/solver_cvodes.h
... and 29 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d78010f...f8fbf3f. Read the comment docs.

solver->runB(model->t0());
solver->writeSolutionB(&t, xB, dxB, xQB, this->which);
solver->getDiagnosisB(0, rdata, this->which);
if (it>=0 && model->getTimepoint(it) > model->t0())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does model->getTimepoint(it) do if it == -1? Does it return false?
It boils down to some amivector.at(-1), which should be failsafe...
You have tested this, @paszkow ? If it works, I'm fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The timepoints are stored in std::vector which at method accepts values of size_t type. I am not a fan of implicit conversions because you never know what may happen. Safer to keep it this way.

xQB[ip] = N_VDotProd(rhs.getNVector(), model->dxdotdp.getNVector(ip));
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks nice!
Really thanks for your contribution here, @paszkow ! :)

Copy link
Contributor

@paulstapor paulstapor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good!
I would like to double-check stuff. We should also implement some tests... Beyond that, I'm very happy...

@paszkow
Copy link
Contributor Author

paszkow commented Feb 17, 2020

Concerning the unittests that need to be written. I have two questions.

  1. I am getting different results of the gradient depending on the timepoints I set. Setting {1e6, 1e8} as one case and {1e8} as a second test case, result in different gradient values regardless of the sensitivity method. Is it correct?

  2. Recently, I have started obtaining some cvode errors when I set timepoints to {inf} and using adjoint sensitivity analysis with backward integration. I have the same issue on the master branch as well. I am pretty sure I did not see this issue earlier.

[Warning] AMICI:mex:CVODEA:CVodeB:OTHER: AMICI ERROR: in module CVODEA in function CVodeB : The initial time tB0 for problem 0 is outside the interval over which the forward problem was solved. 
[Warning] AMICI:mex:simulation: AMICI backward simulation failed when trying to solve until t = 0.000000 (see message above):
AMICI failed to integrate the backward problem

@paulstapor
Copy link
Contributor

paulstapor commented Feb 17, 2020

Hi

  1. I am getting different results of the gradient depending on the timepoints I set. Setting {1e6, 1e8} as one case and {1e8} as a second test case, result in different gradient values regardless of the sensitivity method. Is it correct?

If I get you correct, it makes perfect sense: The gradient is summed over timepoints. Hence, if you have timepoints at 1e6 and 1e8, you have twice as many (noisy) timepoints as if you only use 1e8.
As you probably generate data from simulations with random noise, it's not to be expected that you get any meaningful relation between the gradient contribution from 1e6 and from 1e8.
If you used exactly the same data point for both timepoints, you should have approximately twice the gradient for the two-point-dataset as for the one-point dataset.

  1. Recently, I have started obtaining some cvode errors when I set timepoints to {inf} and using adjoint sensitivity analysis with backward integration. I have the same issue on the master branch as well. I am pretty sure I did not see this issue earlier.

Yep... does that happen if solver->getNewtonSolverBackward() == false?
As I said, I don't really understand what Amici did in that case beforehand...
But I have a rough idea, why it comes up (I'm just surprised it didn't happen earlier...):
Probably it's due to Amici not calling CVodeF, but doing a Newton solve, and then calling CVodeB.
What happens on master or develop currently?

@paszkow
Copy link
Contributor Author

paszkow commented Feb 17, 2020

Exactly with this setting solver->getNewtonSolverBackward() == false. On other branches, I have the same situation. There, of course, you do not have this change yet so the original backward integration is used. At first, I thought I broke something down, but then checked on other branches and had the same effect. For testing I use model_steadystate.

@paszkow
Copy link
Contributor Author

paszkow commented Feb 17, 2020

The gradient is summed over timepoints.

Yes, now it is clear. I missed that.

@paulstapor
Copy link
Contributor

Exactly with this setting solver->getNewtonSolverBackward() == false. On other branches, I have the same situation. There, of course, you do not have this change yet so the original backward integration is used. At first, I thought I broke something down, but then checked on other branches and had the same effect. For testing I use model_steadystate.

Still, I don't really get it:
Also on the other branches, if the timepoint is inf, it should not have called CVodeF... At least, I don't see how and where...

@paszkow
Copy link
Contributor Author

paszkow commented Feb 17, 2020

Already clarified thanks to @paulstapor.

@FFroehlich
Copy link
Member

Codacy Here is an overview of what got changed by this pull request:

Clones added
============
- src/newton_solver.cpp  2
- include/amici/model_dae.h  2
- include/amici/model_ode.h  2
- src/steadystateproblem.cpp  2
         

See the complete overview on Codacy

@paulstapor paulstapor changed the title Adjoint sensitivities Adjoint sensitivities, fix #876 May 6, 2020
@dweindl dweindl marked this pull request as draft May 30, 2020 13:46
@paulstapor paulstapor closed this Jun 19, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants