Multithreading Propagators #166

MaxOstrowski · 2019-09-02T09:19:14Z

I have a problem running my propagator with several threads.

During ClingoControl::prepare my propagator's ::propagate and ::check methods are called.
In some examples I do use PropagateControl::add_literal to add new literals.
As far as I understand, these literals are thread specific. Since prepare is only executed on the master solver only there these literals are added. So it would be bad if these literals are thread specific and not copied to the other solvers. (We would have clauses talking over literals that are not in the new threads)

When starting the multi-threaded solve operation (ParallelSolve::solveParallel) the threads are copied, but it seems that the variables added via PropagateControl::add_literal are not copied.
While doing SharedContext::attach the master trail is forced on the new solver and it tries to assign these illegal variables (resulting in an array out of bounds exception (solver_types.h ll. 674)).

PS: Still need to work on a showcase example, but maybe you already have some thoughts.

rkaminsk · 2019-09-02T09:49:25Z

Looks like a problem with interface and implementation. With the current implementation, you should not add literals during initialization. On the other hand, the interface does not allow you to detect that the system is currently initializing.

Right now, it should be safe to add literals when propagate is called above the root level.

Some possible fixes:

Maybe it is enough to skip such literals when initializing the other solvers? If only the level 0 part of the trail is synced to other threads, then auxiliary literals can simply be skipped. (If higher levels are synced, literals implied by auxiliary literals could be turned into decision literals. Lets hope this does not happen. 😄)
We could provide a method to check if the system is in the initialization phase. This would require propagators to implement special cases during this phase. It is easy to implement but puts load on people implementing propagators.
We could register propagators after the solvers are ready. This is completely different from what is implemented at the moment and probably comes with all kinds of problems.

@BenKaufmann what do you think?

BenKaufmann · 2019-09-02T18:45:14Z

@rkaminsk AbstractSolver::addVariable() was indeed never meant to be called during initialization and its implementation in clasp (via. Solver::pushAuxVar()) actually has (documented) UB if called during initialization. I guess fix Nr. 1 should be possible to implement but I'll have to take a closer look
in order to check which parts of initialization currently actually rely on the fact that there are no volatile variables during init. Might take a couple of days...

rkaminsk · 2019-09-02T22:58:38Z

I put some code below where I think it should be okay to add variables. Maybe @MaxOstrowski can already work with this?

void Propagator::propagate(PropagateControl &ctl, LiteralSpan span) { 
    auto a = ctl.assignment();
    if (a.decision_level() > a.root_level()) {
        // here it should be safe to add variables
    }
    else if (a.decision_level() > 0) {
        // is it safe to add variables here?
    }
    else {
        // here it is not safe to add variables
        // because level 0 might be propagated during initialization
    }
}

BenKaufmann · 2019-09-07T18:10:39Z

@rkaminsk Your else if branch should be safe because initialization only ever happens on decision level 0 and while lookahead might temporarily add a decision level, external propagators are only called after lookahead has reached a fixpoint.

rkaminsk · 2019-09-07T18:37:54Z

Thanks. I am still a bit puzzled about the whole business of adding variables because I never wrote a propagator using this interface. Adding variables on level zero sounds like something one would want to do in a preprocessing step. Maybe adding a method to add a (static) variable in Propagate::init would be better here? If @MaxOstrowski describes a bit how he intends to add variables during propagation, I could try to come up with a small propagator as a test case for clingo.

PS: Since we have the interface, it would be great to also be able to add volatile variables during propagate on level 0. But I am just saying this for the sake of having an easy to use interface; I don't have actual use cases in mind.

BenKaufmann · 2019-09-07T20:03:30Z

I added a commit to clasp's dev branch that hopefully fixes this issue. To this end, I slightly postpone the first propagate/check call until after the problem is actually frozen. This way, propagators can now safely add aux variables even on decision level 0 and these aux variables are correctly treated as volatile and solver-local.

rkaminsk · 2019-09-07T21:23:03Z

I updated clasp in clingo's wip branch. Up-to-date binaries are available on anaconda in potassco's dev channel and on our university computers. @MaxOstrowski please test.

I was thinking to also add an add_literal to clingo's PropagateInit. @BenKaufmann can I use clasp_->ctx()->master->assignment()->addVar() before clasp_->prepare() to add a static variable?

BenKaufmann · 2019-09-08T08:22:52Z

@rkaminsk Nope. You cannot use clasp_->ctx()->master->assignment()->addVar() because a) the assignment() returns a ref to a const Assignment and b) it would result in a state where only a small fraction of relevant data structures know about the new variable.

To add a new static variable, you would have to call SharedContext::addVar(). However, for performance reasons, adding variables is actually a two-step process. Calling SharedContext::addVar() only adds a new entry to some primary data structure but does not yet make the variable known to the master solver. To do so, you have to call SharedContext::startAddConstraints() and this has to be done before you can add constraints over new variables.

rkaminsk · 2019-09-08T08:59:51Z

Can SharedContext::startAddConstraints() be called multiple times? What do you think is the best strategy to implement adding variables/constraints in PropagateInit?

Call startAddConstraint (possibly multiple times) before adding constraints if variables have been added.
Simply store the constraints and add them after all variables have been added.

BenKaufmann · 2019-09-08T14:09:00Z

Yes, SharedContext::startAddConstraints() can be called multiple times, so the easiest solution would be to call startAddConstraint() right after a new variable was added. I would go with approach 2, though, caching all constraints in one vector of null-terminated clauses.

rkaminsk · 2019-09-08T14:46:40Z

That should be easy enough to implement. The only annoying thing is that PropagateInit::add_clause no longer needs a Boolean as return then because whether the clause is conflicting is checked later. Maybe I will just put clauses over newly added literals in that list.

Edit: Or I add all clauses later just as you are suggesting and return simply return true. The return value can be removed in the next minor release. This has the advantage that the assignment does not change during propagator initialization. As a result, the interface will be easier to use.

rkaminsk · 2019-09-08T22:31:41Z

@BenKaufmann: I updated the interface in 45327d5. Can you please have a look if this makes sense?

MaxOstrowski · 2019-09-09T08:26:34Z

Hi guys, thanks a lot for your efforts. I will test this.

I'm not quite sure clingcon could avoid adding variables on decision level 0.
It is adding variables for integer variables if (there are also other cases) the assignment is complete.
So the program:

&sum {x} > 0.
&sum {x} <= 10.

contains only facts and therefore the assignment is complete. To enumerate solutions, clingcon introduced e.g. "x<=5" as a free variable to find an assignment for x during the "check" callback.
This case could be circumvented by doing so in the "init" callback. But if my theory atoms are not fact, but can be made fact by preprocessing, this could get problematic.

There exist also other cases, like propagation that adds clauses on not yet existing variables.

&sum {x} > 0.
&sum {x} <= 10.
&sum {y} > 0.
&sum {y} <= 10.
&sum {x} <= 3 :- a.
&sum {x+y} > 5.

If preprocessing makes "a" true, the initial propagation should add a clause
"not y <= 1" and therefore variables "y <= 1".
Hope this usecase helps to understand the problem.

MaxOstrowski · 2019-09-10T09:58:11Z

OK, I tried this new commit and my system is broken with the new behaviour even with one thread as I relied on some assumptions I think.

What triggers my assertions is the following scenario:
On the very first "check" callback in my propagator i do add a new variable "X" and also a unit asserting clause for it.
After calling control.propagate() to update clasp's propagation (and do some more propagation on my side) the new variable does not seem to be assigned. Is this the intended behaviour? Or have I done something wrong/misinterpreted stuff.

rkaminsk · 2019-09-10T11:22:42Z

@MaxOstrowski ideally you would provide a small example to reproduce. Per se the behavior does not sound wrong. If you add a unit clause, the solver might have to backtrack to actually incorporate it.

Also note that "no further calls on a control object or functions on the assignment should be called when the result of propagate is false." I did not explicitly write it, but the same should hold true if add_clause returns false.

MaxOstrowski · 2019-09-10T12:04:31Z

Sorry, my bad. I meant a clause with exactly one free literal. All other literals are false. The decision level (of everything) is 0. I will work on an example and get back to you. Thanks.

And yes, neither propagate nor add_clause returned false of course.

rkaminsk · 2019-09-10T12:19:37Z

Sorry, my bad. I meant a clause with exactly one free literal. All other literals are false. The decision level (of everything) is 0.

In this case it really depends on what is implemented in clasp. I could imagine that it is possible to delay propagation until the check call has returned. You could try with a small example yourself but I think we also need some info from @BenKaufmann.

BenKaufmann · 2019-09-10T13:08:15Z

Here is my guess what happens: The clauses you add are volatile. This means that clasp has to tag them with the step literal Ts for the current step s. A unit clause {x} therefore becomes {x, ~Ts} behind the scenes. Since we are logically still in "preparation mode" (i.e. solving of the current step has not yet started), Ts is not yet assumed to be true. Hence, {x ~Ts} is (not yet) a unit-resulting clause. This used to "work" (for some broken definition of work) because previously Ts was still equal to the unique True literal at this point, meaning that your volatile clause actually became static.

MaxOstrowski · 2019-09-10T13:21:11Z

I just finished my example test propagator for this. And this is exactly the behaviour. One additional literal (2) is introduced but not yet assigned. This is probably the step literal.
Thanks. I think I will have to make a list of minimal assumptions and check what is available in the API again redoing my initialization phase.
Still its hard to getting used to that adding clauses/variables behaves differently in "some" propagate/check calls. It would be really nice if a propagator could do some initial preparation (like init) where it can also use the features of clasps propagation engine.

rkaminsk · 2019-09-10T14:12:14Z

To me it makes total sense to not imply volatile literals on level zero because these literals are retracted at the end of a solving step. It sounds like you want to add literals that are not volatile so that you can add static clauses over them. You can do this during initialization in Propagator::init but you won't get full propagation at this point. It is delayed until after initialization. I don't know - maybe it is possible to request a unit propagation (excluding post propagation, i.e., calls to Propagator::propagate) during initialization.

MaxOstrowski · 2019-09-10T14:21:26Z

Exactly this would be my Plan B 🥈

I came up with a little test propagator and found a weird behaviour.
https:/MaxOstrowski/propagator-test

After adding a variable in the "init" callback i do get an error when trying to retrieve its truth value.
I do get why, but still an undocumented feature :)

A unit propagation method during the init would be awesome 👍 , as much more knowledge could be made static. Especially if you combine several propagators that only do a translational approach.

rkaminsk · 2019-09-10T14:37:39Z

After adding a variable in the "init" callback i do get an error when trying to retrieve its truth value.
I do get why, but still an undocumented feature :)

With the current implementation it is the right thing to happen. Let's see. If it is possible to do a unit propagation, then you could access its truth value after propagation.

MaxOstrowski · 2019-09-10T14:38:53Z

Sound good. But maybe unknown would be a better return (of false for is_true is_false) instead of on exception.
One open question would be if the propagation during initialization already triggers the just added watches of the current propagator.

rkaminsk · 2019-09-10T14:44:23Z

Sound good. But maybe unknown would be a better return (of false for is_true is_false) instead of on exception.

I rather think it's about documenting what is to happen.

One open question would be if the propagation during initialization already triggers the just added watches of the current propagator.

I think that the answer is to delay propagation until after initialization. That's why I said that post propagation should not be called. Just unit propagation. This will keep things simple.

MaxOstrowski · 2019-09-10T14:59:38Z

Approved. Would be cool if this would be possible. The problem with the watches could be that after calling propagate in init that the propagator will not get notified about the newly derived literals.
It has to check all literals afterwards to not miss any propagation, as it will never be notified about the decided literals.
You are right. I still have the opinion that simply returning false for is_true / is_false would be a much better behaviour. Currently clingo throws "*** ERROR: (clingo): Invalid literal".
But this literal is not invalid. It can even be used in add_clause, so why call it invalid. No ?

MaxOstrowski · 2019-09-12T11:48:17Z

Sorry for taking so long. After fixing some raid conditions I can approve that my tests now also run in multi-threading mode. Thanks.

rkaminsk · 2019-09-12T12:20:37Z

Thanks. Let's wait for @BenKaufmann feedback about the propagation before we close this. And no hurry. 😄

I am a bit skeptical because we want to do something here clasp has not been designed for. We want to have volatile variables as in propagation during the initialization phase; with the difference that those variables should become static in the end.

BenKaufmann · 2019-09-13T07:34:32Z

Could you please summarize which solution you would prefer in the end? From the clasp-side of things, I guess it would be relatively straightforward to postpone calls to Propagator::propagate() and Propagator::check() until either preprocessing is complete or even until the step-literal is assigned. The latter might be a little bit problematic because we could then end up with assignments that logically belong to the top-level but that only happen on decision level 1 (i.e. the decision level of the step literal). Adding static variables and clauses during preprocessing and doing unit propagation on them is no problem. However, you won't necessarily get unfounded set propagation because that is only enabled very late in the preprocessing phase (currently after Propagator::init()).

Anyhow, let's first clearly specify what is needed/wanted.

rkaminsk · 2019-09-13T07:57:26Z

I think @MaxOstrowski (correct me if I am wrong/or refine) want's to propagate the clauses he is adding during initialization right away. These clauses might refer to newly introduced variables. Each propagation might reduce the size of clingcon's domains reducing the number of order variables that have to be introduced while going on. Here some pseudo-code:

def init(self, init):
    for atom in init.theory_atoms:
        # how many clauses and literals are added depends on the assignment
        # and internal state of the propagator
        self.add_literals_clauses_and_watches(init, atom)
        # could propagate anything as long as it does not call back into this propagator
        # I was suggesting clasp's unit propagation
        init.propagate()

MaxOstrowski · 2019-09-13T08:09:00Z

@rkaminsk
Sounds about right. Although I would need some more time experimenting (first, without propagation) if every necessary function is available.
E.g. the backend() cant be used during the init callback, right ?
@BenKaufmann Let me check some things first before you start with an effort.
And yes, unfounded set propagation is not necessary, just basic unit propagation to get most of the static knowledge.

rkaminsk · 2019-09-13T08:13:23Z

The backend cannot be used during initialization, you are stuck with clauses (clasp's weight constraints could also be made available). If you want to use the backend, then we would need incremental propagation on the logic program. And this is a whole other topic.

MaxOstrowski · 2019-09-13T11:52:41Z

Actually the backend is not needed but adding weight constraints and atoms to the minimize statement would be all I need. Shall I will open a feature request?

rkaminsk · 2019-09-13T12:36:35Z

It should also be possible to add literals to the minimize constraint - during initialization but not while solving.

I'll open another issue to track the implementation progress once we came to a conclusion what to actually implement.

MaxOstrowski · 2019-09-13T12:46:37Z

Perfect!
During init I will have much more information about the program that I can profit from (although no propagation nor preprocessing was done yet if I'm right, but this is getting messy with step literals anyway).

Btw: would it make sense to automagically call ctl.cleanup() on every ground() call ? This way gringo can profit without any user interaction. Or is there any reason not to do so.

rkaminsk · 2019-09-13T12:53:45Z

To me it looks like you can already get started by using the backend. It allow you to add clauses, weight constraints, and minimize constraints. It just offers less preprocessing. I am saying this because it will probably take some time to implement the above. It should be possible to write the code so that it can be easily ported.

Btw: would it make sense to automagically call ctl.cleanup() on every ground() call ? This way gringo can profit without any user interaction. Or is there any reason not to do so.

It could be called after every solve call. It should be reasonably fast in practice (linear in the size of gringo's atom base). It is not called be automatically to leave it up to the user whether to simplify or not.

MaxOstrowski · 2019-09-13T13:03:34Z

Currently (the old clingcon version and still the current wip) does exactly this. It uses the backend to do all preprocessing, domain shrinking etc..
The disadvantage is, that I have merely any knowledge about the atoms (i still have to figure out why i do not even get facts but several different literals for the theory atoms).
Putting the preprocessing (currently done in the new prepare callback in "thingo" that you added) into the "init" and using the init interface to do all this would be great, as clasp has at least read the program and did very simply simplifications. So i can at least figure out the facts.
Changing the interface from backend to PropagateInit is no problem, as it is already wrapped in clingcon.

rkaminsk · 2019-09-13T13:19:07Z

The disadvantage is, that I have merely any knowledge about the atoms (i still have to figure out why i do not even get facts but several different literals for the theory atoms).

You might get facts. There is just no way to determine this with the current interface (without using a program observer). I can check if it is possible to add an is_fact property to the theory atoms just as with the symbolic atoms. Or maybe add an is_fact(literal) method to the control object. But I have to check if I have the information to do this (just because the solver says it is true does not mean it is a fact).

MaxOstrowski · 2019-09-13T13:55:57Z

If weight constraints and minimize atoms work during init i'm fine.
Everything that is fixed during init is fixed forever, right ?

rkaminsk · 2019-09-13T14:04:14Z

If weight constraints and minimize atoms work during init i'm fine.
Everything that is fixed during init is fixed forever, right ?

Yes, assignments on level 0 cannot be undone, and literals and constraints added during init are static.

MaxOstrowski · 2019-09-13T14:06:06Z

👯‍♀️

BenKaufmann · 2019-09-21T10:32:05Z

@rkaminsk I slightly extended the capability of Clasp::ClauseCreator in such a way that it now automatically calls SharedContext::startAddConstraints() if necessary. That is, whenever a clause to be added contains a problem variable that has not yet been added to the master solver. However, clauses still must not contain variables that were not previously added to the problem.

This should allow you to immediately add all clauses (thereby simplifying 45327d5) and also to add a propagate() function to ClingoPropagateInit that would do unit propagation. Unfortunately, there is a caveat though: propagation won't do unit-propagation if sat-preprocessing is enabled because in that case, clauses are initially not added to the solver but to a preprocessor, which does not support (incremental) propagation.

Finally, please note that while the new Clasp::ClauseCreator hides some complexity, one should still keep in mind that adding vars is a rather costly two-step process. Hence, users should still try to add variables in batches before adding clauses over them.

MaxOstrowski · 2019-09-23T11:43:56Z

OK, i found out that I only postponed the problem with the current fix.
The problem is the step literal which is added "behind the scenes".
Currently my propagator is adding clauses until it rans into an conflicting state (integer domain is empty). Therefore posting a constraint which should be conflicting. Due to the step literal, this conflict is not immediately recognized by clasp and propagation continues (which would make my propagator start propagating in an invalid state).
I could of course reject any propagation attempts in this invalid state and wait until clasp has assigned the step literal, detecting the conflict, calling "undo" to resolve the invalid state.

My new question is now regarding multi-shot solving.
Once clasp has detected the conflict and reverted the step literal a new solving process can start with a new step literal, right ? Given this new solving step, will I be able to redo the same propagation again (remember, the conflict was actually independent of the step literal), so will the change list (handed to "propagate") again contain all literals assigned on decision level 0 (the same assignment that I already processed the first time).

Hope that makes any sense ?

rkaminsk · 2019-12-06T13:39:16Z

This issue is already too convoluted. Please open a separate issue for unrelated questions. The development of the PropagateInit interface will be tracked in #183.

MaxOstrowski mentioned this issue Sep 2, 2019

Multi Threading Seqfault max/wip potassco/clingcon#10

Closed

rkaminsk added bug enhancement labels Sep 13, 2019

rkaminsk added this to the v5.4.1 milestone Sep 13, 2019

rkaminsk self-assigned this Sep 13, 2019

rkaminsk closed this as completed Dec 6, 2019

Multithreading Propagators #166

Multithreading Propagators #166

Comments

MaxOstrowski commented Sep 2, 2019

rkaminsk commented Sep 2, 2019 • edited Loading

BenKaufmann commented Sep 2, 2019

rkaminsk commented Sep 2, 2019

BenKaufmann commented Sep 7, 2019

rkaminsk commented Sep 7, 2019

BenKaufmann commented Sep 7, 2019

rkaminsk commented Sep 7, 2019

BenKaufmann commented Sep 8, 2019

rkaminsk commented Sep 8, 2019

BenKaufmann commented Sep 8, 2019

rkaminsk commented Sep 8, 2019 • edited Loading

rkaminsk commented Sep 8, 2019

MaxOstrowski commented Sep 9, 2019

MaxOstrowski commented Sep 10, 2019 • edited Loading

rkaminsk commented Sep 10, 2019

MaxOstrowski commented Sep 10, 2019 • edited Loading

rkaminsk commented Sep 10, 2019 • edited Loading

BenKaufmann commented Sep 10, 2019

MaxOstrowski commented Sep 10, 2019

rkaminsk commented Sep 10, 2019

MaxOstrowski commented Sep 10, 2019

rkaminsk commented Sep 10, 2019

MaxOstrowski commented Sep 10, 2019

rkaminsk commented Sep 10, 2019

MaxOstrowski commented Sep 10, 2019

MaxOstrowski commented Sep 12, 2019

rkaminsk commented Sep 12, 2019

BenKaufmann commented Sep 13, 2019

rkaminsk commented Sep 13, 2019 • edited Loading

MaxOstrowski commented Sep 13, 2019

rkaminsk commented Sep 13, 2019

MaxOstrowski commented Sep 13, 2019 • edited Loading

rkaminsk commented Sep 13, 2019 • edited Loading

MaxOstrowski commented Sep 13, 2019

rkaminsk commented Sep 13, 2019 • edited Loading

MaxOstrowski commented Sep 13, 2019 • edited Loading

rkaminsk commented Sep 13, 2019

MaxOstrowski commented Sep 13, 2019

rkaminsk commented Sep 13, 2019 • edited Loading

MaxOstrowski commented Sep 13, 2019

BenKaufmann commented Sep 21, 2019

MaxOstrowski commented Sep 23, 2019 • edited Loading

rkaminsk commented Dec 6, 2019

rkaminsk commented Sep 2, 2019 •

edited

Loading

rkaminsk commented Sep 8, 2019 •

edited

Loading

MaxOstrowski commented Sep 10, 2019 •

edited

Loading

MaxOstrowski commented Sep 10, 2019 •

edited

Loading

rkaminsk commented Sep 10, 2019 •

edited

Loading

rkaminsk commented Sep 13, 2019 •

edited

Loading

MaxOstrowski commented Sep 13, 2019 •

edited

Loading

rkaminsk commented Sep 13, 2019 •

edited

Loading

rkaminsk commented Sep 13, 2019 •

edited

Loading

MaxOstrowski commented Sep 13, 2019 •

edited

Loading

rkaminsk commented Sep 13, 2019 •

edited

Loading

MaxOstrowski commented Sep 23, 2019 •

edited

Loading