
Add reduction using SAT and QBF solvers #407

Open · wants to merge 6 commits into base: devel

Conversation

notValord (Contributor)

Adds an implementation of two new types of reduction that use solvers. The basic idea was to explore whether solvers can be used for reduction and whether it is possible to create minimal nondeterministic automata.

The solvers work with an automaton representation encoded as a CNF formula whose variables represent the automaton's transitions, initial states, and final states. The solver then tries to find an assignment that satisfies the given clauses. The word sets are updated until an equivalent automaton is found. The number of states is increased from the default value, which ensures that the automaton found is minimal.

The reductions work with the structs SatStats and QbfStats, both inheriting from the base struct AutStats, which defines the number of states and symbols of the automaton, the two sets of words, and an output stream. Declarations can be found in include/mata/nfa/types.hh.

The reductions use the same interface as the other reduction algorithms, which are parametrized using the ParameterMap.

Tests will be added shortly in future commits, together with some small fixes. I was not sure about the proper location for the reduction implementation and its structures; since it is another type of reduction, src/nfa/operations.cc seemed appropriate, and the used structures were put into include/mata/nfa/types.hh, but they can be moved if necessary.
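To make the iteration concrete, here is a rough C++-like sketch of the loop the description outlines. The helpers encode_cnf, run_solver, build_automaton, and update_word_sets are hypothetical names for illustration only and do not appear in the PR:

// Sketch of the solver-based reduction loop described above; only the
// control flow follows the text, all helper functions are hypothetical.
Nfa reduce_with_solver(const Nfa& aut) {
    std::set<Word> accept, reject;   // the two word sets mentioned above
    size_t num_states = 1;           // grows from the default value
    while (true) {
        CnfFormula cnf = encode_cnf(num_states, accept, reject);
        std::optional<Model> model = run_solver(cnf);   // SAT or QBF backend
        if (!model) { ++num_states; continue; }  // no automaton of this size fits
        Nfa candidate = build_automaton(*model, num_states);
        if (are_equivalent(candidate, aut)) {
            return candidate;        // first (smallest) size that works => minimal
        }
        update_word_sets(accept, reject, candidate, aut);  // add a counterexample
    }
}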

Adda0 requested review from Adda0 and ondrik on May 29, 2024 07:00
ondrik (Member) commented Jun 3, 2024

We agreed with @notValord to remove the Linux binaries of the SAT/QBF solvers and add the solvers as requirements instead.

Adda0 (Collaborator) commented Jun 5, 2024

Hello. Thank you for the PR. One macOS test seems to be failing. Otherwise, let me know when you finish the changes so that the PR can be reviewed. In general, the implementation looks great to me after a quick glance.

> since it is another type of reduction, src/nfa/operations.cc seemed appropriate, and the used structures were put into include/mata/nfa/types.hh, but they can be moved if necessary.

types.hh should contain only the general data types used by the user of Nfa. If the data structures are used only by the implementation of some algorithm, the types should ideally go as close to the place where they are used as possible, such as in an anonymous namespace inside the source file with the algorithm. Or, if it is too much code, creating a header file with an internal namespace such as plumbing or internal might be appropriate. In this case, unless you plan on using these types elsewhere, I would just place everything in the anonymous namespace. Where do you think these types will be applicable / do you plan to use them?

Adda0 (Collaborator) commented Jun 5, 2024

> We agreed with @notValord to remove the Linux binaries of the SAT/QBF solvers and add the solvers as requirements instead.

Sounds good to me. Ideally, though, I would go with optional requirements. However, we do not currently have a way to specify optional requirements. Maybe for now a note in the comment when selecting the specific algorithm would be enough? That way, everyone can add complex algorithms with dependencies, but the main Mata implementation will remain as small and clean as possible. What do you and anyone else think about this?

ondrik (Member) commented Jun 5, 2024

> Sounds good to me. Ideally, though, I would go with optional requirements. However, we do not currently have a way to specify optional requirements. Maybe for now a note in the comment when selecting the specific algorithm would be enough? That way, everyone can add complex algorithms with dependencies, but the main Mata implementation will remain as small and clean as possible. What do you and anyone else think about this?

I agree with optional requirements and tests (maybe generating a warning instead of a failing test).
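One possible shape for such an optional requirement, as a hedged sketch: the guard macro MATA_ENABLE_SOLVER_REDUCTION below is hypothetical and would have to be wired into the build system; the PR itself does not define it. The signature is the one from the diff further below.

#include <stdexcept>

// Hypothetical compile-time guard for the optional solver dependency.
Nfa mata::nfa::reduce_sat(const Nfa& aut, const ParameterMap& params, bool debug) {
#ifndef MATA_ENABLE_SOLVER_REDUCTION
    (void)aut; (void)params; (void)debug;
    // Fail loudly (or emit a warning in tests) when solver support is absent.
    throw std::runtime_error("reduce_sat: built without SAT/QBF solver support");
#else
    // ... solver-based reduction as implemented in this PR ...
#endif
}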

Adda0 (Collaborator) left a comment

I did a pass through the code. There are some comments about possible refactoring and some performance fixes. We should discuss and address them before merging the PR.

#define TSEY_NOT -3

// define charactes used for SAT and QBF solvers
#define SOL_EOL std::string("0\n")
Adda0 (Collaborator):

Use only string literals here for the macros. Otherwise, the strings will have to be allocated on every single use during runtime. Furthermore, it would be ideal if these were defined as constexpr const char* SOL_EOL, or by using std::string_view and string-view literals, instead of macros.
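For instance, a minimal sketch of the string-view alternative, keeping the name from the macro above:

#include <string_view>

// Evaluated at compile time; no runtime allocation per use.
constexpr std::string_view SOL_EOL = "0\n";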

@@ -48,6 +50,315 @@ struct Nfa; ///< A non-deterministic finite automaton.
/// An epsilon symbol which is now defined as the maximal value of data type used for symbols.
constexpr Symbol EPSILON = Limits::max_symbol;

// defining indexes of logic operators used in an input vector for tseytin transformation
#define TSEY_AND -1
Adda0 (Collaborator):

I would go for clarity here and use TSEYTIN_{AND,OR,NOT}. Furthermore, to ensure proper usage, I would define these as constexpr int TSEYTIN_{AND,OR,NOT} = <value>;.

Are these indices and the SOL_xxx characters meant to be used by the users of the library, or are they only needed inside the library algorithms? If it is the latter, I would move the definitions to the appropriate files and enclose them in as small a scope as possible. For example, TSEY_AND is currently only used in src/nfa/operations.cc; therefore, it should be defined in an anonymous namespace inside that file. Similarly for SOL_xxx.
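A sketch of what that placement could look like. Only AND = -1 and NOT = -3 are visible in the diff hunks above; the value of TSEYTIN_OR is assumed here:

// src/nfa/operations.cc
namespace {
    // Markers for logic operators in the Tseytin-transformation input vector.
    constexpr int TSEYTIN_AND = -1;
    constexpr int TSEYTIN_OR  = -2;  // assumed value, not shown in the diff
    constexpr int TSEYTIN_NOT = -3;
} // anonymous namespace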

* @param[in] output Output stream
* @return The first variable index that was not utilized
*/
size_t reduction_tseytin(const std::vector <int>& input, size_t max_index, std::ostream& output);
Adda0 (Collaborator):

Should this method be called tseytin_transformation() instead of a reduction? Furthermore, if I am seeing it right, this method does not use the underlying automaton in any way, so it can be a utility function, not even a method of Nfa, no? It could probably be stored somewhere in the utils/ folder.
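For illustration, the relocated utility could be declared roughly as follows; the header path and namespace are hypothetical:

// include/mata/utils/tseytin.hh (hypothetical location)
#include <cstddef>
#include <iosfwd>
#include <vector>

namespace mata::utils {
    /// Tseytin transformation of @p input; returns the first unused variable index.
    size_t tseytin_transformation(const std::vector<int>& input, size_t max_index,
                                  std::ostream& output);
}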

@@ -582,10 +619,13 @@ inline bool is_included(const Nfa& smaller, const Nfa& bigger, const Alphabet* c
* @param[in] alphabet Alphabet of both NFAs to compute with.
* @param[in] params[ Optional parameters to control the equivalence check algorithm:
* - "algorithm": "naive", "antichains" (Default: "antichains")
* @param[out] default_runs Optional parameter for intersection runs to return the found word
Adda0 (Collaborator):

I think it is wasteful to pass by reference and have a statically initialized pair of runs, which will get modified when the equivalence check is run a second time, so they will no longer be the default runs.

I think the global variable for the default runs should be removed entirely, and the parameter here should instead be a pointer to a pair of runs, default-initialized to nullptr.
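A sketch of the suggested signature, approximated from the doc comment above (the exact parameter list may differ):

// nullptr by default: no counterexample runs are reported and no global
// default-runs object is needed.
bool are_equivalent(const Nfa& smaller, const Nfa& bigger, const Alphabet* alphabet,
                    const ParameterMap& params = {{ "algorithm", "antichains" }},
                    std::pair<Run, Run>* runs = nullptr);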

* @param[in] debug Flag for debug output as the reduction can take some time to get the info about the current status
* @return Reduced automaton
*/
Nfa reduce_sat(const Nfa &aut, const ParameterMap& params = {{"solver", "sat"}}, bool debug = false);
Adda0 (Collaborator):

I think that both of the specific reduction algorithms should be hidden behind the general reduce() function. That is, they should not be in the public API of Nfa, and should only be accessible in the anonymous namespace of operations.cc. Or, since we are starting to have way too many reduction algorithms, we could create a new source file specifically for reductions, reduction.cc, with potentially a new header file reduction.hh, which can be included when one wants to call the specific reduction algorithms. Creating new source and header files seems like the best approach to me.
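A rough sketch of that layout. The "solvers" algorithm value and the internal helper names are illustrative, not taken from the PR; only the "solver" parameter with values "sat"/"qbf" appears in the diff:

// src/nfa/reduction.cc (proposed new file)
#include <stdexcept>
#include <string>

namespace {
    // The solver-backed reductions become internal to this translation unit.
    Nfa reduce_sat_internal(const Nfa& aut, const ParameterMap& params);
    Nfa reduce_qbf_internal(const Nfa& aut, const ParameterMap& params);
}

Nfa mata::nfa::reduce(const Nfa& aut, const ParameterMap& params) {
    const std::string& algorithm = params.at("algorithm");
    if (algorithm == "solvers") {
        // "solver" selects the backend: "sat" or "qbf".
        return params.at("solver") == "qbf" ? reduce_qbf_internal(aut, params)
                                            : reduce_sat_internal(aut, params);
    }
    // ... dispatch to the existing reduction algorithms ...
    throw std::runtime_error("reduce: unknown algorithm " + algorithm);
}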

size_t mata::nfa::SatStats::example_clauses(size_t max_index) {
size_t transitions_num = this->alpha_num * this->state_num * this->state_num;

for(auto word: this->accept) {
Adda0 (Collaborator):

This will create a local copy of every vector. Use auto& when you want a reference to an auto-deduced data type. This appears in multiple places around the code. If possible, try to find all of them and use references.
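That is, for the loop shown above:

// Before: copies every Word on each iteration.
for (auto word : this->accept) { /* ... */ }

// After: binds a const reference, no copies.
for (const auto& word : this->accept) { /* ... */ }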

return max_index;
}

void mata::nfa::SatStats::recurse_tseytin_accept(const std::vector<int>& base, size_t state, Word word, const unsigned pos,
Adda0 (Collaborator):

word should be passed by reference here, no? Similarly in other functions: if possible, have a look at the function declarations and see whether some arguments could be passed by reference instead.
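For the declaration above, that would mean the following (sketch; trailing parameters elided):

// Before: Word is copied on every recursive call.
void recurse_tseytin_accept(const std::vector<int>& base, size_t state,
                            Word word, const unsigned pos, /* ... */);

// After: pass by const reference when the callee does not modify the word.
void recurse_tseytin_accept(const std::vector<int>& base, size_t state,
                            const Word& word, const unsigned pos, /* ... */);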

/**
* Base class for representing the input parameters for SAT and QBF reduction
*/
struct AutStats {
Adda0 (Collaborator):

The implementation of all of these structs should live somewhere else. These types should not be used by the user of the library, correct? Therefore, I think creating a header file and a source file for reduction (reduction.hh and reduction.cc) and moving the code of all reductions and all these structs inside is a reasonable approach.
