An approximate Max#SAT solver by Daniel Fremont, Markus Rabe, and Sanjit Seshia. For descriptions of the Max#SAT problem and our algorithm, see our AAAI 2017 paper here.
The solver consists of two Python scripts, which should be placed in the same folder.
- maxcount.py: main solver script
- selfcomposition.py: utility script to construct self-compositions
Your system must have an installation of the CryptoMiniSat Python bindings, available here.
The solver expects a binary scalmc present in the same directory implementing the UniGen2 algorithm of Chakraborty, Fremont, Meel, Seshia, and Vardi and the ApproxMC2 algorithm of Chakraborty, Meel, and Vardi. The particular implementation used in our experiments is based on a prototype by Mate Soos and Kuldeep Meel which is pending publication. In the mean time we provide binaries for several platforms in the scalmc-binaries folder. If you cannot use any of the available binaries, or experience any other problems, please contact us.
Alternatively, older implementations of UniGen2 and ApproxMC2 are available here and here (although not integrated into a single binary).
The benchmarks folder contains benchmarks from our AAAI 2017 paper. To test that MaxCount is set up correctly, you can try for example:
python maxcount.py benchmarks/SyGuS/IssueServiceImpl.dimacs 2
This particular benchmark should take around 30 seconds and return an estimate of 1.964 x 2^28
(for reproducibility, MaxCount is deterministic; you can change the random seed with the option --seed
).
MaxCount accepts Max#SAT problems in a simple extension of the standard DIMACS CNF format.
c max 1 2 3 4 5 0
c ind 6 7 12 0
p cnf 17 18
3 1 0
4 -1 0
Lines beginning c max
specify a list of maximization variables, terminated with a 0
.
Lines beginning c ind
likewise specify a set of counting/summation variables.
All variables not mentioned in such lines are existentially quantified.
Running MaxCount is straightforward. The basic command is
python maxcount.py formula.cnf k
where formula.cnf
is a Max#SAT problem (in the format described above) and k
is the number of copies of the formula to use in the self-composition (see our paper; in general, as k
increases from 0 runtime increases and better results are obtained).
The solver outputs the estimated max-count, the corresponding witness (assignment to the maximization variables), and probabilistic bounds on the max-count.
v -1 2 3 4 -5 0
c Estimated max-count: 1.875 x 2^8
c Max-count is <= 1.72649 x 2^12 with probability >= 0.6
c Max-count is >= 1.875 x 2^7 with probability >= 0.9801
MaxCount accepts a variety of options, which can be viewed using the option -h
. Many are primarily used for tuning the performance of the solver (see below), but three are used for specifying the desired tolerance and confidence of the solver's output:
--countingTolerance
controls how accurately each witness' count is estimated; assumingk
is large enough that good witnesses are being sampled, further improvements in accuracy require decreasing this parameter--upperBoundConfidence
specifies the confidence desired for the max-count upper bound; increasing this does not affect the runtime, but weakens the bound (i.e. the bound MaxCount returns is the strongest one true with at least the desired confidence given the outcome of the sampling and counting)--lowerBoundConfidence
likewise specifies the confidence desired for the max-count lower bound; increasing this can increase the runtime
MaxCount may return bounds with higher confidences than those specified by the options above, for example when a count can be computed exactly.
MaxCount exposes many parameters useful in optimizing solver performance on particular types of problems. Adjusting these parameters is always safe: the probabilistic bounds output by MaxCount are computed in such a way that they are correct for any values of the parameters. Thus you are free to experiment with the settings to find which values give you the strongest bounds or the best performance.
Sampling is done using UniGen2, unless k
is zero (in which case we sample uniformly at random from all assignments to the maximization variables).
The number of samples to generate is specified by the option --samples
.
Increasing the number of samples may yield better results when k
is small; conversely, if k
is large enough then good results may be obtained with only a few samples.
The sampling tolerance may be adjusted with the option --samplingKappa
(see the UniGen2 paper for the definition of kappa).
Using a looser tolerance (larger kappa) may improve sampling performance at the cost of decreased accuracy (unless the number of samples is correspondingly increased).
A variety of techniques are used for projected counting:
- brute force: testing every possible assignment to the counting variables for satisfiability
- Monte Carlo sampling: testing many random assignments
- enumeration: enumerating satisfying assignments up to a threshold
- hashing: ApproxMC2
The first three techniques are simpler and faster than hashing, and brute force and enumeration can yield exact counts. Thus MaxCount attempts them when possible, as this can often improve performance and accuracy while typically adding little overhead. The precise procedure for deciding which techniques to use is as follows:
- We check if the formula is UNSAT, returning immediately if so.
- If the number of samples that would be used for Monte Carlo sampling is larger than the total number of assignments to the counting variables, we instead brute force test every assignment for satisfiability, obtaining an exact count.
- Otherwise we do Monte Carlo sampling, obtaining an estimate of the solution density.
- If the density is low enough that the count might be below the enumeration threshold, we try enumeration (thereby possibly obtaining an exact count).
- If the density is high enough that it yields a count estimate accurate to within the required multiplicative error, we return that estimate.
- Otherwise we count using hashing.
The number of Monte Carlo samples and the enumeration threshold to use can be specified by the options --monteCarloSamples
and --enumerationThreshold
.
These techniques can be disabled entirely by setting the corresponding options to zero (if both are disabled, MaxCount will always use hashing).