SAT solvers are something I would like to use more, but I feel like I simply never come across real world problems that they work well with. I've tried to use them for program/circuit synthesis, but the problem was intractable at any scale larger than a tiny proof-of-concept example.
Does anyone have good suggestions for what to use SAT solvers for?
I have used the Z3 solver (https://github.com/Z3Prover/z3) to optimize a job scheduling problem. Roughly, I have ~20k jobs, each defined by N tasks (~1-50). A job is complete only if M of its tasks (typically 2-3) are done. Every task takes some variable amount of time to complete and can only be run during specific windows. Given that the equipment can only perform P tasks in parallel, what is the optimal configuration of tasks (~500k) that will allow us to complete the most jobs?
The search space for this problem is enormous, but Z3 will produce an impressive result within a few minutes.
Amazing. I tried doing this recently, but I guess my modeling was not good enough: it would take hours to find a solution with a couple of dozen variables. Do you have any pointers on how to model these types of problems efficiently with an SMT/SAT solver?
The documentation and API quirks of z3 also led me down some performance sinks before I was able to tackle the larger models. I'm not a Z3/SMT expert, but what worked for me was to model every time tick as the sum of all tasks that could be performed at that instant, with a constraint on each tick that this sum is <= P. Add the bookkeeping for the task-job relationships and that was it.
Using the Python bindings, the biggest performance win I found was ensuring that my jobs and tasks were represented as z3.Bool and not z3.Int (which requires a roundabout way to do a summation over z3.Bools). Something like:
# assumes solver = z3.Solver() (or z3.Optimize()) and that current_tasks
# holds the ids of the tasks that could run in this tick
tasks_this_tick = [(z3.Bool(task_id), 1) for task_id in current_tasks]
solver.add(z3.PbLe(tasks_this_tick, max_tasks_per_tick))  # pseudo-Boolean sum <= P
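For context, here's a toy end-to-end version of that encoding (everything here -- job names, windows, P -- is invented for illustration, not the parent's actual model): Booleans per (task, tick), a PbLe capacity constraint per tick, and an Optimize objective counting completed jobs.

import z3

P = 2           # parallel capacity per tick
ticks = range(4)
# job -> list of (task_id, ticks in which that task is allowed to run)
jobs = {"j1": [("j1_t1", [0, 1]), ("j1_t2", [2, 3])],
        "j2": [("j2_t1", [0]), ("j2_t2", [1, 2])],
        "j3": [("j3_t1", [0]), ("j3_t2", [0, 1])]}

opt = z3.Optimize()
run = {(task, w): z3.Bool(f"{task}@{w}")
       for tasks in jobs.values() for task, windows in tasks for w in windows}

for task, windows in (tw for tasks in jobs.values() for tw in tasks):
    # each task runs at most once
    opt.add(z3.PbLe([(run[task, w], 1) for w in windows], 1))
for tick in ticks:
    now = [(v, 1) for (task, w), v in run.items() if w == tick]
    if now:
        opt.add(z3.PbLe(now, P))  # at most P tasks in parallel per tick

# a job is done iff every one of its tasks runs in some allowed tick
done = [z3.If(z3.And([z3.Or([run[task, w] for w in windows])
                      for task, windows in tasks]), 1, 0)
        for tasks in jobs.values()]
opt.maximize(z3.Sum(done))

if opt.check() == z3.sat:
    print("jobs completed:", opt.model().eval(z3.Sum(done)))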
At a local group we have weekly meetings where we cook dinner. Two people cook, and two people do the dishes.
I used a SAT solver (z3) to find optimal schedules, given that people want the tasks to be distributed fairly, but also given other secondary criteria such as mixing up which pairs cook together, not assigning someone a task two weeks in a row, not giving them the same task twice in a row, etc.
It is pretty easy to come up with an integer encoding to model weekdays, consecutiveness, and homogeneous distribution.
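To make this concrete, here is a minimal z3 sketch of that kind of model (the roster, the four-week horizon, and the rule set are all made up, and it only covers a subset of the criteria mentioned above): Booleans per (person, week, task), exactly two people per task each week, and nobody repeating a task two weeks in a row.

import z3

people = ["a", "b", "c", "d", "e"]
weeks = range(4)
tasks = ["cook", "dish"]
assign = {(p, w, t): z3.Bool(f"{p}_{w}_{t}")
          for p in people for w in weeks for t in tasks}
s = z3.Solver()

for w in weeks:
    for t in tasks:
        # exactly two people per task per week
        s.add(z3.PbEq([(assign[p, w, t], 1) for p in people], 2))
for p in people:
    for w in weeks:
        # nobody does both tasks in the same week
        s.add(z3.Not(z3.And(assign[p, w, "cook"], assign[p, w, "dish"])))
        if w + 1 in weeks:
            for t in tasks:
                # no same task two weeks in a row
                s.add(z3.Not(z3.And(assign[p, w, t], assign[p, w + 1, t])))

if s.check() == z3.sat:
    m = s.model()
    for w in weeks:
        print(w, [(p, t) for p in people for t in tasks
                  if z3.is_true(m[assign[p, w, t]])])

Fairness and pair-mixing criteria can be layered on top with z3.Optimize and soft constraints.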
However, as soon as custom rules and exceptions come in, things start to become infeasible, and we have to invent a best-effort solution, which turns into minimizing or maximizing each human variable.
At this point, some people with higher status want to be favoured, and you get a tug of war instead of a SAT problem: nothing you suggest pleases anyone, because they have lives, can't afford to tell you every little detail that matters to them, and go back to manually picking their dates.
SAT solvers are concerned with satisfying constraints, so you get feasible solutions. A feasible solution merely satisfies the constraints; it does not take objectives into account.
The optimization equivalent is an Integer Program (IP), which gives you optimal solutions for a given objective function while satisfying all constraints. Most complex scheduling problems are solved as IPs or MIPs (Mixed Integer Programs).
Priorities can be modeled as weights. Even if everybody's preferences cannot be met, you can arrive at some sort of weighted compromise (either in a least-squares sense, or with some user-defined distance metric).
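For a small illustration of weights-as-compromise, here is a sketch with PuLP (which bundles the CBC solver mentioned below). Every name and number is made up, and I use a weighted L1 deviation from each person's preferred slot rather than least squares, to keep the model linear.

import pulp

people = ["ann", "bob", "eve"]
weights = {"ann": 1.0, "bob": 1.0, "eve": 2.0}  # eve's preferences count double
slots = [0, 1, 2]
prefers = {"ann": 0, "bob": 0, "eve": 2}        # everyone's preferred slot

prob = pulp.LpProblem("compromise", pulp.LpMinimize)
x = pulp.LpVariable.dicts("x", (people, slots), cat="Binary")  # x[p][t]: p gets slot t

for p in people:
    prob += pulp.lpSum(x[p][t] for t in slots) == 1   # one slot per person
for t in slots:
    prob += pulp.lpSum(x[p][t] for p in people) <= 1  # one person per slot

# objective: weighted distance between the assignment and stated preferences
prob += pulp.lpSum(weights[p] * abs(t - prefers[p]) * x[p][t]
                   for p in people for t in slots)
prob.solve()
for p in people:
    print(p, [t for t in slots if x[p][t].value() > 0.5])

Here ann and bob both want slot 0, so the solver hands one of them slot 1, while eve's doubled weight keeps her in slot 2.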
The best MIP solvers today (e.g. CPLEX, Gurobi, Xpress) solve problems with 100k variables without breaking a sweat (caveat: they do not handle nonconvexities, but good modelers know how to develop linear/convex formulations). They are used in applications ranging from hospital scheduling to airline scheduling to oil refinery scheduling.
In my experience computer scientists typically know about SAT solvers but only a small subset know about MIP solvers. Knowing about the latter expands one's horizons on what's possible.
Cbc [1] is modern, free, and competitive for most non-complex problems like timetabling (say, less than 100k variables). (Cbc is much better than the GNU Linear Programming Kit -- GLPK -- which uses outdated algorithms and is not competitive.) Cbc can also be called from Excel via OpenSolver.
Scip [2] is the most performant "non-commercial" solver, but is only free for non-commercial uses. Commercial uses require licensing.
As for writing MIPs, this FICO MIP formulation guide [3] is extremely well written. Modeling MIPs is something of an art, and this doc captures most of the common formulations.
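For a taste of what such guides cover, here is one classic trick from that family, the fixed-charge/"big-M" formulation linking a continuous quantity to a binary on/off decision, sketched in PuLP with made-up numbers.

import pulp

M = 100  # upper bound on x; keep M as tight as possible for a strong relaxation
prob = pulp.LpProblem("fixed_charge", pulp.LpMinimize)
x = pulp.LpVariable("x", lowBound=0)    # units produced
y = pulp.LpVariable("y", cat="Binary")  # 1 iff the machine is switched on
prob += x <= M * y                      # x can be positive only when y == 1
prob += x >= 30                         # must meet demand
prob += 50 * y + 2 * x                  # fixed start-up cost plus per-unit cost
prob.solve()
print(y.value(), x.value())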
I didn't benchmark it and can't give a speed comparison, but I had success solving a pretty big optimization problem with GLOP, which I believe is part of or-tools.
The entire US power grid commits generators and dispatches them using MIP solvers. This is a pretty enormous problem with hard time constraints. We're talking about millions of constraints.
Edit: well to be more specific, commitment is done with MIP solvers (difficult problem) and dispatch is done with LP solvers (easy and fast).
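For a flavor of what commitment looks like as a MIP, here is a toy single-period sketch in PuLP; the numbers are invented, and real unit commitment adds time periods, ramp rates, reserves, startup costs, and much more.

import pulp

gens = {"coal": dict(pmin=50, pmax=200, cost=20, fixed=500),
        "gas": dict(pmin=10, pmax=100, cost=40, fixed=100)}
demand = 180

prob = pulp.LpProblem("unit_commitment", pulp.LpMinimize)
u = {g: pulp.LpVariable(f"u_{g}", cat="Binary") for g in gens}  # committed?
p = {g: pulp.LpVariable(f"p_{g}", lowBound=0) for g in gens}    # MW dispatched

for g, d in gens.items():
    prob += p[g] <= d["pmax"] * u[g]  # off implies zero output
    prob += p[g] >= d["pmin"] * u[g]  # on implies at least minimum stable output
prob += pulp.lpSum(p[g] for g in gens) == demand
prob += pulp.lpSum(d["fixed"] * u[g] + d["cost"] * p[g] for g, d in gens.items())
prob.solve()
print({g: (u[g].value(), p[g].value()) for g in gens})

With the binaries fixed, what's left is exactly the easy dispatch LP described above.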
Human capriciousness is indeed hard to model, so one solution is not to do it. In timetabling applications, the algorithm provides a level of impartiality, so even if some constituents don't get what they want, you can always blame the algorithm -- "it tried, but it had to satisfy other constraints".
This removes some (but not all) of the politics behind timetabling. Influential persons may still try to wield their political power to bend the algorithm toward favoring them (in algorithmic terms, this could mean boosting their weight from 0.5 to 0.8 to beat out the competition), but even in this compromised state, optimal timetabling is still better than doing it manually -- the optimization algorithm will at least try to maximize the remaining preferences within the remaining degrees of freedom.
Timetabling is an inherently political process anywhere (ask anyone who has ever done the timetable for high school teachers...), but optimization algorithms do bring a level of impartiality to the process. Also, if folks aren't explicit about their preferences, or change their minds, it's no fault of the algorithm.
In corporate deployment of a SAT or even a linear solver one of the bigger challenges is convincing people to solve the SAT/linear problem and then stop.
I'd guess that most corporate deployments of a solver go: (1) solve the easy part, (2) add one extension to try to deal with a difficult part, (3) the deployment finishes because the cost of a second extension is too high, either computationally or in development time.
I think there is a big untapped corporate demand for evolutionary optimisers. They scale a bit closer to problem difficulty as measured by humans, and people get dazzled by the idea of finding an optimal solution to a bad model, rather than an OK solution to an OK model of reality, which is more flexible.
I've always wanted to build a similar system for assigning chores around the house based on everyone "bidding" on how much effort they think each chore takes. In theory, one should be able to come up with an assignment where everyone believes they're doing less than everyone else, by their own metric of work effort.
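A sketch of what that condition could look like with z3 (the two people and their bids are made-up effort estimates; load(p, q) is q's share of the chores as measured by p's own bids):

import z3

people = ["p1", "p2"]
chores = ["dishes", "trash", "vacuum", "laundry"]
# bid[p][c]: how much effort person p thinks chore c takes
bid = {"p1": {"dishes": 5, "trash": 1, "vacuum": 3, "laundry": 4},
       "p2": {"dishes": 3, "trash": 2, "vacuum": 5, "laundry": 2}}

x = {(p, c): z3.Bool(f"{p}_{c}") for p in people for c in chores}
s = z3.Solver()
for c in chores:
    # every chore is assigned to exactly one person
    s.add(z3.PbEq([(x[p, c], 1) for p in people], 1))

def load(valuer, worker):
    # worker's chore load as measured by the valuer's own bids
    return z3.Sum([z3.If(x[worker, c], bid[valuer][c], 0) for c in chores])

for p in people:
    for q in people:
        if p != q:
            # by p's own metric, p's share is no heavier than q's
            s.add(load(p, p) <= load(p, q))

if s.check() == z3.sat:
    m = s.model()
    print({c: next(p for p in people if z3.is_true(m[x[p, c]])) for c in chores})

This is essentially an envy-freeness condition from fair division; with indivisible chores it may be unsatisfiable, in which case z3 simply reports unsat.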
Applying computer science to everyday life is so rare but lovely.
These can be deployed against the Eternity II puzzle. I doubt any SAT solver will find a complete solution in our lifetimes, but a near solution would be satisfying.
I have applied SAT solvers with some success to generative models where sampling methods would not work well, e.g. when you have many soft and hard constraints. Another alternative is to use neural guides.
I've done some program synthesis and equivalence proving with them. It was trivial to express the problem, but nontrivial to express it in a way that would actually get to a solution. I know of no general shortcuts for avoiding the part where it performs terribly for nonobvious reasons, but then I am not a SAT solver expert.
I've also done a bit of work on concolic execution, which has yielded interesting insights but not useful results. I'll probably write something up for hn if/when it does yield results.
As others have said, my first real use of it was for scheduling problems. It works but isn't really better than a hand rolled naive solver for the scale of problem I was looking at.
They seem to be good for proving obvious things about cryptosystems. It's very interesting to see how quickly block length becomes a security parameter for things like Speck.
Overall I like them and they clearly have potent uses, but... often I find myself shrugging, accepting some constraint, and using the more obvious approach. It may be that I just seldom have to solve a hard enough problem with full generality.
Sudoku may be a poor example for illustrating some of the overarching points of this series of blog posts, which I take to be that modern SAT solvers are, as someone described them in a previous thread, "little diamonds of engineering", that they embody decades of research, that you can exploit their power cheaply and easily, and that it might take significant effort to beat them if you have a non-trivial problem.
Since specialized solvers do thrash SAT-based solvers for conventional Sudoku, the author focuses the example less on absolute performance and more on simplicity, rapid prototyping, and comparison to casual non-SAT implementations. But conventional Sudoku is a small problem, and in most cases the SAT-based solver makes so few decisions that it doesn't benefit much from CDCL. Consider the comparison [1] to JCZSolve (ZSolver) that the author mentions in part 1 of the series:
For 17-clue puzzles (which are generally very easy), JCZSolve is 50x as fast as Minisat, and it makes on average ~1.9 decisions per puzzle compared to Minisat's ~3.0. But on the hardest dataset JCZSolve is only 10x as fast as Minisat, and it makes on average ~365 decisions per puzzle compared to Minisat's ~121. I'm not aware of a highly specialized and optimized solver for 16x16 or larger Sudoku, but given this trend my guess is that it would take a lot of effort to write one that's faster than what you get with Minisat for free.
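For reference, the kind of Boolean encoding these comparisons rest on is short enough to sketch in z3's Python API (the clues dict below holds made-up placeholder givens, not a real puzzle):

import z3

# b[r][c][v] is true iff cell (r, c) holds value v + 1
b = [[[z3.Bool(f"b_{r}_{c}_{v}") for v in range(9)]
      for c in range(9)] for r in range(9)]
s = z3.Solver()

for r in range(9):
    for c in range(9):
        # each cell holds exactly one value
        s.add(z3.PbEq([(b[r][c][v], 1) for v in range(9)], 1))
for v in range(9):
    for i in range(9):
        # each value appears exactly once per row and per column
        s.add(z3.PbEq([(b[i][c][v], 1) for c in range(9)], 1))
        s.add(z3.PbEq([(b[r][i][v], 1) for r in range(9)], 1))
    for br in (0, 3, 6):
        for bc in (0, 3, 6):
            # ... and exactly once per 3x3 box
            s.add(z3.PbEq([(b[br + dr][bc + dc][v], 1)
                           for dr in range(3) for dc in range(3)], 1))

clues = {(0, 0): 5, (1, 3): 6}  # (row, col) -> given value; placeholders only
for (r, c), v in clues.items():
    s.add(b[r][c][v - 1])

if s.check() == z3.sat:
    m = s.model()
    for r in range(9):
        print([next(v + 1 for v in range(9) if z3.is_true(m[b[r][c][v]]))
               for c in range(9)])

A real benchmark would feed the equivalent CNF to Minisat directly, but the constraints are the same.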
SAT solvers are starting to be used as tooling for functional languages - Liquid Haskell uses one for refinement types[1], and Djinn uses one for suggesting functions given a type[2]. Similarly to Djinn, Edwin Brady gave a presentation on using an SMT solver to do live code suggestions/implementation inference in the successor to Idris[3].
If you’re looking for a higher-level interface to SAT, SMT, and general constraint solvers, I can recommend MiniZinc: https://www.minizinc.org/index.html
Whenever I read about SAT solvers I think of Prolog and Minikanren and friends. Is there a relationship there? Do they use similar algorithms? I know that backtracking (mentioned in the article) is a thing in Prolog at least.
Yes, there are strong connections: In Prolog terminology, SAT is called CLP(B), constraint logic programming over Boolean variables, and several Prolog systems provide dedicated libraries to solve SAT instances efficiently.
For instance, SICStus Prolog ships with a nice CLP(B) solver:
This particular solver uses Binary Decision Diagrams (BDDs) to model and solve SAT instances.
CLP(B) blends in seamlessly with Prolog in the sense that Prolog variables and predicates are used to ask questions and obtain answers about satisfiability. For instance, we can formulate a Prolog query that asks: “Are there any X and Y such that (X∧Y)∨X is satisfiable?”
?- sat(X*Y + X).
X = 1,
sat(Y=:=Y).
The answer tells us that yes, the expression is satisfiable, if and only if X = 1. Moreover, we see from the answer that the truth value of the expression is then independent of Y, because Y is only constrained to be a Boolean variable (by the tautology Y⇔Y) and not involved in any other constraints.
To obtain concrete assignments that make the expression true, we can use the library predicate labeling/1. On backtracking, all solutions are enumerated:
?- sat(X*Y + X), labeling([X,Y]).
X = 1, Y = 0 ;
X = Y, Y = 1.
There are many different ways and approaches to implement SAT solvers, and this diversity is also reflected in the Prolog implementations that provide them. For instance, in GNU Prolog, SAT solving is provided as a special case of integer constraints.
The paper shows that 20 lines of Prolog code suffice to implement an elegant solver for—as the authors put it—"solving some interesting, albeit modest, SAT instances".
However, SAT is quite a low-level way to encode combinatorial tasks, and for this reason CLP(FD/ℤ), i.e., constraint logic programming over finite domains/integers, is typically used instead of SAT when solving combinatorial tasks with Prolog.
Conceptually, we can think of CLP(ℤ) as a generalization of SAT solving, where variables can range over the entire integers instead of only 0 and 1. Strong and fast constraint solvers over integers are a major attraction of commercial Prolog systems, and often the key reason for buying one.
From a logical perspective, propositional logic can be regarded as a special case of first-order predicate logic, so a language rooted in predicate logic, like Prolog, is also related to SAT solving in this way.
The programming language Mercury, which is a functional, predicate-based, moded language, uses something like a precursor to a SAT solver to "run" the program. That is, you can ask for the input to a function that makes your program true.
I imagine all sorts of things can be done like this, and the SAT solver is useful for verifying correctness.
This is an interesting read, but I'd be interested in a larger-scale benchmark. Right now it could be that the SAT solver is just better optimized for the particular platform, rather than superior algorithm-wise.
Cracking product keys for... uh... research. You reverse engineer the binary, enter the conditions into a SAT solver, and boom. But I don't think most keys are implemented this way these days.
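A hypothetical sketch of the idea with z3 (the check and every constant here are invented for illustration; in practice the constraint comes from the reverse-engineered validation routine):

import z3

key = z3.BitVec("key", 32)
s = z3.Solver()
# pretend reversing showed the binary accepts keys where this holds
s.add(((key * 0x1337) ^ 0xCAFEBABE) == 0x12345678)
if s.check() == z3.sat:
    print(hex(s.model()[key].as_long()))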
Also, if you want cool, check out UC Santa Barbara's team in the DARPA Cyber Grand Challenge a few years ago, an autonomous hacking system. Their framework, angr, makes a lot of use of SAT solvers, although I didn't quite understand everything when I read their technical docs.
Based on my experience, most key-requiring programs use public-key crypto and pin their public key in the binary to verify the signature, which serves as the key.
I've seen some that use public-key crypto, but mostly they do not, because even the smallest secure cryptosystems have signatures on the order of at least 20 bytes (and more conventional choices, like 256-bit ECDSA, are more like 64 bytes) -- which, after conversion to typeable characters, is a bit unwieldy.