Modern SAT solvers: fast, neat and underused (2018) (codingnest.com)
226 points by weird_user on May 26, 2023 | 90 comments




Love this article and the push to build awareness of what modern SAT solvers can do.

It's worth mentioning that there are higher level abstractions that are far more accessible than SAT. If I were teaching a course on this, I would start with either Answer Set Programming (ASP) or Satisfiability Modulo Theories (SMT). The most widely used solvers for those are clingo [0] and Z3 [1]:

With ASP, you write in a much clearer Prolog-like syntax that does not require nearly as much encoding effort as your typical SAT problem. Z3 is similar -- you can code up problems in a simple Python API, or write them in the smtlib language.
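
To give a flavor: graph coloring in ASP is just a few lines. A minimal sketch, assuming the clingo Python package (the predicate names here are made up for illustration):

    import clingo  # pip install clingo

    program = """
    node(1..4).
    edge(1,2). edge(2,3). edge(3,4). edge(4,1).
    color(red; green; blue).
    { assign(N,C) : color(C) } = 1 :- node(N).  % each node gets exactly one color
    :- edge(A,B), assign(A,C), assign(B,C).     % adjacent nodes never share one
    """

    ctl = clingo.Control()
    ctl.add("base", [], program)
    ctl.ground([("base", [])])
    ctl.solve(on_model=print)  # prints one answer set, i.e. a valid coloring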

Both of these make it easy to add various types of optimization, constraints, etc. to your problem, and they're much better as modeling languages than straight SAT. Underneath, they have solvers that leverage all the modern CDCL tricks.
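
For instance, adding an objective in Z3 is one extra call. A hedged sketch with the z3-solver Python package (the toy constraints are invented):

    from z3 import Ints, Optimize, sat

    x, y = Ints("x y")
    opt = Optimize()
    opt.add(x >= 0, y >= 0, x + y <= 10)  # feasibility constraints
    opt.maximize(3 * x + 2 * y)           # objective bolted on, no re-encoding
    if opt.check() == sat:
        print(opt.model())                # e.g. [y = 0, x = 10]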

We wrote up a paper [2] on how to formulate a modern dependency solver in ASP; it's helped tremendously for adding new types of features like options, variants, and complex compiler/arch dependencies to Spack [3]. You could not get good solutions to some of these problems without a capable and expressive solver.

[0] https://github.com/potassco/clingo

[1] https://github.com/Z3Prover/z3

[2] https://arxiv.org/abs/2210.08404, https://dl.acm.org/doi/abs/10.5555/3571885.3571931

[3] https://github.com/spack/spack


There are also Constraint Programming solvers (some SAT based, some not) and (Mixed) Integer Programming solvers (not SAT based).

Each "school" excels at different types of problems. ASP for modelling a knowledge-base and running queries on it, CP for discrete optimization problems or for all-solution search, SMT for formal verification and proofs, MIP for optimization of (mostly) continuous variables.

Modern solvers in these "schools" can do things traditionally meant for other "schools". Z3 can do optimization, clingo can include CP-style constraints with clingcon, some MIP solvers can find all solutions, etc.

All in all, this type of "classical" AI is super interesting and I hope the hype on LLMs doesn't suck up all the funding that would go to research in this area.


Plug for my constraint solver, if anyone wants a simple example: https://github.com/lifebeyondfife/Decider


The work on [2] is fascinating to me, both because of the problem domain and as a case study on the effective application of ASP. I will be reading this paper carefully to pore over the details.


Do you have a recommendation for how to get into ASP? I've read the clingo docs, but it has never clicked.


I read Potassco's Answer Set Solving in Practice book [0] but it's pretty dense. I suspect it would be easier to digest if you read it while also following their course materials, which are all online [1].

These days I recommend people start with the Lifschitz book [2] and read through the Potassco book [0]. Lifschitz's book is a much gentler introduction to ASP and logic programming in general and its examples are in ASP code (not math). It's also more geared towards the programming side than the solving side, which is probably better for most people until they really want to understand what clingo/gringo/clasp are doing and what their limitations are.

There are other more applied courses, like Adam Smith's Applied ASP course at UCSC [3]. The problems in that course look like a lot of fun.

[0] https://potassco.org/book/

[1] https://teaching.potassco.org

[2] https://www.cs.utexas.edu/users/vl/teaching/378/ASP.pdf, https://www.amazon.com/Answer-Set-Programming-Vladimir-Lifsc...

[3] https://canvas.ucsc.edu/courses/1338


I second the recommendation to start with Lifschitz and move on to the Potassco book from there. To add: One does not need to know Prolog to get into ASP, the semantics are unique and more minimal. That said, I personally struggled with ASP before it clicked, it takes time to grasp the lingo and grok the semantics if you have never worked with something similar. Best to have a guide that introduces the concepts one at a time ("What do you mean, there's more than one type of negation?!")
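
For the curious, the two negations in clingo syntax, via its Python API -- a minimal sketch of the classic train-crossing example from the ASP literature, nothing official:

    import clingo

    program = """
    -train.               % classical negation: we can PROVE there is no train
    cross :- -train.      % safe: cross only given proof of no train
    risky :- not train.   % default negation: merely no evidence of a train
    """

    ctl = clingo.Control()
    ctl.add("base", [], program)
    ctl.ground([("base", [])])
    ctl.solve(on_model=print)  # answer set: -train cross risky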


What problems did you solve with ASP? I'd like to learn more about it, but I struggle with what problem to start with.


The "Easy ASP" [0] tutorial from Potassco can walk you through some examples, if you'd like.

The playlist is aimed at a general scientific/business audience; the presenter suggests that a lot of natural and business systems can be described in this manner. The presenter also mentions how a Clingo program was used, without modification, to optimize radio frequency band allocation.

Here's a repository [1] of ASP programs in clingo. Under problem classes, I see mostly: game AIs, graph problems, various puzzles, and so on.

[0] https://www.youtube.com/playlist?list=PL7DBaibuDD9O4I05DiQfi...

[1] https://asparagus.cs.uni-potsdam.de/


> ... modern SAT solvers are fast, neat and criminally underused by the industry.

Modern SAT solvers are fast, neat and criminally undertaught by the universities. Seriously, why isn't this taught at the undergraduate level in CS70 [1] discrete math type courses or CS170 [2] analysis of algorithms type courses?

[1] https://www2.eecs.berkeley.edu/Courses/CS70/

[2] https://www2.eecs.berkeley.edu/Courses/CS170/


This is… not true. CS170 specifically teaches about reducing NP problems to SAT (you can find this in the Algorithms textbook linked in the class syllabus). I recall solving one of the projects by using MiniSat after converting a problem to 3-SAT. FWIW, the textbook is excellent and the course was very useful.


I definitely recall doing reductions from SAT in Algorithms courses. I think that is a common part of most curricula.

I don't recall being taught any practical uses of SAT. It was introduced only in the context of Cook's theorem, as the problem you needed to reduce to other problems in order to show NP-completeness.

I think most people now learn SAT in that theoretical context, not as a tool to solve problems.


> I definitely recall doing reductions to SAT in Algorithms courses.

> It was introduced only in the context of Cook's theorem, as the problem you needed to reduce other problems to in order to show NP-completeness.

Are you referring to reductions from SAT, or to SAT? You seem to be mentioning both?


Yep, from SAT. Edited. Guess I typed that too fast :)


To clarify, you're specifically talking about reductions to SAT, not from SAT, right?

Note the former is used as a solution technique, feeding problems into SAT solvers, whereas the latter's goal is basically the exact opposite (to show NP-hardness and hence algorithmic intractability). Formal methods courses do the former, but algorithms courses usually use SAT for the latter.
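
To make the "former" concrete, the workflow is: encode your problem as CNF clauses and let the solver do the search. A hedged sketch, assuming the pycosat package, that 2-colors a path graph:

    import pycosat  # pip install pycosat

    # Variable i is True iff vertex i is "red" (two colors, so one boolean
    # per vertex). For each edge, the endpoints must differ:
    edges = [(1, 2), (2, 3)]
    cnf = []
    for a, b in edges:
        cnf.append([a, b])    # not both "blue"
        cnf.append([-a, -b])  # not both "red"

    print(pycosat.solve(cnf))  # a model like [1, -2, 3]: red, blue, red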


In the particular project I was talking about (see https://inst.eecs.berkeley.edu/~cs170/sp20/assets/project/sp... for a similar project) a solution is to reduce the problem to SAT and feed it into a SAT solver. Outside of the project, we were also taught the other way around (demonstrate NP-hardness). See https://people.eecs.berkeley.edu/~vazirani/algorithms/chap8.....


Is the first use of former/latter consistent with the second?


I think so? Which part of it sounds inconsistent?


Should 'formal methods courses' go with 'proving NP-completeness' and 'algo courses' go with 'using SAT solvers?'


No. Algorithms courses focus on computability and complexity (including NP-completeness); they don't generally focus on SAT solving. Formal methods are the ones that use SAT solving, SMT solving, etc. to formally prove correctness.


Ah, thank you. We had theory classes with automata and reductions and complexity proofs, and then algorithms classes that covered some solving techniques. I think I mixed up Formal Methods with Theory.


I used to teach formal methods at university, including a course with a lot of SAT examples. We tried to make it as practical as possible, with many examples and exercises in Java (where you just generate your formulas and call a solve method). Thing is, most students seemed to hate it passionately, just like with reductions to SAT in complexity theory.


I wrote a bespoke backtracking solver for a specific class of problems. Would love to use Z3 or something, but frankly, I wouldn't know how to systematically translate problem instances to solver constraints. It's essentially a kind of very complex job-shop scheduling problem with lots of dynamic inter-dependencies. Many of the problems are hard to even solve without deadlocking, while we also naturally strive to minimize overall processing time. Where would I find resources to help me get a grip on my specific problem [1]? Could I reasonably hope that Z3 or another general solver might be faster than a moderately well designed bespoke solver in a compiled language? (My solver combines a bunch of greedy heuristics with a backtracker in case it runs into a deadlock, which is often.)

[1] Rough problem outline: Input goods must be transformed through a complex series of assembly/machining/processing lines into output goods; each line transforms one or two inputs into an (intermediary or end-) product; an assembly line produces its product in a fixed number of time units and cannot be stopped or delayed; finished intermediary products arrive at the end of an assembly line, which must be cleared by then; there are a limited number of temporary storage spaces where intermediary products can be moved to/from in a fixed number of time units; some assembly lines must wait for two intermediary products to be completed to start a job combining them into another intermediary or end product; end products must then be moved to their destinations.


The pdf linked on this site is the biggest collection of SMT examples I know of: https://sat-smt.codes/

I'm no SMT expert, but the way I've done it is to make some representation in Z3's Python API, and then write some function or class that generates those constraints. I was solving MLB eliminations (more complex than it sounds) and I think I used arrays of ints for numbers of wins. So I'd pull MLB data and turn that into schedule objects which turned themselves into Z3 constraints.
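
Roughly this shape, in case it helps -- a simplified sketch with z3py, not my actual code (team names are invented, and it relaxes the per-game pairing to keep it short):

    from z3 import Int, Solver, sat

    teams = ["A", "B", "C"]
    current = {"A": 80, "B": 78, "C": 79}    # wins so far
    remaining = [("A", "B"), ("B", "C")]     # games left on the schedule

    wins = {t: Int(f"wins_{t}") for t in teams}  # final win totals
    s = Solver()
    for t in teams:
        left = sum(1 for g in remaining if t in g)
        s.add(current[t] <= wins[t], wins[t] <= current[t] + left)
    # every remaining game awards exactly one win in total:
    s.add(sum(wins[t] - current[t] for t in teams) == len(remaining))
    # can team C still finish at least tied for first?
    s.add(*[wins["C"] >= wins[t] for t in teams])
    print(s.check())  # sat means C is not yet eliminated (in this relaxation)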


This type of problem is more the domain of Constraint Programming (a related field); job-shop scheduling problems are pretty much its main focus. I would look up MiniZinc (it even comes with a nice IDE) and see if you can grok it.
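
If you'd rather stay in Python, Google's OR-Tools CP-SAT solver covers similar ground. A hedged two-task sketch (durations and the precedence are invented; a real model would add the storage moves):

    from ortools.sat.python import cp_model  # pip install ortools

    model = cp_model.CpModel()
    horizon = 20

    # Two fixed-duration jobs on the same line; job B consumes A's output.
    a_start = model.NewIntVar(0, horizon, "a_start")
    a = model.NewIntervalVar(a_start, 3, a_start + 3, "a")  # 3 units, no preemption
    b_start = model.NewIntVar(0, horizon, "b_start")
    b = model.NewIntervalVar(b_start, 5, b_start + 5, "b")

    model.AddNoOverlap([a, b])         # the line runs one job at a time
    model.Add(b_start >= a_start + 3)  # precedence: B needs A's product

    makespan = model.NewIntVar(0, horizon, "makespan")
    model.Add(makespan >= b_start + 5)
    model.Minimize(makespan)

    solver = cp_model.CpSolver()
    if solver.Solve(model) == cp_model.OPTIMAL:
        print(solver.Value(a_start), solver.Value(b_start), solver.Value(makespan))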


What example use-cases did you use? Just curious.


We've had various examples like these:

- Puzzles: Sudoku, Str8ts

- Bridge crossing: Missionaries and cannibals, 17 minute bridge crossing (I need a computer to solve this anyway)

- Concurrency: finding a bug in a mutual exclusion algorithm (Peterson's algorithm with a bug)

- Graphs: coloring a map, finding a Hamiltonian path

- Sorting: Finding bugs in a sorting network

For many students it was difficult to encode problems in SAT. They seemed to understand given example encodings, but then found it difficult to vary them. There is a lot of freedom in how one may encode things, and it's hard at first to debug when things don't work the way you're expecting. If there is no solution, then one needs to investigate where there are unwanted constraints or errors in the encoding. If there are unwanted solutions, one needs to identify the missing constraints. It was hard to get across how to do this and it's probably frustrating for beginners.
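
One thing that helped a little with the "no solution" case: solvers like Z3 can name the conflicting subset of constraints for you. A hedged z3py sketch (cores aren't guaranteed minimal):

    from z3 import Bool, Implies, Not, Solver, unsat

    a, b = Bool("a"), Bool("b")
    s = Solver()
    s.assert_and_track(a, "rule1")              # label each assertion
    s.assert_and_track(Implies(a, b), "rule2")
    s.assert_and_track(Not(b), "rule3")
    if s.check() == unsat:
        print(s.unsat_core())  # e.g. [rule1, rule2, rule3] -- the culprit set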


Both my alma maters had courses that used these extensively, and also planners like (Pl|Tr|L)ingeling. We also covered reducibility and SAT in multiple courses in both.

These should also be taught in an advanced PL course, e.g. ones covering liquid, dependent, and similar type systems.


We learned SAT solvers and got to implement one in one of our courses. Completely forgot which one it was, though.


I seem to recall that a poorly scaling SAT solver in conda-forge broke so badly in 2020 that it shifted the tectonic plates underneath the entire academic Python ecosystem away from conda and towards pip.

    Solving Environment | / - | \


That was always a bit of a red herring, from my understanding. Yes, if you poorly model something into an ad hoc SAT solver, expect slowness.

Which is a bit of the general idea of these being underused. If you can get your problem into a SAT form or three, then feed it to a state-of-the-art solver, it can work amazingly well. But you will be spending a lot of time moving to and from the SAT formulation.


Do you know of any python dependency managers that do this?


dnf is (or rather, was) written in Python and uses a SAT solver to solve dependencies for package installs in Fedora.


I don't. That said, I think the problems are typically small enough that you don't gain much by hunting for a good SAT formulation? Python dependency resolution doesn't do anything that other dependency managers don't. (Does it?)


DNF uses a SAT solver. It’s even listed first among the motivations for creating DNF:

> DNF is a fork of Yum 3.4 that uses libsolv via hawkey for a backend. The main goals of the project are:

> * using a SAT solver for dependency resolving

> ...

https://fedoraproject.org/wiki/Features/DNF


Fun, I'll have to look at that. The major implication, though, is that yum does the same thing without an explicit SAT formulation. Right?

Edit: I will note that the linked doc is silly old. And it seems that the original proposed replacement for yum was hawkey, and is deprecated? But DNF is still steaming ahead? I didn't find any obvious links talking about performance. Did the SAT idea pan out? I'd almost think it was a bust, with some of these wikis. :(


Yes, DNF has replaced Yum by now and definitely uses SAT internally. Here's a link that's probably even older, but more informative:

https://en.opensuse.org/openSUSE:Libzypp_satsolver


Any evidence that the SAT formulation helps over the choice of not-Python? My hunch is it was also somewhat naive Python?


I'm not sure what kind of evidence that would be. Version selection is NP-complete, so there is no known algorithm that efficiently solves all problem instances.

You can spend time looking really hard at the problem instances you have and identifying common patterns, and write a complex algorithm that works well as long as the dependencies you are trying to solve at least sort-of follow these patterns. This usually works well until it fails completely, at which point you can look really hard for new patterns in new use cases, and make your complex algorithm handle those as well.

But there's also the option of turning your problem into a SAT (or answer set programming, or constraint programming, or integer programming, etc.) formulation, using an existing SAT solver, and not having to write any complex algorithm at all.
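
The core of such an encoding is small enough to sketch. Hedged z3py version with invented package names (real tools like libsolv work at the CNF level, but the shape is the same):

    from z3 import And, Bool, Implies, Not, Or, Solver, sat

    # One boolean per (package, version): True means "install it".
    a1, a2 = Bool("A-1.0"), Bool("A-2.0")
    b1, b2 = Bool("B-1.0"), Bool("B-2.0")

    s = Solver()
    s.add(Or(a1, a2))               # the user asked for some version of A
    s.add(Not(And(a1, a2)))         # at most one version of A installed
    s.add(Not(And(b1, b2)))         # at most one version of B installed
    s.add(Implies(a1, b1))          # A-1.0 depends on exactly B-1.0
    s.add(Implies(a2, Or(b1, b2)))  # A-2.0 accepts any version of B
    if s.check() == sat:
        print(s.model())            # a consistent install plan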


Evidence that it is faster than non-SAT formulations? So, npm? Whatever Go does? Poetry/pip? Java/Maven? Or have all of those migrated to SAT?


yum used ad-hoc Python code. It was also incredibly slow and often couldn't find a solution.


Isn't mamba basically conda "with a faster solver"?


libsolv is underneath SUSE and some other Linux distributions. I think conda at some point switched to it, too.



From this and Dart, I think one of the lessons here is that SAT solvers are the wrong technique for solving dependencies. SAT solvers find “better” solutions by some metric, but humans using package managers want something which is both simpler and faster.


As an example:

Schaefer's dichotomy theorem shows that, when possible, you should use Horn clauses.

Takes a bit of thinking, but it is superior to heuristics or general SAT solving if you can refactor to allow for it.
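
To see why it pays off: Horn satisfiability needs no search at all, just forward chaining to a fixpoint. A minimal sketch in plain Python (the naive quadratic version; linear-time algorithms exist):

    def horn_sat(facts, rules, constraints):
        """facts: unit positive clauses; rules: (body_set, head) pairs;
        constraints: bodies of all-negative clauses. True iff satisfiable."""
        derived, changed = set(facts), True
        while changed:
            changed = False
            for body, head in rules:
                if body <= derived and head not in derived:
                    derived.add(head)
                    changed = True
        # unsatisfiable iff some constraint has its entire body derived
        return not any(body <= derived for body in constraints)

    # a, (a -> b), (not a or not b)  =>  unsatisfiable
    print(horn_sat({"a"}, [({"a"}, "b")], [{"a", "b"}]))  # False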


Not sure if it's related to Schaefer's theorem, but I dove into Answer Set Programming [1] recently, which follows this approach, enabling the use of fast-ish SMT solvers, which are a generalization of SAT solvers! Boolean/propositional logic is to predicate logic as SAT is to SMT. There's a very nice course about it from the developers of Potassco, one of the best open source Answer Set Programming frameworks [2].

The syntax looks like Prolog, but predicate negation is a first-class citizen, which avoids infinite loops.

Prolog's approach is like a depth-first search through a search tree -- ASP is like a nondeterministic Turing machine, exploring all branches simultaneously from the bottom up.

[1] https://en.wikipedia.org/wiki/Answer_set_programming

[2] https://www.youtube.com/playlist?list=PL7DBaibuDD9O4I05DiQfi...


Wait, this is very relevant to some work I’ve been doing recently, how do you conclude that Horn clauses should be preferred from Schaefer’s theorem?


Take a look at https://en.wikipedia.org/wiki/Boolean_satisfiability_problem... (based on Schaefer's "The complexity of satisfiability problems"). Horn clause satisfiability (HORN-SAT) is solvable in polynomial time (it is in fact P-complete).


Oh right this is just Horn clauses, not CHCs


Isn't it just that Horn clauses are easy to understand, and they are guaranteed to be fast?


Dart as in the language? Dart uses a SAT solver: https://github.com/dart-lang/pub/blob/master/doc/solver.md


My gut is that was the point? That Dart uses a SAT solver to no discernible advantage.

I will also note that this amuses me to no end. If you have enough dependencies that you need the speed of a SAT solver.... how many dependencies do you have? And why are they changing so dang much?


The ship of complexity has long sailed for many projects.


I imagine most individual projects are still low enough in dependencies that you still have to try for that to be the slow part.


Conda is still unbearably slow. Mamba is a vastly better mostly drop-in replacement.


Second this. Not only is it faster, but the error messages in Mamba are much more helpful and sane.


Great! Conda honestly can't die fast enough.


Curious to hear about your preferred alternative. Poetry?


Poetry, pip, nix or sending out butterflies to bend cosmic rays to write bits into memory. It really doesn't matter as long as it's not the hellscape that is conda.


spack


The problem was not using a SAT solver, but how they used it.

Many package dependency managers have used SAT solvers since SUSE spearheaded this in 2007/2008 with libsolv, which is blazingly fast for large repositories.

https://research.swtch.com/version-sat


Also there had been a growing trend for most popular packages to offer precompiled wheels on PyPI instead of just sdist releases.

This meant that people who had moved to Conda because they couldn't get Pip to install important packages into their environment took another look and found that actually they could get things installed using Pip now.

At the same time Pip also got a resolver, allowing you to have install-time confidence that you're not installing conflicting packages, and recently (Pip 23.1+) the resolver's backtracking got pretty good.

That said, Conda mostly solved this (and once mamba is the default resolver engine things will be really fast), Pip is not ever going to be a package manager, Poetry still isn't an environment manager, and most other Python package/installer alternatives to Conda won't do things like install your JupyterLab's nodejs dependency.

After many years I now almost exclusively use Pip to install into an environment, but still nothing beats Conda for bootstrapping the non-Python-package requirements (such as Python itself), nor for getting things working when you are in a weird environment that you can't install OS dev libraries into.


Is Conda actually moving towards making mamba the default? Last I heard, they were distinctly uninterested in that, since mamba is implemented in C++, and they would rather rely on their own slow Python code, which they can more easily modify.


Yes they are; it's been integrated and stable in conda since last year, and you can turn it on with a solver config setting: https://www.anaconda.com/blog/a-faster-conda-for-a-growing-c...


Compiling Scala without a SAT solver is probably too difficult.

The CNF Converter is a gem.

https://github.com/scala/scala/blob/v2.13.5/src/compiler/sca...


Can you expand a bit on why / which bits of Scala compilation this is used for?


It is used for pattern matching.

I don't know anything about the Scala compiler. A few years ago I needed a CNF Converter and I ripped their Logic Module.

(Performant CNF converters are harder to find than SAT solvers.)
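
If anyone else needs one: the standard approach is the Tseitin transformation, which stays linear in formula size by naming each gate with a fresh variable instead of distributing ORs over ANDs. A minimal, hedged sketch in Python (binary and/or plus not only):

    import itertools

    fresh = itertools.count(1001)  # gate vars; inputs assumed to be 1..1000

    def tseitin(node, clauses):
        """node: int literal, ("not", x), ("and", x, y) or ("or", x, y).
        Returns a literal equivalent to node, appending defining clauses."""
        if isinstance(node, int):
            return node
        op, *args = node
        lits = [tseitin(arg, clauses) for arg in args]
        if op == "not":
            return -lits[0]
        g, (a, b) = next(fresh), lits
        if op == "and":  # clauses encoding g <-> (a and b)
            clauses += [[-g, a], [-g, b], [g, -a, -b]]
        else:            # clauses encoding g <-> (a or b)
            clauses += [[g, -a], [g, -b], [-g, a, b]]
        return g

    clauses = []
    root = tseitin(("or", ("and", 1, 2), ("not", 3)), clauses)
    clauses.append([root])  # finally, assert the formula itself
    print(clauses)          # equisatisfiable CNF, ready for any SAT solver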


Nice idea! The pattern matching compiler/optimizer in OCaml doesn't do this. It's implemented using this algorithm, which I've attempted to understand a few times but which is a bit beyond me:

Fabrice Le Fessant, Luc Maranget, Optimizing Pattern-Matching ICFP'2001 http://pauillac.inria.fr/~maranget/papers/opat/


Just wanted to shoutout Armin Biere, one of the top contributors in this field: https://github.com/arminbiere

He has a few open source SAT solvers and tools that provide good, proven examples of modern SAT solver techniques.


Conda uses a SAT solver. It is still very slow on degenerate cases and I’m not sure if work to replace it with Microsoft’s SAT solver has started.

https://www.anaconda.com/blog/understanding-and-improving-co...



Has there been any effort to formalise the subset of NP that lends itself to SAT resolution (is there something between polynomial n^k and exponential k^n)?

For example, what are the defining characteristics of graphs for which the travelling salesman problem is resolvable without resorting to brute force?


Related post (and highly recommended!) from yesterday:

The Silent (R)evolution of SAT https://news.ycombinator.com/item?id=36079115


SAT? I had to look it up, so...

Boolean satisfiability problem

https://en.wikipedia.org/wiki/Boolean_satisfiability_problem

"In logic and computer science, the Boolean satisfiability problem (sometimes called propositional satisfiability problem and abbreviated SATISFIABILITY, SAT or B-SAT) is the problem of determining if there exists an interpretation that satisfies a given Boolean formula. In other words, it asks whether the variables of a given Boolean formula can be consistently replaced by the values TRUE or FALSE in such a way that the formula evaluates to TRUE."


Ah, I initially thought that these were solving SAT questions. Easy mistake to make, or perhaps it's just me.


No, I also had no idea about this type of problem and immediately thought this article was about the placement tests administered in the US.


> As an example, the often-talked-about dependency management problem, is also NP-Complete and thus translates into SAT[2][3], and SAT could be translated into dependency manager.

This reminds me of make [0] and of being made aware that make is a SAT solver. I think it was when I attended a conference. Unfortunately I cannot find an authoritative source to quote, so I will rely on, and be grateful to, the HN community to correct me should this be wrong.

0 - https://www.gnu.org/software/make/


FWIW, here's a little console-mode puzzle game of SAT problems, if you want to solve some manually. The "board" is not exactly like the example table in the post, since that one was for Sudoku in particular. This grid represents variables as rows and clauses as columns.

https://github.com/darius/sturm/blob/master/satgame.py (Python 2)


This brings back memories. In the early 2000s, I wrote my undergrad thesis on a survey of SAT solving techniques. I believe the most capable general approach at the time was DPLL, which used backtracking. My key insight at the time was that if you knew the "shape" of the SAT problem (you had domain-specific insight) then you could take some shortcuts with a custom algorithm, e.g. knowing a certain clause is always false and reducing the search space.
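
For reference, the core of DPLL still fits on a napkin -- a minimal sketch, just unit propagation plus branching, with none of the clause learning modern CDCL solvers add:

    def dpll(clauses, assignment=()):
        """clauses: list of lists of signed ints. Returns a model or None."""
        units = [c[0] for c in clauses if len(c) == 1]
        while units:  # unit propagation
            lit = units[0]
            assignment += (lit,)
            clauses = [[l for l in c if l != -lit] for c in clauses if lit not in c]
            if any(not c for c in clauses):
                return None  # derived the empty clause: conflict
            units = [c[0] for c in clauses if len(c) == 1]
        if not clauses:
            return assignment  # every clause satisfied
        lit = clauses[0][0]    # branch on some literal, both ways
        return (dpll(clauses + [[lit]], assignment)
                or dpll(clauses + [[-lit]], assignment))

    print(dpll([[1, 2], [-1, 2], [-2, 3], [-3]]))  # None -> unsatisfiable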


But what do you mean by "fast"? If your problem ends up on the steep side of the exponential curve, it's going to take a while to solve.

I had a lot of fun making my own CDCL solver in Rust, and I've really enjoyed messing with Z3 for some theoretical computer science. In all of my explorations, there was a very tangible problem size beyond which the solve time was unusable.

In the case of Z3 with most real world problems, the typical problem size is beyond this limit.


Z3 is actually not a particularly good SAT solver; you really want to use a dedicated tool for pure SAT problems. On the other hand, if your problem is in a richer class like QBF or SMT, then Z3 shines, and often you can use encoding tricks to scale problems significantly.


There was a time when people thought SAT and formal logic were the way to build AI. Now you don't hear anything about it. I wonder what happened?


The "AI Winter" was largely caused by people realizing building better logic, chess, or similar analytical engines proves to be a poor model for human like intelligence. The current renaissance is due to Machine Learning / Deep Learning based essentially on statistical models.

In the specific context of language there was a famous debate between Chomsky and Norvig that touches on these themes: http://norvig.com/chomsky.html

I believe events of recent years have not been kind to Chomsky's side of this debate. I'm less bullish on large language models turning into AGI than many people here, but I think if we do develop AGI it's a certainty it will be based on probabilistic models, not logically consistent formalisms alone.


But also not probabilistic models alone, that's the point.


It requires large amounts of formalised, human-defined, domain-specific knowledge for every domain that you work with.

The overheads are huge, and it's very bad at dealing with fuzzy situations.


And they've only evolved since then! Take a look at SAT COMP to see the year-on-year evolution of the field.


I really love the clarity + practicality of this article. Super well-written.


Doesn't Rust use a SAT solver for aspects of its type system?



