Autoresearch for SAT Solvers

(github.com)

64 points | by chaisan 3 hours ago

5 comments

  • stefanpie 1 hour ago
    Prof. Cunxi Yu and his students at UMD is working on this exact topic and published a paper on agents for improving SAT solvers [1].

    I believe they are extending this idea to EDA / chip design tools and algorithms which are also computationally challenging to solve. They have an accepted paper on this for logic synthesis which will come out soon.

    [1] "Autonomous Code Evolution Meets NP-Completeness", https://arxiv.org/abs/2509.07367

    • chaisan 25 minutes ago
      nice. EDA indeed one of the top applications of SAT
  • ktimespi 7 minutes ago
    sounds like AlphaDev [1] might be a better approach for a problem like this.

    [1] https://github.com/google-deepmind/alphadev

  • ericpauley 2 hours ago
    It should be noted that MaxSAT 2024 did not include z3, as with many competitions. It’s possible (I’d argue likely) that the agent picked up on techniques from Z3 or some other non-competing solver, rather than actually discovering some novel approach.
    • dooglius 37 minutes ago
      Is z3 competitive in SAT competitions? My impression was that it is popular due to the theories, the python API, and the level of support from MSR.
      • ericpauley 12 minutes ago
        Funnily, this was precisely the question I had after posting this (and the topic of an LLM disagreement discussed in another thread). Turns out not, but sibling comment is another confounding factor.
    • jmalicki 1 hour ago
      Or for that matter even from later versions of the same solvers that were in its training data!
      • ericpauley 1 hour ago
        True. I’d be curious whether a combination of matching comp/training cutoff and censoring web searches could yield a more precise evaluation.
  • balinha_8864 18 minutes ago
    interesting results but the eval methodology seems a bit optimistic
    • chaisan 3 minutes ago
      its just comparing the cost of the best solution found to the best known cost we had before. O(N). why optimistic?
  • gsnedders 1 hour ago
    What counts as “our cost”? How long it takes to find the MaxSAT?