Subsumption and Subsumption Resolution via SAT solving #546

RobCoutel · 2024-04-19T12:35:46Z

In this pull request, we completely replace the code of the subsumption and subsumption resolution module with an implementation that encodes these problems for a SAT solver.

The papers:

2022: "First-Order Subsumption via SAT Solving." by Jakob Rath, Armin Biere and Laura Kovács
2023: "SAT-Based Subsumption Resolution" by Robin Coutelier, Jakob Rath, Michael Rawson and Laura Kovács
2024: "SAT Solving for Variants of First-Order Subsumption" by Robin Coutelier, Jakob Rath, Michael Rawson, Armin Biere and Laura Kovács

explain the details of the encodings as well as provide proofs of their soundness.

The most important parts of the PR are the following:

"SATSubsumption/subsat": A custom SAT solver implemented by Jakob supporting AMO constraints and substitution reasoning
"SATSubsumption/SATSubsumptionAndResolution.*": The implementation of clause-at-a-time subsumption and subsumption resolution
"Inferences/ForwardSubsumptionAndResolution.cpp": The loop for forward subsumption and subsumption resolution was optimized so that the setup of the SAT solver and the MatchSet is shared between subsumption and subsumption resolution
"Inferences/BackwardSubsumptionAndResolution.cpp": The backward loop was treated in the same way as the forward loop
"UnitTests/tSATSubsumptionResolution.cpp": A set of unit tests to check the soundness of our encoding
"Saturation/SaturationAlgorithm.cpp": We plugged in the new forward and backward loops in the saturation algorithm.
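
For readers who want a concrete picture of the problem being encoded, here is a minimal sketch of a subsumption check done by naive backtracking. It is not the SAT encoding from this PR and not Vampire code: `Lit`, `Subst`, `match` and `subsumes` are hypothetical names, terms are flattened to strings (upper-case initial means variable), and I am assuming the multiset reading, where distinct literals of the side premise must be matched to distinct literals of the main premise. The papers listed above describe the actual encodings.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified literal: polarity, predicate name, flat arguments.
// An argument starting with an upper-case letter is a variable, anything else
// is a constant. Real first-order terms are nested; this sketch ignores that.
struct Lit {
  bool positive;
  std::string pred;
  std::vector<std::string> args;
};

using Subst = std::map<std::string, std::string>;

static bool isVar(const std::string& s) {
  return !s.empty() && s[0] >= 'A' && s[0] <= 'Z';
}

// Try to extend sigma so that base under sigma equals inst.
// On failure sigma may hold partial bindings, so callers pass a copy.
static bool match(const Lit& base, const Lit& inst, Subst& sigma) {
  if (base.positive != inst.positive || base.pred != inst.pred ||
      base.args.size() != inst.args.size())
    return false;
  for (std::size_t k = 0; k < base.args.size(); ++k) {
    const std::string& a = base.args[k];
    if (isVar(a)) {
      auto res = sigma.emplace(a, inst.args[k]);
      // Variable already bound to something else: the bindings clash.
      if (!res.second && res.first->second != inst.args[k]) return false;
    } else if (a != inst.args[k]) {
      return false;
    }
  }
  return true;
}

// Match side[i..] against unused literals of mainCl under a common sigma.
static bool search(const std::vector<Lit>& side, const std::vector<Lit>& mainCl,
                   std::size_t i, std::vector<bool>& used, const Subst& sigma) {
  if (i == side.size()) return true;
  for (std::size_t j = 0; j < mainCl.size(); ++j) {
    if (used[j]) continue;          // each main literal is used at most once
    Subst next = sigma;             // copy, so a failed match leaves sigma intact
    if (!match(side[i], mainCl[j], next)) continue;
    used[j] = true;
    if (search(side, mainCl, i + 1, used, next)) return true;
    used[j] = false;                // backtrack
  }
  return false;
}

// True if some substitution maps every literal of side to a distinct literal of mainCl.
bool subsumes(const std::vector<Lit>& side, const std::vector<Lit>& mainCl) {
  std::vector<bool> used(mainCl.size(), false);
  return search(side, mainCl, 0, used, Subst{});
}
```

In this sketch, `p(X, Y) | q(Y)` subsumes `p(a, b) | q(b) | r(c)`, while `p(X, X)` does not subsume `p(a, b)`. The search enumerates which main-premise literal each side-premise literal is matched to; that choice, together with the compatibility of the resulting bindings, is the kind of constraint a solver-based approach has to express.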

MichaelRawson

There are some small changes that could be made, but in general I'm happy. First, congratulations - this is not a small piece of code! Some general points:

At some point in the future I could imagine that we might upgrade Minisat to CaDiCaL, which has user propagation - this could in principle replace the custom SAT solver needed for this, although I suspect it won't be that easy. But this is not something to think about now, just a vague future idea.
Some of the subsat stuff uses stuff that look like the Vampire versions, but aren't - e.g. calling assert rather than ASS. As we depend on invoking the crash routines reliably for e.g. running on remote servers, please ruthlessly hunt these down and replace them with the Vampire versions.
In general the ground shifted a bit during the development of this work, so some things are no longer true - e.g. we have C++17 now, the allocator is no longer compulsory, etc. Where this is obvious I've pointed it out - just to give some background.
Has this been tested for performance reasonably strenuously?
Has this been tested for weird option configurations and weird input problems? I'd be particularly careful about polymorphic inputs. If it doesn't crash on the whole TPTP, then good.

PS A small request - could formatting changes be separated out in future? In general I'm happy to merge formatting changes in Vampire, some of the existing indentation is unconscionable, but I don't want to read it while looking for significant changes.

.gitmodules

MichaelRawson · 2024-04-24T09:11:54Z

CMakeLists.txt

@@ -385,7 +385,7 @@ source_group(indexing_source_files FILES ${VAMPIRE_INDEXING_SOURCES})
 set(VAMPIRE_INFERENCE_SOURCES


This is all fine, but @quickbeam123 will appreciate it if you make it work with the Makefile-based build as well, he uses that. As you're not doing anything too strange it should be OK.

MichaelRawson · 2024-04-24T09:13:10Z

Indexing/LiteralMiniIndex.hpp

@@ -53,6 +56,7 @@ class LiteralMiniIndex
  unsigned _cnt;
  DArray<Entry> _entries;

+  // TODO: name is misleading, because "base" means something different when next to "instance"  (IteratorBase would already be better)


Agreed, but could TODO be now?

I am not sure what @JakobR meant here. I did not touch that code.

Indexing/LiteralMiniIndex.hpp

MichaelRawson · 2024-04-24T09:15:29Z

Indexing/LiteralMiniIndex.hpp

@@ -120,6 +124,23 @@ class LiteralMiniIndex
      }
      return false;
    }
+
+    template <class Binder>


Could I have a few comments here? What do the iterators do, approximately? What is the difference between this hasNext and the original?

Good question. @JakobR ?

SATSubsumption/SATSubsumptionAndResolution.cpp

RobCoutel · 2024-04-24T12:35:16Z

Here are my answers to your general comments.

"1. At some point in the future I could imagine that we might upgrade Minisat to CaDiCaL, which has user propagation - this could in principle replace the custom SAT solver needed for this, although I suspect it won't be that easy. But this is not something to think about now, just a vague future idea."

Yes, I agree. This is even one of the claims of the paper if I remember correctly. I am not sure when we will be able to put this purely engineering effort together. But this could be done in the future. Realistically, I am afraid this will remain on the todo list for a long time.
This would require some experimentation, but I am not sure CaDiCaL is the best solution, because it is tuned for very hard problems. In our case, we want the solver to be as fast as possible on small problems. Without further investigation, it is hard to know what the impact would be.
Note that the solver is actually not MiniSAT. It is its own thing, with support for the AMO and substitution constraints.
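
As background on the AMO (at-most-one) constraints mentioned here, the following sketch shows what a plain CNF solver would need if such a constraint were translated into clauses with the pairwise encoding. This is an assumption chosen for illustration: nothing in this thread says how subsat handles AMO internally, only that it supports it. `Clause`, `atMostOnePairwise` and `satisfies` are hypothetical names, and literals follow the DIMACS convention (positive integer for a variable, negative for its negation).

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

using Clause = std::vector<int>;  // DIMACS-style literals

// Pairwise encoding of "at most one of vars is true":
// for every pair (x, y), add the clause (not x or not y).
std::vector<Clause> atMostOnePairwise(const std::vector<int>& vars) {
  std::vector<Clause> out;
  for (std::size_t i = 0; i < vars.size(); ++i)
    for (std::size_t j = i + 1; j < vars.size(); ++j)
      out.push_back({-vars[i], -vars[j]});
  return out;
}

// Check a full assignment against a clause set.
// value[v] is the truth value of variable v; index 0 is unused.
bool satisfies(const std::vector<Clause>& cnf, const std::vector<bool>& value) {
  for (const Clause& c : cnf) {
    bool sat = false;
    for (int lit : c) {
      if ((lit > 0) == value[std::abs(lit)]) { sat = true; break; }
    }
    if (!sat) return false;
  }
  return true;
}
```

For n variables the pairwise encoding produces n(n-1)/2 binary clauses, so four variables already need six. Handling the constraint directly in the solver avoids generating these clauses, which matters when many small instances are set up in quick succession.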

"2. Some of the subsat stuff uses stuff that look like the Vampire versions, but aren't - e.g. calling assert rather than ASS. As we depend on invoking the crash routines reliably for e.g. running on remote servers, please ruthlessly hunt these down and replace them with the Vampire versions."

The subsat code was written entirely by @JakobR, and I have to admit that I did not give it a thorough look. I think the idea was to have a Vampire-independent module. I would like to have Jakob's opinion before making any significant change there.

"3. In general the ground shifted a bit during the development of this work, so some things are no longer true - e.g. we have C++17 now, the allocator is no longer compulsory, etc. Where this is obvious I've pointed it out - just to give some background.
4. Has this been tested for performance reasonably strenuously?"

In terms of practical changes, what should I update for the allocator? I assumed I could abstract it away, but it seems from this comment that I cannot. Could you give me a few pointers there?

As for performance, the empirical analysis done for the paper gave us a 36% increase in speed compared to the old implementation. However, we did not really test the portfolio mode, since I was told it is highly tuned to the old subsumption and subsumption resolution code.

"5. Has this been tested for weird option configurations and weird input problems? I'd be particularly careful about polymorphic inputs. If it doesn't crash on the whole TPTP, then good."

I do not think we checked with weird options. But we ran the benchmarks on the whole of TPTP and I don't think we found any errors there. If @JakobR could confirm, that would be nice.

"PS A small request - could formatting changes be separated out in future? In general I'm happy to merge formatting changes in Vampire, some of the existing indentation is unconscionable, but I don't want to read it while looking for significant changes."

Sorry about that. I had an automatic formatter that modified a lot of things without me realising it. I tried to manually undo most of it. I will pay more attention to it in the future. I hope it was not too much burden.

MichaelRawson · 2024-04-24T13:11:16Z

Note that the solver is actually not MiniSAT. It is its own thing, with support for the AMO and substitution constraints.

True, I should have put this more clearly. Agreed with your points about the engineering effort and CaDiCaL (potentially) being too heavyweight. For the sake of clarity: we do use Minisat internally for other purposes. We could also consider using subsat for other bits of Vampire and removing Minisat in the interest of "no more than one SAT solver at a time".

I also haven't read the subsat code in detail. I will trust @JakobR's considerable expertise and assume it works, otherwise we could be here for months.

In terms of practical changes, what should I update for the allocator? I assumed I could abstract it away, but it seems from this comment that I cannot. Could you give me a few pointers there?

No, all is well - you can indeed abstract it away. It's just that some things are no longer necessary: STLAllocator or USE_ALLOCATOR will no longer produce a crash if they are missing, for example. They might (but probably are not unless you have measured it) be a good idea for performance.

However, we did not really test the portfolio mode

Probably fine. I don't think many people do, it's too noisy.

I do not think we checked with weird options.

OK. You may find some bugs with random testing but beware of the existing bugs I didn't fix yet.

I hope it was not too much burden.

Not at all.

JakobR · 2024-04-24T13:40:06Z

Thanks for all the comments @MichaelRawson, and thanks @RobCoutel for preparing the PR! I can take care of the subsat-related suggestions (soon). I'll answer more detailed comments then as I go through them.

Some of the subsat stuff uses stuff that look like the Vampire versions, but aren't - e.g. calling assert rather than ASS. As we depend on invoking the crash routines reliably for e.g. running on remote servers, please ruthlessly hunt these down and replace them with the Vampire versions.

Yes, good point. The reason for this is that I tested subsat during development independently from Vampire on some SAT problems. This is also why there is a separate CMake file for it. I will try to integrate it into the main file.
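
One way to keep a module usable both standalone and inside a larger project is to route all checks through a single macro that is selected at compile time. The sketch below only illustrates that idea: `SKETCH_ASSERT`, `SKETCH_STANDALONE`, `sketchFail` and `checkedDivide` are hypothetical names, and the failure routine here simply throws, which is a stand-in and not a description of what Vampire's `ASS` or its crash routines do.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a project-wide failure routine. A real project
// would report the failure through its own crash handling instead of throwing.
[[noreturn]] inline void sketchFail(const char* cond, const char* file, int line) {
  std::ostringstream msg;
  msg << file << ":" << line << ": condition violated: " << cond;
  throw std::logic_error(msg.str());
}

#ifdef SKETCH_STANDALONE
// Standalone build: fall back to the standard assert from <cassert>.
#  define SKETCH_ASSERT(cond) assert(cond)
#else
// Embedded build: every failed check goes through the project's routine.
#  define SKETCH_ASSERT(cond) \
     do { if (!(cond)) sketchFail(#cond, __FILE__, __LINE__); } while (false)
#endif

// Example use inside module code: only the macro is mentioned, never assert directly.
int checkedDivide(int a, int b) {
  SKETCH_ASSERT(b != 0);
  return a / b;
}
```

With this arrangement, module code never calls `assert` directly, so switching between the standalone and the embedded behaviour is a matter of one compile-time definition.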

MichaelRawson · 2024-04-24T15:22:47Z

Thanks @JakobR! I wasn't sure if you had time for this, so it's great you can help us out. And of course there's no particular hurry - either we get this in for CASC (good news) or we don't (also good - then it gets tested for a year), so it's win/win.

quickbeam123 · 2024-05-02T12:20:44Z

Hi all, for the testing, it might be useful to merge master into this branch now, so that we know we are not diverging.

quickbeam123 · 2024-05-02T12:31:03Z

Is the only difference in Options.hpp just whitespace cleanup?

(Could you please avoid these in the future? I like to make things look nice as I move around too, but let's only do this manually and for the files you actually touch. If one hot-key press on your side makes reviewing (and git blaming) much harder for others in the future, I don't think it has been worth it overall.)

quickbeam123 · 2024-05-02T12:31:44Z

As @MichaelRawson suggested, would it be possible to make the old Makefile way of compiling/linking work here too?

JakobR · 2024-05-07T11:33:18Z

Hi @quickbeam123, I'm sorry but I will only get to this next week. If you want to merge it early I can send my changes in response to @MichaelRawson's comments in a separate PR.

quickbeam123 · 2024-05-07T13:29:48Z

No rush, @JakobR, I only quickly wrote down some "nice to have"s and will in the meantime go for a bit of vacation :)

JakobR added 30 commits March 22, 2021 10:04

notes

e7af63a

Make arrangement of subsat config more clear

0806f36

Add missing constraints to SR implementation (only the naive implementation for now)

ed69adb

Improve comment

7d712e0

Remove old stuff

4f44e8d

improve checking subsumption resolution correctness

702607f

Found bugs both in the old and the new implementation of subsumption resolution

0a06110

Notes on doing something like literalminiindex

ed71284

we need an indirection vampire variable -> array slot to make benchmarking fair

50d61f0

reason: the parser normalizes vampire variables to be contiguous from 0..n, so we would benchmark something that's better for the array-based method than it might be in a real run.

don't do SR for now

94ed196

don't check by default

bde5f66

Also benchmark the setup separately

0016f14

Add a switch to use SMT-(forward-)subsumption during solving

f91b6d7

For now, we cannot unlock the full potential of this because we still have to use the original forward subsumption resolution.

need to set aux

1961d87

oh no...

519f1e4

Show in output which algorithm is used

343f7ee

make it compile on ARM

aa10756

Upgrade google-benchmark

8aba705

mlmatcher rstats

69ea029

Merge branch 'master' into smt-subsumption

922f201

Various fixes due to merge

172f635

disable smt subsumption for data collection

13aaef5

fix linking stage

8c9c666

refactor slog to support subsumption resolution

bf5cbcf

Remove unused code

aa3cad2

no need to record aux for unit clauses

d7bc5fc

skip non-theory variables during theory propagation

49e01b6

disable

be99ee6

check is too early

e681b65

fix SR to not miss inferences (see also PR #214)

8a314c8

JakobR and others added 14 commits January 18, 2024 11:53

Add optional max_ticks limit

b604bff

implement cutoff for s/sr

bc1bdab

report number of cutoffs

f88b0e8

Add pruning stats

f1b9ee4

fix

4a9c661

Merge branch 'jakob-sat-sr' into robin_c-subsumption_resolution

ddfecd9

bug in the test

7975a6f

improved the indirect encodign by removing c_j encoding one single b_ij-

afd4652

Merge remote-tracking branch 'origin/HEAD' into robin_c-subsumption_resolution

da31ae2

fixed the compilation errors

4dc9408

still need to clean up all the benchamarking things

remove all benchmarking code

8f2607e

reverse some of the formating changes

ee9379f

forgot to remove one option

3f2d46b

improved the documentation

40fe242

RobCoutel requested review from JakobR, MichaelRawson and quickbeam123 April 19, 2024 12:35

Robin Coutelier added 2 commits April 19, 2024 14:40

add the missing headers

fd6fa06

remove extra line cause checks to crash

e732f12

MichaelRawson approved these changes Apr 24, 2024

implement revisions suggested by Michael

3cefe51
