Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Subsumption and Subsumption Resolution via SAT solving #546

Open
wants to merge 506 commits into
base: master
Choose a base branch
from

Conversation

RobCoutel
Copy link

In this pull request, we completely replace the code for the subsumption and subsumption resolution module by an implementation using a SAT solver to encode the problems.

The papers:

  • 2022: "First-Order Subsumption via SAT Solving." by Jakob Rath, Armin Biere and Laura Kovács
  • 2023: "SAT-Based Subsumption Resolution" by Robin Coutelier, Jakob Rath, Michael Rawson and Laura Kovács
  • 2024: "SAT Solving for Variants of First-Order Subsumption" by Robin Coutelier, Jakob Rath, Michael Rawson, Armin Biere and Laura Kovács

explain the details of the encodings as well as provide proofs of their soundness.

The most important parts of the PR are the following:

  • "SATSubsumption/subsat": A custom SAT solver implemented by Jakob supporting AMO constraints and substitution reasoning
  • "SATSubsumption/SATSubsumptionAndResolution.*": The implementation of clause at a time subsumption and subsumption resolution
  • "Inferences/ForwardSubsumptionAndResolution.cpp": The loop for forward subsumption and subsumption resolution was optimized to mutualize the setup of the SAT solver and the MatchSet used by the subsumption and subsumption resolution module
  • "Inferences/BackwardSubsumptionAndResolution.cpp": The loop was treated similarly as for forward subsumption and subsumption resolution
  • "UnitTests/tSATSubsumptionResolution.cpp": A set of unit tests to check the soundness of our encoding
  • "Saturation/SaturationAlgorithm.cpp": We plugged in the new forward and backward loops in the saturation algorithm.

benchmarking fair

reason: the parser normalizes vampire variables to be contiguous from
0..n, so we would benchmark something that's better for the array-based
  method than it might be in a real run.
For now, we cannot unlock the full potential of this because we still
have to use the original forward subsumption resolution.
Copy link
Contributor

@MichaelRawson MichaelRawson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some small changes that could be made, but in general I'm happy. First, congratulations - this is not a small piece of code! Some general points:

  1. At some point in the future I could imagine that we might upgrade Minisat to CaDiCaL, which has user propagation - this could in principle replace the custom SAT solver needed for this, although I suspect it won't be that easy. But this is not something to think about now, just a vague future idea.
  2. Some of the subsat stuff uses stuff that look like the Vampire versions, but aren't - e.g. calling assert rather than ASS. As we depend on invoking the crash routines reliably for e.g. running on remote servers, please ruthlessly hunt these down and replace them with the Vampire versions.
  3. In general the ground shifted a bit during the development of this work, so some things are no longer true - e.g. we have C++17 now, the allocator is no longer compulsory, etc. Where this is obvious I've pointed it out - just to give some background.
  4. Has this been tested for performance reasonably strenuously?
  5. Has this been tested for weird option configurations and weird input problems? I'd be particularly careful about polymorphic inputs. If it doesn't crash on the whole TPTP, then good.

PS A small request - could formatting changes be separated out in future? In general I'm happy to merge formatting changes in Vampire, some of the existing indentation is unconscionable, but I don't want to read it while looking for significant changes.

.gitmodules Outdated Show resolved Hide resolved
@@ -385,7 +385,7 @@ source_group(indexing_source_files FILES ${VAMPIRE_INDEXING_SOURCES})
set(VAMPIRE_INFERENCE_SOURCES
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is all fine, but @quickbeam123 will appreciate it if you make it work with the Makefile-based build as well, he uses that. As you're not doing anything too strange it should be OK.

@@ -53,6 +56,7 @@ class LiteralMiniIndex
unsigned _cnt;
DArray<Entry> _entries;

// TODO: name is misleading, because "base" means something different when next to "instance" (IteratorBase would already be better)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, but could TODO be now?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure what @JakobR meant here. I did not touch that code.

Indexing/LiteralMiniIndex.hpp Outdated Show resolved Hide resolved
@@ -120,6 +124,23 @@ class LiteralMiniIndex
}
return false;
}

template <class Binder>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could I have a few comments here? What do the iterators do, approximately? What is the difference between this hasNext and the original?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. @JakobR ?

SATSubsumption/SATSubsumptionAndResolution.cpp Outdated Show resolved Hide resolved
SATSubsumption/SATSubsumptionAndResolution.cpp Outdated Show resolved Hide resolved
SATSubsumption/SATSubsumptionAndResolution.cpp Outdated Show resolved Hide resolved
SATSubsumption/SATSubsumptionAndResolution.cpp Outdated Show resolved Hide resolved
@RobCoutel
Copy link
Author

Here find the answer of your general comments.

"1. At some point in the future I could imagine that we might upgrade Minisat to CaDiCaL, which has user propagation - this could in principle replace the custom SAT solver needed for this, although I suspect it won't be that easy. But this is not something to think about now, just a vague future idea."

Yes, I agree. This is even one of the claims of the paper if I remember correctly. I am not sure when we will be able to put this purely engineering effort together. But this could be done in the future. Realistically, I am afraid this will remain on the todo list for a long time.
This would require some experimentations, but I am not sure CaDiCaL is the best solution, because it is tuned for very hard problems. In our case, we want the solver to be as fast as possible on small problems. Without further investigation, it is hard to know what would be the impact.
Note that the solver is actually not MiniSAT. It is its own thing, but the AMO and substitution constraints.

"2. Some of the subsat stuff uses stuff that look like the Vampire versions, but aren't - e.g. calling assert rather than ASS. As we depend on invoking the crash routines reliably for e.g. running on remote servers, please ruthlessly hunt these down and replace them with the Vampire versions."

The subsat was entirely written by @JakobR I have to admit that I did not give it a thorough look. I think the idea was to have a Vampire independant module. I wuold like to have Jakob's opinion before doing any significant change there.

"3. In general the ground shifted a bit during the development of this work, so some things are no longer true - e.g. we have C++17 now, the allocator is no longer compulsory, etc. Where this is obvious I've pointed it out - just to give some background.
Has this been tested for performance reasonably strenuously?"

In terms of practical changes, what should I update for the the allocator? I assumed I could abstract it away, but it seems from this comment that I cannot. Could you give me a few pointers there?

As for performance, we have the empirical analysis made for the paper that gave us a 36% increase in speed compared to the old implementation. However, we did not really test the portfolio mode since I was told it is highly tuned with the old subsumption and subsumption resolution code.

"5. Has this been tested for weird option configurations and weird input problems? I'd be particularly careful about polymorphic inputs. If it doesn't crash on the whole TPTP, then good."

I do not think we checked with weird options. But we ran the Benchmarks on the whole of TPTP and I don't think we fond any error there. If @JakobR could confirm that would be nice.

"PS A small request - could formatting changes be separated out in future? In general I'm happy to merge formatting changes in Vampire, some of the existing indentation is unconscionable, but I don't want to read it while looking for significant changes."

Sorry about that. I had an automatic formater that modified a lot of things without me realising it. I tried to manually undo most of it. I will pay more attention to it in the future. I hope it was not too much burden.

@MichaelRawson
Copy link
Contributor

Note that the solver is actually not MiniSAT. It is its own thing, but the AMO and substitution constraints.

True, I should have put this more clearly. Agreed with your points about the engineering effort and CaDiCaL (potentially) being too heavyweight. For the sake of clarity: we do use Minisat internally for other purposes. We could also consider using subsat for other bits of Vampire and removing Minisat in the interest of "no more than one SAT solver at a time".

I also haven't read the subsat code in detail. I will trust @JakobR's considerable expertise and assume it works, otherwise we could be here for months.

In terms of practical changes, what should I update for the the allocator? I assumed I could abstract it away, but it seems from this comment that I cannot. Could you give me a few pointers there?

No, all is well - you can indeed abstract it away. It's just that some things are no longer necessary: STLAllocator or USE_ALLOCATOR will no longer produce a crash if they are missing, for example. They might (but probably are not unless you have measured it) be a good idea for performance.

However, we did not really test the portfolio mode

Probably fine. I don't think many people do, it's too noisy.

I do not think we checked with weird options.

OK. You may find some bugs with random testing but beware of the existing bugs I didn't fix yet.

I hope it was not too much burden.

Not at all.

@JakobR
Copy link
Member

JakobR commented Apr 24, 2024

Thanks for all the comments @MichaelRawson, and thanks @RobCoutel for preparing the PR! I can take care of the subsat-related suggestions (soon). I'll answer more detailed comments then as I go through them.

Some of the subsat stuff uses stuff that look like the Vampire versions, but aren't - e.g. calling assert rather than ASS. As we depend on invoking the crash routines reliably for e.g. running on remote servers, please ruthlessly hunt these down and replace them with the Vampire versions.

Yes, good point. The reason for this is that I tested subsat during development independently from Vampire on some SAT problems. This is also why there is a separate CMake file for it. I will try to integrate it into the main file.

@MichaelRawson
Copy link
Contributor

Thanks @JakobR! I wasn't sure if you had time for this, so it's great you can help us out. And of course there's no particular hurry - either we get this in for CASC (good news) or we don't (also good - then it gets tested for a year), so it's win/win.

@quickbeam123
Copy link
Collaborator

Hi all, for the testing, it might be useful to merge master into this branch now, so that we know we are not diverging.

@quickbeam123
Copy link
Collaborator

Is the only difference in Options.hpp just whitespace cleanup?

(Could you please avoid these in the future? I like to make things look nice as I move around too, but let's only do this manually and for the files you actually touch. If one hot-key press on your side, makes the reviewing (and git blaming) much harder for others in the future, I don't think it has been worth it overall.)

@quickbeam123
Copy link
Collaborator

As @MichaelRawson suggested, would it be possible to make the old Makefile way of compiling/linking work here too?

@JakobR
Copy link
Member

JakobR commented May 7, 2024

Hi @quickbeam123, I'm sorry but I will only get to this next week. If you want to merge it early I can send my changes in response to @MichaelRawson's comments in a separate PR.

@quickbeam123
Copy link
Collaborator

No rush, @JakobR, I only quickly wrote down some "nice to have"s and will in the meantime go for a bit vacation :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants