Complete model #1019

Halbaroth · 2024-01-02T11:07:39Z

This PR fixes the issue #778. The suggested solution by this PR is as follows:

Collect user-declared symbols in D_cnf;
As the function make of the module D_cnf returns a list of
commands, we add a new command Decl in Frontend to declare
symbols;
The declared symbols are given to the SAT solver through a new
function declaration in sat_solver_sig;
These symbols are collected in vectors and we keep indexes in another
vector in order to manage properly the push/pop mechanism.
Finally the declared symbols are given to Uf and we create the empty
model with them.

Another annoying point comes from the current data structure used in
ModelMap. The type of ModelMap saves model values in the form of graphs.
A graph is a finite set of constraints but there is no appropriate
representation for the graph without constraints. Let's imagine we have
the input:

(set-option :produce-models true)
(set-logic ALL)
(declare-fun (Int) Int)
(check-sat)
(get-model)

We expect an output as follows:

(
  (define-fun f (_x Int) (@A as Int))
)

But Graph.empty cannot hold the abstract value @a. A naive solution
consists in using an ADT:

type graph =
| Empty of Id.typed
| G of Graph.t

But we'll add an extra indirection. Probably the best solution would be
to use a custom data structure instead of a map to store the constraints.
In this PR, we detect in ModelMap.add if the only constraint in the
graph is of the form something -> abstract value. In this case we
remove this constraint.

This solution isn't perfect as explained in issue #1018. The biggest
drawback of this solution is the fact we have to declare symbols in the
SAT API, otherwise models returned by get_model could be
incomplete.

bclement-ocp

This PR adds a mechanism to keep track of symbols that were declared, and ensure that all declared symbols are defined in generated models. The general design is sound, but I would like the implementation of ModelMap to be changed to no longer assume that expressions can't be abstract names. The unused DeclSet functors should also be removed.

bclement-ocp · 2024-01-02T11:29:21Z

src/lib/frontend/d_cnf.ml

@@ -53,6 +53,12 @@ module HT = Hashtbl.Make(
    let hash = Fun.id
  end)

+module DeclSet = Set.Make
+    (struct
+      type t = Id.typed


Why do we need to use the type here? Can we have multiple declarations with the same name but different types?

If this is because we need the type in Uf, this should be a map from identifiers to types rather than a set of typed identifiers.

Actually this doesn't seem to be used.

Sorry, I forgot to remove it ;)

src/lib/frontend/d_cnf.ml

bclement-ocp · 2024-01-02T11:32:40Z

src/lib/frontend/d_cnf.ml

+      let decl = match td with
+        | `Type_decl (td, _def) ->
+          mk_ty_decl td;
+          None
+
+        | `Term_decl td ->
+          Some (mk_term_decl td)
+      in
+      begin match decl with
+        | Some d ->
+          let st_loc = dl_to_ael dloc_file loc in
+          C.{ st_decl = Decl d; st_loc } :: acc
+        | None ->
+          acc
+      end


No need to build an option to destruct it immediately:

Suggested change

let decl = match td with

| `Type_decl (td, _def) ->

mk_ty_decl td;

None

| `Term_decl td ->

Some (mk_term_decl td)

in

begin match decl with

| Some d ->

let st_loc = dl_to_ael dloc_file loc in

C.{ st_decl = Decl d; st_loc } :: acc

| None ->

acc

end

begin match td with

| `Type_decl (td, _def) ->

mk_ty_decl td;

acc

| `Term_decl td ->

let d = mk_term_decl td in

let st_loc = dl_to_ael dloc_file loc in

C.{ st_decl = Decl d; st_loc } :: acc

end

bclement-ocp · 2024-01-02T11:38:00Z

src/lib/frontend/d_cnf.ml

          end
      in
-      aux [] dcl;
-      acc
+      aux [] dcl @ acc


Nit: acc could be passed as a new 3rd argument to aux here (to return instead of [] in the final case) so that we avoid building a list just to concatenate it to another one.

bclement-ocp · 2024-01-02T11:42:39Z

src/lib/reasoners/fun_sat.ml

+    declare_ids : Id.typed Vec.t;
+    declare_lim : int Vec.t;


There is no need to do this dance in Fun_sat which is used as a functional data structure, this can simply be a declare_ids : Id.typed list field that is updated functionally ({ env with declare_ids = id :: env.declare_ids }) in declare.

This was my first implementation and it didn't work. A functional solution would be to use a list of lists:

declare_stack: Id.typed list list

We have to pop/push this stack in Fun_sat.pop/Fun_sat.push.
I didn't adopt this solution because I was afraid that a list of lists isn't cache-friendly. I understand that you want to keep the environment of FunSAT as pure as possible.

Honestly, I believe we should pull up all the imperative stuff of FunSAT into Funsat_frontend and use a stack of environment of FunSAT to simulate the push and the pop functions. In this design, we'll use a vector for declarations in the imperative environment of Funsat_frontend.

I also tried to implement the feature in Funsat_frontend without modifying the environment of FunSAT but we have no access to the right environment of Theory here.

Ah, right, push and pop forces us to be imperative. Oh well. I guess the way to do this while staying consistent with the current Fun_sat module is Id.typed list Stack.t rather than Id.typed list list then.

I didn't adopt this solution because I afraid that a list of lists isn't cache-friendly.

You did write too much C++ :) A list of lists (or a stack of lists, which is the same thing) is very probably not going to be an issue here because we never actually look at the lists until model generation (and also because this only impacts calls to push and pop, which is virtually 0% of Alt-Ergo's runtime).

In this design, we'll use a vector for declarations in the imperative environment of Funsat_frontend.

If we use Funsat_frontend for push and pop we don't need a separate vector for declarations in Funsat_frontend, we can simply have declare_ids : Id.typed list in Funsat which will get tracked by the main stack in the frontend, I believe.

I agree to use Id.typed list Stack.t :)

bclement-ocp · 2024-01-02T11:44:40Z

src/lib/reasoners/satml_frontend.ml

+    declare_ids : Id.typed Vec.t;
+    declare_lim : int Vec.t;


It would be simpler to use a vector of lists here:

mutable declare_ids : Id.typed list ; declare_queue : Id.typed list Vec.t ;

Uhm, Id.typed list Vec.tis a stack, not a queue in your suggestion, right?

I adopt this solution because it's the current style of SatML.

Yes it's a stack (for some reasons they are called "queues" elsewhere in satML but we can use the proper teminology ;) ).

So why do we need an extra field declare_ids here? The top element of the stack can be accessed in O(1). I believe

declare_stack: Id.typed list Vec.t

is sufficient.

Even if the algorithmic complexity is identical, the constant term is not the same; having a separate field here avoids having to check that the vector is not empty on all accesses. It probably does not matter here because we never look at the declarations except during model generation but for the other vectors in SatML which are accessed much more frequently it might matter.

It's fine to use a non-empty Vec.t here (or even a Stack.t from the stdlib actually).

I will use the current design with an extra field.

bclement-ocp · 2024-01-02T11:47:57Z

src/lib/structures/modelMap.ml

  let values = P.add sy (Graph.add arg_vals ret_val graph) values in
  { values; suspicious }

-let empty ~suspicious = { values = P.empty; suspicious }
+module DeclSets = Set.Make


This doesn't seem used.

bclement-ocp · 2024-01-02T11:49:00Z

src/lib/structures/modelMap.ml

+         it means there is no constraint on this graph. We replace it by
+         the graph with the only constraint given by [arg_vals] and
+         [ret_val]. *)
+      if Graph.cardinal graph == 1 then


Don't use == to compare integers.

Oups, I did too much C++ lately :)

bclement-ocp · 2024-01-02T12:16:24Z

src/lib/structures/modelMap.ml

+         let arg_vals = List.map Expr.mk_abstract arg_tys in
+         let ret_val = Expr.mk_abstract ret_ty in
+         let graph = Graph.add arg_vals ret_val Graph.empty in
+         P.add sy graph values


It looks like you are using a graph with a single "dummy" definition here to then detect it in add and fix things up if the graph actually has definitions. This is quite hacky in part because it introduces a global invariant that abstract values are never otherwise placed into graphs — I believe it is true currently (although I actually had to check in several places to say so) but could easily cause subtle bugs in the future due to unrelated refactorings.

Your suggestion in the PR message to use an ADT (or a record carrying an explicit default value) would be better here; I don't think the extra indirection will be a problem (especially given that it will only impact model generation).

I know this is a hackish solution. I think the best solution would be to add a function to generate dummy values. If you run cvc5 or z3 on an input file with completely unused symbols, they generate dummy values for them. I didn't do it in this PR because I want a small modification here in order to implement (get-value).

Besides, I'm not sure to understand what you mean by a global invariant here. The add replaces a valid model by another valid model and you don't need to maintain an invariant outside ModelMap because the only way to add new constraints in your graph is by using ModelMap.add.

I agree the indirection isn't a big deal.

I'm not sure to understand what you mean by a global invariant here. The add replaces a valid model by another valid model and you don't need to maintain an invariant outside ModelMap because the only way to add new constraints in your graph is by using ModelMap.add.

I mean that in the current implementation it is incorrect to call ModelMap.add with a ret_val that is an abstract value:

The graph is initially { [@a0] -> @a1 } as created by empty

Adding the mapping 0 -> @x, the graph becomes { [0] -> @x } because it only contains a single abstract value

Adding the mapping 1 -> @y, the graph becomes { [1] -> @y } because it only contains a single abstract value

But this is incorrect and now we will print (define-fun f ((_ Int)) T @y) but (f 0) must be @x, not @y (abstract values with different names are distinct).

Uhm right, it should be fine with the current state of the code but it's a fragile design. I'll fix it.

bclement-ocp · 2024-01-02T12:47:24Z

tests/models/complete_models.models.smt2

I can't really determine whether this file is intended to test model generation or the interaction of model generation with incremental solving.

In general it is better to have smaller tests so that it is easier to understand what they are actually testing — for instance, is it important to have x and z in the last get-model here, or are we actually testing model generation for functions?

I'd prefer if you could split this test into smaller files, ideally with a single (check-sat) per problem file.

(As an aside: are we currently guaranteed to have the same abstract values in two consecutive calls to (get-model)?)

bclement-ocp

LGTM — just a couple nits and one question about popping too much but it is not really related to this PR per se.

bclement-ocp · 2024-01-03T09:52:58Z

src/lib/reasoners/fun_sat.ml

@@ -177,6 +177,12 @@ module Make (Th : Theory.S) = struct
    unit_facts_cache : (E.gformula * Ex.t) ME.t ref;
    last_saved_model : Models.t Lazy.t option ref;
    unknown_reason : Sat_solver_sig.unknown_reason option;
+
+    declare_top : Id.typed list ref;


Nit: unless there is a reason to pass the ref around, it is usually better to use mutable record fields rather than refs, both for performance (it is slightly more efficient, although as we have established it won't matter here) and to make the intent more clear, in particular that there will be no sharing of the field.

It was my first attempt of course for sake of performance but it doesn't work. It seems we need to share this field between several environments so we cannot avoid the double indirection with the current implementation.

Hm, that may sound like a bug? We do create new environments (copies) all the time, but the old ones should no longer be used (we overwrite the fields in Fun_sat_frontend). The pop issue should also no longer be relevant, because we can write to the mutable field here. Can you elaborate on what "it doesn't work" means?

I will check what's happened. When I tried using a mutable field instead of a reference, FunSAT keeps declarations of popped assertion levels. Then I switched to a reference (I only did this modification) and it magically worked.

Note that with a mutable field you need to set acc.declare_top <- Stack.pop acc.declare_tail not env.declare_top <- Stack.pop env.declare_tail in the pop function (which I'd guess is the issue given what you describe).

bclement-ocp · 2024-01-03T09:56:32Z

src/lib/reasoners/fun_sat.ml

+            try env.declare_top := Stack.pop env.declare_tail
+            with Stack.Empty ->
+              (* Happens if we perform more pops than pushes. *)
+              invalid_arg "Fun_sat.pop"


Can this actually happen in practice, or is there some mechanism to catch it before the fact? We shouldn't show a naked exception to the user in this case.

There is no mechanism to catch it if you use the SAT API with the library. But there is a mechanism to prevent it with the binary and this hero is Dolmen :)

I may create a new error in Error to manage this case?

An exception is appropriate for library use, thanks for checking!

bclement-ocp · 2024-01-03T09:59:59Z

src/lib/structures/modelMap.ml

+      (fun graph ->
+         match graph with
+         | C constraints -> C (Constraints.map (subst_in_term id e) constraints)
+         | Free a -> Free (subst_in_term id e a)


Suggested change

| Free a -> Free (subst_in_term id e a)

| Free _ -> graph

It was my first code but let's imagine you're using the library and you get a model with SAT.get_model. One of the values of the model is an abstract value. You may want to substitute it by something else. Sometimes, it will work because this is not a free graph, sometimes the function subst silently ignores your request.

My code isn't good because after doing this substitution, the user could add a new constraint to the graph but the current add function will erase the effect of the previous substitution.

The model is an output value of the solver, I don't think the use case where we allow people to alter the model after the fact is one that we want to support. If people want to change the model they need to add assertions to the problem and call solve again to get a new model.

Under that assumption, the semantics of the Free constructor is "this function is unconstrained", and the inner abstract value is a "dummy" name that is guaranteed to not occur anywhere else in the model (by construction). Given that semantics, we shouldn't substitute under the Free constructor.

If we want to have a proper semantics for the graphs, the only easy way I see given the current architecture is a record with a default value instead, and print something like (define-fun f (x Int) (ite (= x 0) 0 @a0)) when then only constraint is [0] -> 0 — but again, I don't think we need to do this.

Under that assumption, the semantics of the Free constructor is "this function is unconstrained", and the inner abstract value is a "dummy" name that is guaranteed to not occur anywhere else in the model (by construction). Given that semantics, we shouldn't substitute under the Free constructor.

Note, I don't care either way — I think both implementations are OK as long as we don't expect the user to modify the model. If we do, neither implementation is correct as you point out (but again — I don't think we want to support this).

I agree we don't need to support it. At some point, we should stop to expose subst in the library.

bclement-ocp · 2024-01-09T16:49:26Z

@Halbaroth should we merge this?

bclement-ocp · 2024-01-16T16:09:11Z

@Halbaroth second ping; are we good to merge or did you want to make additional changes?

@A

This PR fixes the issue OCamlPro#778. The suggested solution by this PR is as follows: - Collect user-declared symbols in `D_cnf`; - As the function `make` of the module `D_cnf` returns a list of commands, we add a new command `Decl` in `Frontend` to declare symbols; - The declared symbols are given to the SAT solver through a new function `declaration` in `sat_solver_sig`. - These symbols are collected in vectors and we keep indexes in another PR in order to manage properly the push/pop mechanism. - Finally the declared symbols are given to `Uf` and we create the empty model with them. Another annoying point comes from the current datastructure used in `ModelMap`. The type of `ModelMap` saves model values in the form of graphs. A graph is a finite set of constraints but there is no appropriate representation for the graph without constraints. Let's imagine we have the input: ```smt2 (set-option :produce-models true) (set-logic ALL) (declare-fun (Int) Int) (check-sat) (get-model) ``` We expect an output as follows: ```smt2 ( (define-fun f (_x Int) (@A as Int)) ) ``` But `Graph.empty` cannot hold the abstract value `@a`. A naive solution consists in using an ADT: ```ocaml type graph = | Empty of Id.typed | G of Graph.t ``` But we'll add an extra indirection. Probably the best solution would be to use a custom datastructure instead of a map to store the constraints. In this PR, we detect in `ModelMap.add` if the only constraint in the graph is of the form `something -> abstract value`. In this case we remove this constraint. This solution isn't perfect as explained in issue OCamlPro#1018. The biggest drawback of this solution is the fact we have to declare symbols in the SAT API, otherwise models returned by `get_model` could be incomplete.

Halbaroth force-pushed the complete-model branch from cecfab9 to d9fa809 Compare January 2, 2024 11:24

Halbaroth added the models This issue is related to model generation. label Jan 2, 2024

Halbaroth added this to the 2.6.0 milestone Jan 2, 2024

bclement-ocp requested changes Jan 2, 2024

View reviewed changes

Halbaroth requested a review from bclement-ocp January 3, 2024 08:59

Halbaroth linked an issue Jan 3, 2024 that may be closed by this pull request

Wrong model in the non-regression tests #778

Closed

bclement-ocp approved these changes Jan 3, 2024

View reviewed changes

Halbaroth force-pushed the complete-model branch from 4206878 to 7c910bf Compare January 3, 2024 11:54

bclement-ocp mentioned this pull request Jan 8, 2024

Invalid SAT environment while optimizing #1023

Open

Halbaroth added 13 commits January 17, 2024 14:14

Move typed symbols from ModelMap to Id

b29a119

Move comment

8a556a4

Remove useless option

3e2d501

Remove useless DeclSet module

6e499aa

We aren't in C++ :)

02208a1

Split the test

e07bfa0

Use a stack of the stdlib for declare_ids

b073017

Use a stack of the stdlib for declared symbols in SatML

ad415e5

Use an ADT to discriminate free graph and constant graph

14e729d

Avoid list concatenation

c65d8f2

Add a new error in case of stack underflow

359ac56

Don't perform substitution into free graphes

98899b9

Halbaroth force-pushed the complete-model branch from 7c910bf to 98899b9 Compare January 17, 2024 13:14

Halbaroth merged commit 373f903 into OCamlPro:next Jan 18, 2024
13 checks passed

Complete model #1019

Complete model #1019

Conversation

Halbaroth commented Jan 2, 2024 • edited

bclement-ocp left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Halbaroth Jan 2, 2024 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Halbaroth Jan 2, 2024 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Halbaroth Jan 2, 2024 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

bclement-ocp left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Halbaroth Jan 3, 2024 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Halbaroth Jan 3, 2024 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Halbaroth Jan 3, 2024 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

bclement-ocp commented Jan 9, 2024

bclement-ocp commented Jan 16, 2024

Halbaroth commented Jan 2, 2024 •

edited

Halbaroth Jan 2, 2024 •

edited

Halbaroth Jan 2, 2024 •

edited

Halbaroth Jan 2, 2024 •

edited

Halbaroth Jan 3, 2024 •

edited

Halbaroth Jan 3, 2024 •

edited

Halbaroth Jan 3, 2024 •

edited