Add dichotomic symbolic clz and ctz #195

krtab · 2024-02-26T18:07:54Z

This is supposedly SMT friendly, and keeps the
bit-hacks for the concrete case.

zapashcanon · 2024-02-26T18:21:57Z

src/interpret.ml

+      end
+    in
+    let> cond = I32.(eqz n) in
+    if cond then return @@ const 32l else aux 0 32


Could you explain to me quickly why this can't be in symbolic_value directly ?

Because currently, code in symbolic_value has no reference to the Choice monad

What about putting it in symbolic.ml then ?

Replacing:

module Value = Symbolic_value

By:

module Value = struct include Symbolic_value let clz ... end

?

I can try but it is awkward as well because we would have to separate cases for I64 and I32 in Symbolic.value

Oh yes I forgot about this. I still believe you can do it "properly":

module I32 = struct include Symbolic_value.I32 let clz = ... end module I64 = struct ... same ... end module Value = struct include Symbolic_value module I32 = I32 module I64 = I64 end

zapashcanon · 2024-02-26T18:25:07Z

Do you have approximate numbers on how much faster is it ?

filipeom · 2024-02-26T18:49:36Z

This is super cool! I'd love to see the differences between this branching approach and using ite expressions inside aux. Solver-wise, it would probably take much more time, but it might be more memory-friendly?

EDIT: maybe I'll write one such implementation in encoding and see 😅

krtab · 2024-02-27T08:42:09Z

@zapashcanon :

Do you have approximate numbers on how much faster is it ?

Well the previous implementation when run on tests gave

owi: internal error, uncaught exception:
     Failure("Bv: Unsupported Z3 unary operator \"clz\"")

So it may have been faster but certainly less useful 😉

@filipeom

This is super cool! I'd love to see the differences between this branching approach and using ite expressions inside aux. Solver-wise, it would probably take much more time, but it might be more memory-friendly?

Could you clarify what you mean by "ite expressions" ?

This is supposedly SMT friendly, and keeps the bit-hacks for the concrete case.

krtab · 2024-02-27T17:41:39Z

I also added ctz in a similar vein.
@zapashcanon we can see on thursday how to better organize this code

filipeom · 2024-02-27T23:17:04Z

Could you clarify what you mean by "ite expressions" ?

I mean the ternary operator if-then-else. It would allow us to write the same code in a single expression. Here is the pseudo code for the clz algorithm you implemented, using the ternary operator (ite):

let clz n =
  let rec aux (lb : int) ub =
    if ub = lb + 1 then I32.const_i32 (Int32.of_int (32 - ub))
    else
      let mid = (lb + ub) / 2 in
      let two_pow_mid = Int32.shl 1l (Int32.of_int mid) in
      let cond = I32.lt_u n (I32.const_i32 two_pow_mid) in
      ite cond (aux lb mid) (aux mid ub)
  ite (I32.eqz n) (I32.const_i32 32l) (aux 0 32)

I did a quick and dirty implementation of this directly in Z3, which you can find here: formalsec/encoding@2f83d34 (hope you don't mind!)

After that, I measured execution times and paths explored on clz_64.wast. The results are as follows:

Approach	Number of Paths	User time (s)	Wall clock (s)
branching	2081	192.30	37.17
ite-exprs	65	6.19	1.75

I referred to your approach as branching and the other as ite-exprs. It appears that the use of branching in the ctz implementation is significantly penalizing.

I also experimented with removing the second assert in the test, and the performance was much faster, essentially identical between the two approaches, with the same number of paths as well.

Perhaps it's worth providing an implementation for these operators in encoding and see how they compare against this approach. What do you think?

krtab · 2024-02-28T13:02:23Z

Hi @filipeom !

Very interesting thank you. A few points:

First of all, I agree that my current implementations is likely to not be the best one, other possibilities include: Z3's ite expressions (as you implemented), changing the choice decision tree for a direct expression of 32 choices each with only two bounds added to the path condition, changing the < check for a more bitlevel one etc.
However, actually choosing the best one can only be done with more (and more realistic) benchmarks IMO. We not only want an implementation that is "locally efficient", meaning that it quickly allows owi-sym to process a single clz instruction, we also want one that makes it efficient to reason on the newly added path conditions afterwards. My intuition of Z3 inner working is way too poor to guess what will be the best choice in general, but having benchmarks on realistic code (and not a unit test like here) will help.
"I also experimented with removing the second assert in the test, and the performance was much faster, essentially identical between the two approaches, with the same number of paths as well."
Interesting, I wonder what makes it so. Maybe the clz-based ctz implementation forces Z3 to exhaustively check all possible bit configurations?

filipeom · 2024-02-28T13:37:49Z

... We not only want an implementation that is "locally efficient", meaning that it quickly allows owi-sym to process a single clz instruction, we also want one that makes it efficient to reason on the newly added path conditions afterwards ...

Yes, you are correct. However, I want to note that having easier-to-reason path conditions might not be as helpful when there are exponentially as many branches to check for satisfiability. Even though you can quickly reason about them in the SMT solver, it requires exponential memory usage to do so. Regardless, even if I prefer approaches that do not branch to encode such operations, I must admit that this approach might be better in cases where there is a good path exploration strategy.

Interesting, I wonder what makes it so. Maybe the clz-based ctz implementation forces Z3 to exhaustively check all possible bit configurations?

I think it's because $countLeadingZeros will always branch, so it ends up losing the ability to merge all 64 states with the ite approach.

But in summary:

Branching in these operations can lead to an earlier state explosion situation.
It might be the case that this approach can lead to easier path-specific reasoning, allowing for faster bug finding.
More benchmarking should be done in actual programs that use clz or similar operations that reason about the entire bitvector.
I dislike these extreme branching approaches as they create spurious paths and subsequently generate test cases that don't contribute to any increase in line coverage of the program under testing.

krtab requested a review from zapashcanon February 26, 2024 18:07

zapashcanon reviewed Feb 26, 2024

View reviewed changes

krtab force-pushed the sym_clz branch from c4040c1 to 9087289 Compare February 27, 2024 14:33

krtab added 2 commits February 27, 2024 18:40

Add dichotomic symbolic clz

782291f

This is supposedly SMT friendly, and keeps the bit-hacks for the concrete case.

Add dichotomic symbolic ctz

7f248d6

krtab force-pushed the sym_clz branch from 9087289 to 7f248d6 Compare February 27, 2024 17:40

krtab changed the title ~~Add dichotomic symbolic clz~~ Add dichotomic symbolic clz and ctz Feb 27, 2024

filipeom mentioned this pull request Feb 28, 2024

Adds Arthurs implementation for clz and ctz formalsec/smtml#81

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add dichotomic symbolic clz and ctz #195

Add dichotomic symbolic clz and ctz #195

krtab commented Feb 26, 2024

zapashcanon Feb 26, 2024

krtab Feb 26, 2024

zapashcanon Feb 27, 2024

krtab Feb 27, 2024

zapashcanon Feb 27, 2024

zapashcanon Feb 27, 2024

zapashcanon commented Feb 26, 2024

filipeom commented Feb 26, 2024 •

edited

krtab commented Feb 27, 2024

krtab commented Feb 27, 2024

filipeom commented Feb 27, 2024 •

edited

krtab commented Feb 28, 2024 •

edited

filipeom commented Feb 28, 2024 •

edited

Add dichotomic symbolic clz and ctz #195

Are you sure you want to change the base?

Add dichotomic symbolic clz and ctz #195

Conversation

krtab commented Feb 26, 2024

zapashcanon Feb 26, 2024

Choose a reason for hiding this comment

krtab Feb 26, 2024

Choose a reason for hiding this comment

zapashcanon Feb 27, 2024

Choose a reason for hiding this comment

krtab Feb 27, 2024

Choose a reason for hiding this comment

zapashcanon Feb 27, 2024

Choose a reason for hiding this comment

zapashcanon Feb 27, 2024

Choose a reason for hiding this comment

zapashcanon commented Feb 26, 2024

filipeom commented Feb 26, 2024 • edited

krtab commented Feb 27, 2024

krtab commented Feb 27, 2024

filipeom commented Feb 27, 2024 • edited

krtab commented Feb 28, 2024 • edited

filipeom commented Feb 28, 2024 • edited

filipeom commented Feb 26, 2024 •

edited

filipeom commented Feb 27, 2024 •

edited

krtab commented Feb 28, 2024 •

edited

filipeom commented Feb 28, 2024 •

edited