Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Question: enhancing code coverage with external tracers #1269

Open
alexsetsove opened this issue Aug 11, 2023 · 6 comments
Open

Question: enhancing code coverage with external tracers #1269

alexsetsove opened this issue Aug 11, 2023 · 6 comments

Comments

@alexsetsove
Copy link

In order to increase the code coverage, I want to solve the unsolved branches. Sample target does only string comparison (strcmp) .

My tarcer was IntelPin/Dynamorio, I sent execution information to Triton like instructions and registers, and updated memory via Triton callbacks. Update will lose path constraints, like issue #1128 . In this situation, I don't know what to do. I'm going to send and want to solve some specific execution regions.

Any leads like pseudo code or workflow that can clarify the path are appreciated.

@JonathanSalwan
Copy link
Owner

JonathanSalwan commented Aug 11, 2023

You have to update the memory only if there is a desynchronization between your real CPU state (via IntelPin/Dynamorio) and the Triton's state

Take a look at the mem_read and synch_regs functions in http://shell-storm.org/repo/Notepad/qbdi_with_triton.txt

@alexsetsove
Copy link
Author

Thank you so much for your response @JonathanSalwan .

I did it and my path constraint is still empty. As soon as I enabled ONLY_ON_TAINTED, and the memory region containing my data was symbolized and tainted, so the path constraints were filled and the number of paths was less than when ONLY_ON_SYMBOLIZED disabled (I take that as a good sign). That's the right way to do it?

However, when I tried to solve branches, I got junk data, I expected the strings used in comparison.

@JonathanSalwan
Copy link
Owner

JonathanSalwan commented Aug 11, 2023

However, when I tried to solve branches

How do you solve a branch?

If you want to solve a branch you have to take into account previous constraints too (what we call the path predicate). The constraint must be something like this getModel(previous_constraints && current_branch_constraint). You can take a look to this algorithm or simply do something like below using getPathPredicate() that returns previous constraints of your current path:

ast = ctx.getAstContext()

pred = ctx.getPathPredicate() # returns the path predicate
crt = ast.land([pred, <your constraint that goes to your branch here>])
model = ctx.getModel(crt)

@alexsetsove
Copy link
Author

alexsetsove commented Aug 11, 2023

Here are the whole instructions for strstr in libc fom my tracer:

0x7fd22bcc420f    movzx  edx, byte ptr [rsi+0x01]  
0x7fd22bcc4213    test   dl, dl  
0x7fd22bcc4215    jz     0x00007fd22bcc42d0  
0x7fd22bcc421b    movd   xmm1, eax  
0x7fd22bcc421f    movd   xmm2, edx  
0x7fd22bcc4223    mov    rax, rdi  
0x7fd22bcc4226    and    eax, 0x00000fff  
0x7fd22bcc422b    punpcklbw xmm1, xmm1  
0x7fd22bcc422f    cmp    rax, 0x00000fbf  
0x7fd22bcc4235    punpcklbw xmm2, xmm2  
0x7fd22bcc4239    punpcklwd xmm1, xmm1  
0x7fd22bcc423d    punpcklwd xmm2, xmm2  
0x7fd22bcc4241    pshufd xmm1, xmm1, 0x00  
0x7fd22bcc4246    pshufd xmm2, xmm2, 0x00  
0x7fd22bcc424b    jnbe   0x00007fd22bcc4550  
0x7fd22bcc4251    movdqu xmm3, oword ptr [rdi]  
0x7fd22bcc4255    pxor   xmm5, xmm5  
0x7fd22bcc4259    movdqu xmm4, oword ptr [rdi+0x01]  
0x7fd22bcc425e    movdqa xmm6, xmm3  
0x7fd22bcc4262    pcmpeqb xmm3, xmm1  
0x7fd22bcc4266    pcmpeqb xmm4, xmm2  
0x7fd22bcc426a    movdqu xmm0, oword ptr [rdi+0x10]  
0x7fd22bcc426f    pcmpeqb xmm6, xmm5  
0x7fd22bcc4273    pminub xmm3, xmm4  
0x7fd22bcc4277    movdqa xmm4, xmm3  
0x7fd22bcc427b    movdqu xmm3, oword ptr [rdi+0x11]  
0x7fd22bcc4280    pcmpeqb xmm5, xmm0  
0x7fd22bcc4284    pcmpeqb xmm3, xmm2  
0x7fd22bcc4288    por    xmm4, xmm6  
0x7fd22bcc428c    pcmpeqb xmm0, xmm1  
0x7fd22bcc4290    pminub xmm0, xmm3  
0x7fd22bcc4294    por    xmm0, xmm5  
0x7fd22bcc4298    pmovmskb r8d, xmm4  
0x7fd22bcc429d    pmovmskb eax, xmm0  
0x7fd22bcc42a1    shl    rax, 0x10  
0x7fd22bcc42a5    or     r8, rax  
0x7fd22bcc42a8    jz     0x00007fd22bcc4310  
0x7fd22bcc42aa    bsf    rax, r8  
0x7fd22bcc42ae    add    rax, rdi  
0x7fd22bcc42b1    cmp    byte ptr [rax], 0x00  
0x7fd22bcc42b4    jz     0x00007fd22bcc42f4  
0x7fd22bcc42f4    xor    eax, eax  
0x7fd22bcc42f6    ret  

This is the first constraint from ctx.getPathConstraints(), which is MultipleBranches:

13:56:07 [DEBUG   ] branch : isTaken : False, srcAddr : 0x7fd22bcc42a8, dstAddr : 0x7fd22bcc4310

But comparison will be done via XMM registers. What should I do?

Another thing, is it okay that I got an exception when building the semantic (ctx.buildSemantics(inst))? Triton can disassemble the opcode, as you can see.

13:56:59 [DEBUG   ] TR => 0x7f6b38da11b3	C5F877	vzeroupper
13:56:59 [CRITICAL] problem FAULT_UD :	C5F877	vzeroupper

@JonathanSalwan
Copy link
Owner

JonathanSalwan commented Aug 12, 2023

But comparison will be done via XMM registers. What should I do?

I'm not sure to get this point. If you receive a model from the solver, you just have to inject the model into symbolic variables that you defined at the beginning (probably in memory cells).

Another thing, is it okay that I got an exception when building the semantic (ctx.buildSemantics(inst))? Triton can disassemble the opcode, as you can see.

If you receive a FAULT_UD, it means that we do not support the semantic of the given opcode. Yep, we can disassemble it because we are using Capstone that supports it (disassembly is done by Capstone and semantics by Triton).

Here is the list of supported x86_64 instructions :).

@alexsetsove
Copy link
Author

alexsetsove commented Aug 12, 2023

I changed the source to only do this:

input="junk data"
if(strstr(input,"empty string")!=NULL){
    cnt++;
}

Backward slice for the first 4 bytes of memory looks like this:

{1: (define-fun ref!1 () (_ BitVec 8) input_0[0]) ; Byte reference}
{3: (define-fun ref!3 () (_ BitVec 8) input_0[1]) ; Byte reference}
{5: (define-fun ref!5 () (_ BitVec 8) input_0[2]) ; Byte reference}
{7: (define-fun ref!7 () (_ BitVec 8) input_0[3]) ; Byte reference}

P.S: If you've gotten emails from Github for editing messages, I'm sorry. I got confused and don't know what I did wrong!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants