Inconsistencies between lifting IRs and physical CPU #705

Open
zyt755 opened this issue Mar 28, 2024 · 4 comments

@zyt755

zyt755 commented Mar 28, 2024

Hi all, I found several inconsistencies between the lifted IR and the behavior of a physical CPU while using Remill.

1. For the imul instruction, Remill resets both AF and ZF to zero and sets PF and SF according to the result of the multiplication. In contrast, the physical CPU does not change these four flags in that way; it keeps the values established by the preceding add %r11d,%ecx instruction.
2. For the sar, sal, shr, and shl instructions, Remill does not model any effect on AF, whereas the physical CPU does update this flag.

The following is the assembly code.

0000000000400504 <Block_1>:
400504: 41 c1 fa 1f sar $0x1f,%r10d
400508: 44 01 d9 add %r11d,%ecx

000000000040050b <Block_2>:
40050b: 48 0f af d0 imul %rax,%rdx
40050f: 48 c1 ea 1f shr $0x1f,%rdx

The following is the lifted IR for the instruction at 0x40050b, imul %rax,%rdx.

%80 = call %struct.Memory* @breakpoint_40050b(%struct.Memory* %79)
call void @__mcsema_pc_tracer(i64 4195595)
store i64 add (i64 ptrtoint (i32 (i32, i8**, i8**)* @main to i64), i64 27), i64* @RIP_2472_2ba84c8, align 8
%81 = load i64, i64* @RDX_2264_2ba84c8, align 8
%82 = load i64, i64* @RAX_2216_2ba84c8, align 8
%83 = ashr i64 %81, 63
%84 = ashr i64 %82, 63
%L.sroa.2.0.insert.ext.i.i49 = zext i64 %83 to i128
%L.sroa.2.0.insert.shift.i.i50 = shl nuw i128 %L.sroa.2.0.insert.ext.i.i49, 64
%L.sroa.0.0.insert.ext.i.i51 = zext i64 %81 to i128
%L.sroa.0.0.insert.insert.i.i52 = or i128 %L.sroa.2.0.insert.shift.i.i50, %L.sroa.0.0.insert.ext.i.i51
%R.sroa.2.0.insert.ext.i.i53 = zext i64 %84 to i128
%R.sroa.2.0.insert.shift.i.i54 = shl nuw i128 %R.sroa.2.0.insert.ext.i.i53, 64
%R.sroa.0.0.insert.ext.i.i55 = zext i64 %82 to i128
%R.sroa.0.0.insert.insert.i.i56 = or i128 %R.sroa.2.0.insert.shift.i.i54, %R.sroa.0.0.insert.ext.i.i55
%mul.i.i57 = mul nsw i128 %R.sroa.0.0.insert.insert.i.i56, %L.sroa.0.0.insert.insert.i.i52
%retval.sroa.0.0.extract.trunc.i.i58 = trunc i128 %mul.i.i57 to i64
store i64 %retval.sroa.0.0.extract.trunc.i.i58, i64* @RDX_2264_2ba84c8, align 8, !tbaa !1219
%conv4.i.i.i59 = sext i64 %retval.sroa.0.0.extract.trunc.i.i58 to i128
%cmp.i.i.i60 = icmp ne i128 %mul.i.i57, %conv4.i.i.i59
%frombool.i.i61 = zext i1 %cmp.i.i.i60 to i8
store i8 %frombool.i.i61, i8* @CF_2065_2ba8480, align 1, !tbaa !1221
%x.sroa.0.0.insert.ext.i.i.i63 = trunc i128 %mul.i.i57 to i32
%conv.i.i.i.i64 = and i32 %x.sroa.0.0.insert.ext.i.i.i63, 255
%85 = call i32 @llvm.ctpop.i32(i32 %conv.i.i.i.i64) #16, !range !1235
%86 = trunc i32 %85 to i8
%87 = and i8 %86, 1
%88 = xor i8 %87, 1
store i8 %88, i8* @PF_2067_2ba8480, align 1, !tbaa !1236
store i8 0, i8* @AF_2069_2ba8480, align 1, !tbaa !1237
store i8 0, i8* @ZF_2071_2ba8480, align 1, !tbaa !1238
%res_trunc.lobit.i.i69 = lshr i64 %retval.sroa.0.0.extract.trunc.i.i58, 63
%89 = trunc i64 %res_trunc.lobit.i.i69 to i8
store i8 %89, i8* @SF_2073_2ba8480, align 1, !tbaa !1239
store i8 %frombool.i.i61, i8* @OF_2077_2ba8480, align 1, !tbaa !1240
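For readers who find the IR hard to follow, here is a minimal C++ sketch of the flag computation it performs. It is an illustration only, not Remill's code: it is narrowed to 32-bit operands so that a standard int64_t can hold the full product (the IR above widens 64-bit operands to i128), and the names ImulFlags and ModelImulFlags32 are made up for this example.

```cpp
#include <bitset>
#include <cstdint>

// Hypothetical container for the flag bytes that the lifted code stores.
struct ImulFlags {
  bool cf, pf, af, zf, sf, of;
};

// Models the flag writes seen in the IR above, narrowed to 32-bit operands.
inline ImulFlags ModelImulFlags32(int32_t lhs, int32_t rhs, int32_t *result) {
  // Sign-extend both operands and multiply at double width.
  const int64_t wide = static_cast<int64_t>(lhs) * static_cast<int64_t>(rhs);

  // Truncate the product back to the operand width.
  const uint32_t low_bits = static_cast<uint32_t>(wide);
  const int32_t truncated = static_cast<int32_t>(low_bits);
  *result = truncated;

  // CF and OF are set when sign-extending the truncated result does not
  // reproduce the full product.
  const bool overflow = wide != static_cast<int64_t>(truncated);

  // PF is set when the low 8 bits contain an even number of 1 bits.
  const bool even_parity =
      (std::bitset<8>(low_bits & 0xFFu).count() % 2) == 0;

  ImulFlags flags;
  flags.cf = overflow;
  flags.pf = even_parity;
  flags.af = false;                   // stored as constant 0 in the IR
  flags.zf = false;                   // stored as constant 0 in the IR
  flags.sf = (low_bits >> 31) != 0;   // top bit of the truncated result
  flags.of = overflow;
  return flags;
}
```

Note that, as in the IR, ZF is written as 0 even when the truncated result is zero, which is one place where a comparison against hardware can diverge.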

@pgoodman
Collaborator

This is probably the cause:

template <typename T, typename U, typename V>
ALWAYS_INLINE static void WriteFlagsMul(State &state, T lhs, T rhs, U res,
                                        V res_trunc) {
  const auto new_of = Overflow<tag_mul>::Flag(lhs, rhs, res);
  FLAG_CF = new_of;
  FLAG_PF = BUndefined();  // Technically undefined.
  FLAG_AF = BUndefined();
  FLAG_ZF = BUndefined();
  FLAG_SF = BUndefined();
  FLAG_OF = new_of;
}

@artemdinaburg
Contributor

Adding some context to Peter's comments: According to the Intel Processor Manual (https://cdrdv2.intel.com/v1/dl/getContent/671110), for IMUL:
The SF, ZF, AF, and PF flags are undefined. (Page 3-503)

Where this becomes confusing is that, on real, physical CPUs, flags that are documented as undefined often behave in a way that feels very much defined in practice. The problem is that, because these flags are officially undefined and documented as such, the observed behavior is inconsistent across CPU generations and across CPUs from different manufacturers (e.g., AMD).
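One practical consequence for differential testing is that a comparison between lifted and native flag state probably should ignore the flags that the manual marks as undefined for the instruction in question. Below is a minimal sketch of that idea. The EFLAGS bit positions are architectural; everything else (OpClass, UndefinedFlagMask, FlagsAgree) is a hypothetical helper invented for this example and is not part of Remill. The shift rule follows the Intel manual's wording that AF is undefined for a non-zero count and OF is undefined unless the count is 1; the CF case for counts at or beyond the operand size is not modeled.

```cpp
#include <cstdint>

// Architectural EFLAGS bit positions for the status flags.
constexpr uint32_t kCF = 1u << 0;
constexpr uint32_t kPF = 1u << 2;
constexpr uint32_t kAF = 1u << 4;
constexpr uint32_t kZF = 1u << 6;
constexpr uint32_t kSF = 1u << 7;
constexpr uint32_t kOF = 1u << 11;
constexpr uint32_t kStatusFlags = kCF | kPF | kAF | kZF | kSF | kOF;

// Hypothetical instruction classes, just enough for this thread.
enum class OpClass { kImul, kShift, kOther };

// Returns the status flags documented as undefined after the instruction.
// masked_count is the shift count after the CPU's masking; ignored otherwise.
inline uint32_t UndefinedFlagMask(OpClass op, unsigned masked_count) {
  switch (op) {
    case OpClass::kImul:
      return kSF | kZF | kAF | kPF;
    case OpClass::kShift:
      if (masked_count == 0) return 0;  // no flags are affected at all
      return masked_count == 1 ? kAF : (kAF | kOF);
    case OpClass::kOther:
      break;
  }
  return 0;
}

// Compares two EFLAGS snapshots, ignoring flags undefined for this op.
inline bool FlagsAgree(uint32_t native, uint32_t lifted, OpClass op,
                       unsigned masked_count = 0) {
  const uint32_t compared = kStatusFlags & ~UndefinedFlagMask(op, masked_count);
  return ((native ^ lifted) & compared) == 0;
}
```

With a filter like this, both discrepancies reported above (SF/ZF/AF/PF after imul, AF after a shift) would be classified as differences in undefined behavior rather than as lifter bugs.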

@pgoodman
Collaborator

We should really have a one-argument form of __remill_undefined_8 that takes in a concrete value.
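To make the suggestion concrete, here is one possible reading of it as a standalone sketch. Nothing here is Remill's API: HypotheticalUndefined8, UndefinedPolicy, and ModelZfAfterImul are invented names, and the choice of "the previous flag value" as the concrete value is an assumption based on the behavior reported at the top of this issue.

```cpp
#include <cstdint>

// Hypothetical policy: what a consumer of the lifted code wants to see in
// place of an architecturally undefined value.
enum class UndefinedPolicy { kKeepConcrete, kForceZero };

// Hypothetical one-argument form: the value is still marked as undefined by
// going through this function, but it carries a concrete value that a
// hardware-matching mode can use.
inline uint8_t HypotheticalUndefined8(
    uint8_t concrete,
    UndefinedPolicy policy = UndefinedPolicy::kKeepConcrete) {
  return policy == UndefinedPolicy::kKeepConcrete ? concrete : 0;
}

// Example use: model ZF after IMUL as "undefined, but in practice preserved".
// Passing the previous ZF as the concrete value is an assumption.
inline uint8_t ModelZfAfterImul(uint8_t previous_zf, UndefinedPolicy policy) {
  return HypotheticalUndefined8(previous_zf, policy);
}
```

The appeal of this shape is that an analysis can still recognize the value as undefined, while an emulation-oriented build can fall back to whatever concrete behavior was chosen to match a given CPU.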

@pgoodman
Collaborator

It looks like with the P4 core, IMUL started preserving some of the flags: https://www.sandpile.org/x86/flags.htm
