Possible Emulator Bug? #6404

pjsoberoi · 2024-04-10T04:35:06Z

While working on PR-5870, I think I stumbled across a possible Emulator bug. See the following snippet of SLEIGH:

        # A 176
	# op1 153

	#                  (176 & 15) + (153 & 15) + 1 = 10
	local tmp:4 = zext((A & 0x0F) + (op1 & 0x0F) + $(C_FLAG)); # 10 verified

	local add_tmp_6 = (tmp > 0x09); # verified add_tmp_6 is 1

	tmp = tmp + zext(add_tmp_6 * 0x6); # should be 16
	X = zext(add_tmp_6 * 0x6);
	Y = tmp:1; # Y is 24!!!! Should be 16

Everything is sane until I get to the tmp = tmp + zext(add_tmp_6 * 0x6) line. I have verified the values along the way and at this point add_tmp_6 = 1, and tmp = 10. This means tmp = tmp + zext(add_tmp_6 * 0x6) should be tmp = 10 + zext(1 * 0x6) which is tmp = 10 + 6 or tmp = 16. However I am getting tmp to be 24.

I am using the patch from #5262 on Ghidra 11.0.2. Thanks in advance.

@ryanmkurtz @GhidorahRex

The text was updated successfully, but these errors were encountered:

GhidorahRex · 2024-04-10T11:32:07Z

I'm not certain about what bug might be occurring, but I did find a minor issue with your sleigh.

You should have add_tmp_6 be size 4. Otherwise the multiplication in add_tmp_6 * 0x6 can overflow a size 1 variable. I would do either:

local add_tmp_6:4 = zext(tmp > 0x09);
tmp = tmp + (add_tmp_6 * 0x6);

or

local add_tmp_6 = (tmp > 0x09);
tmp = tmp + zext(add_tmp_6) * 0x6;

pjsoberoi · 2024-04-11T03:28:11Z

Thank you for the fast response. I made the requested changes. As predicted it did not affect my issue. I believe I have additional information to add:

	tmp = tmp + (add_tmp_6 * 6); # should be 16
	Y = tmp:1;

is incorrect with Y being 26 (I swear it was 24 yesterday....). This however is correct:

	local tmp2:4 = tmp + (add_tmp_6 * 6); # should be 16
	Y = tmp2:1;

with Y being 16 as expected.

So likely an issue with storing back to itself? Thanks again.

nsadeveloper789 · 2024-04-17T13:47:51Z

I've just tried this (as best I can given the bits I see reported here) in the Java emulator from the UI, and it seems to be behaving properly. Since I don't have your slaspec at hand, I just imported 1K of 0s, then injected the following at address 0000 using a breakpoint in the Emulator:

local tmp:4 = zext((A & 0xF) + (153 & 0xf) + C);
local add_tmp_6 = (tmp > 0x09);

tmp = tmp + zext(add_tmp_6 * 0x6);
X = zext(add_tmp_6 * 0x6);
Y  = tmp:1;

It's the same as your original post, but with op1 hardcoded to 153, and $(C_FLAG) hardcoded to C. I then used the Registers panel to initialize A to 0xb0 (176) and C to 1. I then enabled the DebuggerPcodeStepperPlugin and single-stepped through it. I'll paste the resulting p-code here in case that eases diagnostics on your end. It's also in the screencap:

0		$U6000:1 = INT_AND A, 15:1
1		$U6080:1 = INT_AND 0x99:1, 15:1
2		$U6100:1 = INT_ADD $U6000:1, $U6080:1
3		$U6180:1 = INT_ADD $U6100:1, C
4		$U6280:4 = INT_ZEXT $U6180:1
5		$U6380:1 = INT_LESS 9:4, $U6280:4
6		$U6400:1 = INT_MULT $U6380:1, 6:1
7		$U6480:4 = INT_ZEXT $U6400:1
8		$U6280:4 = INT_ADD $U6280:4, $U6480:4
9		$U6580:1 = INT_MULT $U6380:1, 6:1
10		X = INT_ZEXT $U6580:1
11		Y = SUBPIECE $U6280:4, 0:4
12		(fall-through)

This was in the master branch with a couple patches to fix UI integration for emulated breakpoints. So, I see three possibilities:

The issue happens to be fixed in our development branch (wishful thinking, I suspect.)
The sleigh compiler is emitting different code than I got here. This could be a difference in version, or a discrepancy between the Java and C++ versions of the Sleigh compiler. Which are you using?
The C++ emulator is interpreting the p-code incorrectly. Given context from your original post, I assume this is what you using? Is this also the one you used to verify your values?

FWIW, p-code op line 8 is where the "storing back to itself" happens.

If you are able to try this same experiment, but with the actual instruction and your slaspec (rather than a breakpoint injection), I'd be interested to see the results.

I can try in 11.0.2 later to rule out version differences, but I'm a bit pre-occupied at the moment.

pjsoberoi · 2024-04-19T02:55:35Z

6502.zip

I am attaching a 6502 slaspec and sla that repro the issue. It is in the macro for adc on line 98.

The issue happens to be fixed in our development branch (wishful thinking, I suspect.)

This is Ghidra 11.0.2, I will test in Ghidra 11.0.3.

The sleigh compiler is emitting different code than I got here. This could be a difference in version, or a discrepancy between the Java and C++ versions of the Sleigh compiler. Which are you using?

I am using the C++ version of the Sleigh compiler on Ghidra 11.0.2.

The C++ emulator is interpreting the p-code incorrectly. Given context from your original post, I assume this is what you using? Is this also the one you used to verify your values?

I am using the C++ emulator.

pjsoberoi · 2024-04-19T03:01:24Z

11.0.3 has the same behavior as 11.0.2. The compiled .sla files are identical (built with /support/sleigh from both 11.0.2 and 11.03). I also rebuilt libsla.a and recompiled my verifier program.

pjsoberoi · 2024-04-19T03:38:44Z

Here's the Pcode for ADC in 11.0.3:

f63e 61 e4 ADC (0xe4,X)
$U8180:1 = INT_ADD 0xe4:1, X
$U8280:1 = INT_ADD $U8180:1, 1:1
$U8380:2 = INT_ZEXT $U8180:1
$U8480:2 = INT_ZEXT $U8280:1
$U8500:1 = LOAD RAM($U8480:2)
$U8580:2 = INT_ZEXT $U8500:1
$U8600:2 = INT_LEFT $U8580:2, 8:4
$U8680:1 = LOAD RAM($U8380:2)
$U8700:2 = INT_ZEXT $U8680:1
$U8800:2 = INT_OR $U8600:2, $U8700:2
$U3280:1 = INT_RIGHT P, 3:4
$U3300:1 = INT_AND $U3280:1, 1:1
$U3380:1 = INT_EQUAL $U3300:1, 0:1
CBRANCH <0>, $U3380:1
$U3400:1 = INT_AND A, 15:1
$U8880:1 = LOAD RAM($U8800:2)
$U3480:1 = INT_AND $U8880:1, 15:1
$U3500:1 = INT_ADD $U3400:1, $U3480:1
$U3580:1 = INT_AND P, 1:1
$U3600:1 = INT_ADD $U3500:1, $U3580:1
$U3700:4 = INT_ZEXT $U3600:1
$U3800:1 = INT_LESS 9:4, $U3700:4
$U3880:1 = INT_MULT $U3800:1, 6:1
$U3900:4 = INT_ZEXT $U3880:1
$U3700:4 = INT_ADD $U3700:4, $U3900:4
X = INT_MULT $U3800:1, 6:1
Y = SUBPIECE $U3700:4, 0:4
BRANCH <1>
<0>
$U3b80:2 = INT_ZEXT A
test.zip

I'm also attaching a test.bin I made.

pjsoberoi · 2024-04-19T03:57:53Z

I'm fumbling my way through the Emulator. But it looks like Y is being set correctly to 0x10:

So the issue is in the C++ emulator?

nsadeveloper789 · 2024-04-22T14:36:44Z

OK. In that case, I must bring in @caheckman.

caheckman · 2024-04-23T18:21:59Z

I'm unable to reproduce the issue in the C++ emulator. If I emulate the following instructions using the provided 6502.slaspec

        f634 a2 00           LDX          #0x0
        f636 a9 99           LDA          #0x99
        f638 81 e4           STA          (0xe4,X)
        f63a a9 b0           LDA          #0xb0
        f63c f8              SED
        f63d 38              SEC
        f63e 61 e4           ADC          (0xe4,X)
        f640 60              RTS

It executes the following p-code for the ADC macro:

(unique,0x3400,1) = INT_AND (register,0x0,1) (const,0xf,1)             # 0 = 176 & 15
(unique,0x8880,1) = LOAD (const,0x5555556831f0,8) (unique,0x8800,2)    # 153 = load
(unique,0x3480,1) = INT_AND (unique,0x8880,1) (const,0xf,1)            # 9 = 153 & 15
(unique,0x3500,1) = INT_ADD (unique,0x3400,1) (unique,0x3480,1)        # 9 = 0 + 9
(unique,0x3580,1) = INT_AND (register,0x3,1) (const,0x1,1)             # 1 = 137 & 1
(unique,0x3600,1) = INT_ADD (unique,0x3500,1) (unique,0x3580,1)        # 10 = 9 + 1
(unique,0x3700,4) = INT_ZEXT (unique,0x3600,1)                         # 10 = zext(10)
(unique,0x3800,1) = INT_LESS (const,0x9,4) (unique,0x3700,4)           # 1 = 9 < 10
(unique,0x3880,1) = INT_MULT (unique,0x3800,1) (const,0x6,1)           # 6 = 1 * 6
(unique,0x3900,4) = INT_ZEXT (unique,0x3880,1)                         # 6 = zext(6)
(unique,0x3700,4) = INT_ADD (unique,0x3700,4) (unique,0x3900,4)        # 16 = 10 + 6
(register,0x1,1) = INT_MULT (unique,0x3800,1) (const,0x6,1)            # 6 = 1 * 6
(register,0x2,1) = SUBPIECE (unique,0x3700,4) (const,0x0,4)            # 16 = subpiece(16,0)

all of which seems correct and puts the value of 16 in Y as expected.

pjsoberoi · 2024-04-24T05:03:03Z

@caheckman, thank you for the fast response. How are you testing? I would like to replicate your testing setup. Currently I am seeing the issue using my Verifier tool (https://github.com/oberoisecurity/ghidra-processor-module-verifier).

Does your .slaspec compile to an identical version of my .sla? Here is the exact state I am using:

./verifier -p PC -t 1 --max-failures 1 --register-map map.txt -s ~/ghidra-6502-fixes/Ghidra/Processors/6502/data/languages/6502.sla -j ~/ProcessorTests/6502/v1/61.json --start-test 16
Ghidra Processor Module Verifier (Verifier)
[*] Settings:
	[*] Compiled SLA file: /home/poberoi/ghidra-6502-fixes/Ghidra/Processors/6502/data/languages/6502.sla
	[*] JSON Test file: /home/poberoi/ProcessorTests/6502/v1/61.json
	[*] Program counter register: PC
	[*] Word size: 2
	[*] Register Mapping Count: 6
	[*] Max allowed failures: 1
	[*] Start test: 16
	[*] Register Mapping Count: 6
[*] /home/poberoi/ProcessorTests/6502/v1/61.json: Loaded 10000 test cases
[*] Test Range: 16-10000
!! REGISTER ERROR: X 8 6
!! REGISTER ERROR: Y 24 26
[-] 16) FAIL
Initial State:
	Registers:
		A: 176
		P: 105
		PC: 63038
		S: 217
		X: 8
		Y: 24
	RAM:
		228: 98
		236: 96
		237: 87
		22368: 153
		63038: 97
		63039: 228
		63040: 9

Final (Expected) State:
	Registers:
		A: 176
		P: 105
		PC: 63040
		S: 217
		X: 8
		Y: 24
	RAM:
		228: 98
		236: 96
		237: 87
		22368: 153
		63038: 97
		63039: 228
		63040: 9

Emulator:
	Registers:
		A: 176
		P: 105
		PC: 63040
		S: 217
		X: 6
		Y: 26
	RAM:
		228: 98
		236: 96
		237: 87
		22368: 153
		63038: 97
		63039: 228
		63040: 9

pjsoberoi · 2024-04-24T05:23:17Z

I've triaged the bug down to setValue not setting the value correctly. I added printfs() to setValue() before and after the set:

/// A convenience method for setting a value directly on a varnode rather than
/// breaking out the components
/// \param vn is a pointer to the varnode to be written
/// \param cval is the value to write into the varnode
inline void MemoryState::setValue(const VarnodeData *vn,uintb cval)

{
  printf("setValue: %p %x %x %x\n", vn->space,vn->offset,vn->size,cval);
  setValue(vn->space,vn->offset,vn->size,cval);

  cval = getValue(vn->space,vn->offset,vn->size);
  printf("-getValue: %p %x %x %x\n", vn->space,vn->offset,vn->size,cval);
}

setValue: 0x7f54200052c0 3700 4 a
-getValue: 0x7f54200052c0 3700 4 a
getValue: 0x7f5420005000 9 4 9
getValue: 0x7f54200052c0 3700 4 a
setValue: 0x7f54200052c0 3800 1 1
-getValue: 0x7f54200052c0 3800 1 1
getValue: 0x7f54200052c0 3800 1 1
getValue: 0x7f5420005000 6 1 6
setValue: 0x7f54200052c0 3880 1 6
-getValue: 0x7f54200052c0 3880 1 6
getValue: 0x7f54200052c0 3880 1 6
setValue: 0x7f54200052c0 3900 4 6
-getValue: 0x7f54200052c0 3900 4 6
getValue: 0x7f54200052c0 3700 4 a
getValue: 0x7f54200052c0 3900 4 6
setValue: 0x7f54200052c0 3700 4 10
-getValue: 0x7f54200052c0 3700 4 1a
getValue: 0x7f54200052c0 3800 1 1
getValue: 0x7f5420005000 6 1 6
setValue: 0x7f5420005530 1 1 6
-getValue: 0x7f5420005530 1 1 6
getValue: 0x7f54200052c0 3700 4 1a

Specifically here:

setValue: 0x7f54200052c0 3700 4 10
-getValue: 0x7f54200052c0 3700 4 1a

setValue is called with 0x10 (16), but then immediately calling getValue returns 0x1a (26).

Update: Too sleepy to debug this tonight, but the issue is in memstate.cc:MemoryBank::setValue(uintb offset,int4 size,uintb val). Specifically in the last else block:

    else {
      val1 &= (~((uintb)0ull)) >> 8*size1;
      val1 |= val << skip;
      printf("insert5: %llx %llx\n", ind, val1);
      insert(ind,val1);

      val2 &= (~((uintb)0ull)) << 8*size2;
      val2 |= val >> 8*size1;
      printf("insert6: %llx %llx\n", ind+wordsize, val2);
      insert(ind+wordsize,val2);
    }
  }

Val1 is set to 0x1a instead of 0x10. Will need to review this code to figure out what's going on.

pjsoberoi · 2024-04-26T03:57:59Z

Swapping the shifts seems to have fixed my issue:

  else {
    if (size2 == 0) {
      val1 &= ~(calc_mask(size1)<<skip);
      val1 |= val << skip;
      insert(ind,val1);
    }
    else {
       val1 &= (~((uintb)0)) << 8*size1;
        val1 |= val << skip;
       insert(ind,val1);
        val2 &= (~((uintb)0)) >> 8*size2;
       val2 |= val >> 8*size1;
       insert(ind+wordsize,val2);
    }

That being said I don't understand this code at all to be making changes to it.

pjsoberoi · 2024-05-10T01:59:18Z

@caheckman , checking in. Anything I can do to help you debug this issue? Thanks in advance.

ryanmkurtz assigned nsadeveloper789 Apr 17, 2024

ryanmkurtz added Feature: Emulation Status: Triage Information is being gathered labels Apr 17, 2024

nsadeveloper789 assigned caheckman Apr 22, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Possible Emulator Bug? #6404

Possible Emulator Bug? #6404

pjsoberoi commented Apr 10, 2024 •

edited

GhidorahRex commented Apr 10, 2024

pjsoberoi commented Apr 11, 2024

nsadeveloper789 commented Apr 17, 2024 •

edited

pjsoberoi commented Apr 19, 2024

pjsoberoi commented Apr 19, 2024

pjsoberoi commented Apr 19, 2024

pjsoberoi commented Apr 19, 2024

nsadeveloper789 commented Apr 22, 2024

caheckman commented Apr 23, 2024

pjsoberoi commented Apr 24, 2024

pjsoberoi commented Apr 24, 2024 •

edited

pjsoberoi commented Apr 26, 2024

pjsoberoi commented May 10, 2024

Possible Emulator Bug? #6404

Possible Emulator Bug? #6404

Comments

pjsoberoi commented Apr 10, 2024 • edited

GhidorahRex commented Apr 10, 2024

pjsoberoi commented Apr 11, 2024

nsadeveloper789 commented Apr 17, 2024 • edited

pjsoberoi commented Apr 19, 2024

pjsoberoi commented Apr 19, 2024

pjsoberoi commented Apr 19, 2024

pjsoberoi commented Apr 19, 2024

nsadeveloper789 commented Apr 22, 2024

caheckman commented Apr 23, 2024

pjsoberoi commented Apr 24, 2024

pjsoberoi commented Apr 24, 2024 • edited

pjsoberoi commented Apr 26, 2024

pjsoberoi commented May 10, 2024

pjsoberoi commented Apr 10, 2024 •

edited

nsadeveloper789 commented Apr 17, 2024 •

edited

pjsoberoi commented Apr 24, 2024 •

edited