Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible Emulator Bug? #6404

Open
pjsoberoi opened this issue Apr 10, 2024 · 13 comments
Open

Possible Emulator Bug? #6404

pjsoberoi opened this issue Apr 10, 2024 · 13 comments
Assignees
Labels
Feature: Emulation Status: Triage Information is being gathered

Comments

@pjsoberoi
Copy link
Contributor

pjsoberoi commented Apr 10, 2024

While working on PR-5870, I think I stumbled across a possible Emulator bug. See the following snippet of SLEIGH:

        # A 176
	# op1 153

	#                  (176 & 15) + (153 & 15) + 1 = 10
	local tmp:4 = zext((A & 0x0F) + (op1 & 0x0F) + $(C_FLAG)); # 10 verified

	local add_tmp_6 = (tmp > 0x09); # verified add_tmp_6 is 1

	tmp = tmp + zext(add_tmp_6 * 0x6); # should be 16
	X = zext(add_tmp_6 * 0x6);
	Y = tmp:1; # Y is 24!!!! Should be 16

Everything is sane until I get to the tmp = tmp + zext(add_tmp_6 * 0x6) line. I have verified the values along the way and at this point add_tmp_6 = 1, and tmp = 10. This means tmp = tmp + zext(add_tmp_6 * 0x6) should be tmp = 10 + zext(1 * 0x6) which is tmp = 10 + 6 or tmp = 16. However I am getting tmp to be 24.

I am using the patch from #5262 on Ghidra 11.0.2. Thanks in advance.

@ryanmkurtz @GhidorahRex

@GhidorahRex
Copy link
Collaborator

I'm not certain about what bug might be occurring, but I did find a minor issue with your sleigh.

You should have add_tmp_6 be size 4. Otherwise the multiplication in add_tmp_6 * 0x6 can overflow a size 1 variable. I would do either:

local add_tmp_6:4 = zext(tmp > 0x09);
tmp = tmp + (add_tmp_6 * 0x6);

or

local add_tmp_6 = (tmp > 0x09);
tmp = tmp + zext(add_tmp_6) * 0x6;

@pjsoberoi
Copy link
Contributor Author

Thank you for the fast response. I made the requested changes. As predicted it did not affect my issue. I believe I have additional information to add:

	tmp = tmp + (add_tmp_6 * 6); # should be 16
	Y = tmp:1;

is incorrect with Y being 26 (I swear it was 24 yesterday....). This however is correct:

	local tmp2:4 = tmp + (add_tmp_6 * 6); # should be 16
	Y = tmp2:1;

with Y being 16 as expected.

So likely an issue with storing back to itself? Thanks again.

@nsadeveloper789
Copy link
Contributor

nsadeveloper789 commented Apr 17, 2024

I've just tried this (as best I can given the bits I see reported here) in the Java emulator from the UI, and it seems to be behaving properly. Since I don't have your slaspec at hand, I just imported 1K of 0s, then injected the following at address 0000 using a breakpoint in the Emulator:

local tmp:4 = zext((A & 0xF) + (153 & 0xf) + C);
local add_tmp_6 = (tmp > 0x09);

tmp = tmp + zext(add_tmp_6 * 0x6);
X = zext(add_tmp_6 * 0x6);
Y  = tmp:1;

It's the same as your original post, but with op1 hardcoded to 153, and $(C_FLAG) hardcoded to C. I then used the Registers panel to initialize A to 0xb0 (176) and C to 1. I then enabled the DebuggerPcodeStepperPlugin and single-stepped through it. I'll paste the resulting p-code here in case that eases diagnostics on your end. It's also in the screencap:

image

0		$U6000:1 = INT_AND A, 15:1
1		$U6080:1 = INT_AND 0x99:1, 15:1
2		$U6100:1 = INT_ADD $U6000:1, $U6080:1
3		$U6180:1 = INT_ADD $U6100:1, C
4		$U6280:4 = INT_ZEXT $U6180:1
5		$U6380:1 = INT_LESS 9:4, $U6280:4
6		$U6400:1 = INT_MULT $U6380:1, 6:1
7		$U6480:4 = INT_ZEXT $U6400:1
8		$U6280:4 = INT_ADD $U6280:4, $U6480:4
9		$U6580:1 = INT_MULT $U6380:1, 6:1
10		X = INT_ZEXT $U6580:1
11		Y = SUBPIECE $U6280:4, 0:4
12		(fall-through)

This was in the master branch with a couple patches to fix UI integration for emulated breakpoints. So, I see three possibilities:

  1. The issue happens to be fixed in our development branch (wishful thinking, I suspect.)
  2. The sleigh compiler is emitting different code than I got here. This could be a difference in version, or a discrepancy between the Java and C++ versions of the Sleigh compiler. Which are you using?
  3. The C++ emulator is interpreting the p-code incorrectly. Given context from your original post, I assume this is what you using? Is this also the one you used to verify your values?

FWIW, p-code op line 8 is where the "storing back to itself" happens.

If you are able to try this same experiment, but with the actual instruction and your slaspec (rather than a breakpoint injection), I'd be interested to see the results.

I can try in 11.0.2 later to rule out version differences, but I'm a bit pre-occupied at the moment.

@pjsoberoi
Copy link
Contributor Author

6502.zip

I am attaching a 6502 slaspec and sla that repro the issue. It is in the macro for adc on line 98.

  1. The issue happens to be fixed in our development branch (wishful thinking, I suspect.)

This is Ghidra 11.0.2, I will test in Ghidra 11.0.3.

  1. The sleigh compiler is emitting different code than I got here. This could be a difference in version, or a discrepancy between the Java and C++ versions of the Sleigh compiler. Which are you using?

I am using the C++ version of the Sleigh compiler on Ghidra 11.0.2.

  1. The C++ emulator is interpreting the p-code incorrectly. Given context from your original post, I assume this is what you using? Is this also the one you used to verify your values?

I am using the C++ emulator.

@pjsoberoi
Copy link
Contributor Author

11.0.3 has the same behavior as 11.0.2. The compiled .sla files are identical (built with /support/sleigh from both 11.0.2 and 11.03). I also rebuilt libsla.a and recompiled my verifier program.

@pjsoberoi
Copy link
Contributor Author

Here's the Pcode for ADC in 11.0.3:

f63e 61 e4 ADC (0xe4,X)
$U8180:1 = INT_ADD 0xe4:1, X
$U8280:1 = INT_ADD $U8180:1, 1:1
$U8380:2 = INT_ZEXT $U8180:1
$U8480:2 = INT_ZEXT $U8280:1
$U8500:1 = LOAD RAM($U8480:2)
$U8580:2 = INT_ZEXT $U8500:1
$U8600:2 = INT_LEFT $U8580:2, 8:4
$U8680:1 = LOAD RAM($U8380:2)
$U8700:2 = INT_ZEXT $U8680:1
$U8800:2 = INT_OR $U8600:2, $U8700:2
$U3280:1 = INT_RIGHT P, 3:4
$U3300:1 = INT_AND $U3280:1, 1:1
$U3380:1 = INT_EQUAL $U3300:1, 0:1
CBRANCH <0>, $U3380:1
$U3400:1 = INT_AND A, 15:1
$U8880:1 = LOAD RAM($U8800:2)
$U3480:1 = INT_AND $U8880:1, 15:1
$U3500:1 = INT_ADD $U3400:1, $U3480:1
$U3580:1 = INT_AND P, 1:1
$U3600:1 = INT_ADD $U3500:1, $U3580:1
$U3700:4 = INT_ZEXT $U3600:1
$U3800:1 = INT_LESS 9:4, $U3700:4
$U3880:1 = INT_MULT $U3800:1, 6:1
$U3900:4 = INT_ZEXT $U3880:1
$U3700:4 = INT_ADD $U3700:4, $U3900:4
X = INT_MULT $U3800:1, 6:1
Y = SUBPIECE $U3700:4, 0:4
BRANCH <1>
<0>
$U3b80:2 = INT_ZEXT A
test.zip

I'm also attaching a test.bin I made.

@pjsoberoi
Copy link
Contributor Author

I'm fumbling my way through the Emulator. But it looks like Y is being set correctly to 0x10:

image

So the issue is in the C++ emulator?

@nsadeveloper789
Copy link
Contributor

OK. In that case, I must bring in @caheckman.

@caheckman
Copy link
Contributor

I'm unable to reproduce the issue in the C++ emulator. If I emulate the following instructions using the provided 6502.slaspec

        f634 a2 00           LDX          #0x0
        f636 a9 99           LDA          #0x99
        f638 81 e4           STA          (0xe4,X)
        f63a a9 b0           LDA          #0xb0
        f63c f8              SED
        f63d 38              SEC
        f63e 61 e4           ADC          (0xe4,X)
        f640 60              RTS

It executes the following p-code for the ADC macro:

(unique,0x3400,1) = INT_AND (register,0x0,1) (const,0xf,1)             # 0 = 176 & 15
(unique,0x8880,1) = LOAD (const,0x5555556831f0,8) (unique,0x8800,2)    # 153 = load
(unique,0x3480,1) = INT_AND (unique,0x8880,1) (const,0xf,1)            # 9 = 153 & 15
(unique,0x3500,1) = INT_ADD (unique,0x3400,1) (unique,0x3480,1)        # 9 = 0 + 9
(unique,0x3580,1) = INT_AND (register,0x3,1) (const,0x1,1)             # 1 = 137 & 1
(unique,0x3600,1) = INT_ADD (unique,0x3500,1) (unique,0x3580,1)        # 10 = 9 + 1
(unique,0x3700,4) = INT_ZEXT (unique,0x3600,1)                         # 10 = zext(10)
(unique,0x3800,1) = INT_LESS (const,0x9,4) (unique,0x3700,4)           # 1 = 9 < 10
(unique,0x3880,1) = INT_MULT (unique,0x3800,1) (const,0x6,1)           # 6 = 1 * 6
(unique,0x3900,4) = INT_ZEXT (unique,0x3880,1)                         # 6 = zext(6)
(unique,0x3700,4) = INT_ADD (unique,0x3700,4) (unique,0x3900,4)        # 16 = 10 + 6
(register,0x1,1) = INT_MULT (unique,0x3800,1) (const,0x6,1)            # 6 = 1 * 6
(register,0x2,1) = SUBPIECE (unique,0x3700,4) (const,0x0,4)            # 16 = subpiece(16,0)

all of which seems correct and puts the value of 16 in Y as expected.

@pjsoberoi
Copy link
Contributor Author

@caheckman, thank you for the fast response. How are you testing? I would like to replicate your testing setup. Currently I am seeing the issue using my Verifier tool (https://github.com/oberoisecurity/ghidra-processor-module-verifier).

Does your .slaspec compile to an identical version of my .sla? Here is the exact state I am using:

./verifier -p PC -t 1 --max-failures 1 --register-map map.txt -s ~/ghidra-6502-fixes/Ghidra/Processors/6502/data/languages/6502.sla -j ~/ProcessorTests/6502/v1/61.json --start-test 16
Ghidra Processor Module Verifier (Verifier)
[*] Settings:
	[*] Compiled SLA file: /home/poberoi/ghidra-6502-fixes/Ghidra/Processors/6502/data/languages/6502.sla
	[*] JSON Test file: /home/poberoi/ProcessorTests/6502/v1/61.json
	[*] Program counter register: PC
	[*] Word size: 2
	[*] Register Mapping Count: 6
	[*] Max allowed failures: 1
	[*] Start test: 16
	[*] Register Mapping Count: 6
[*] /home/poberoi/ProcessorTests/6502/v1/61.json: Loaded 10000 test cases
[*] Test Range: 16-10000
!! REGISTER ERROR: X 8 6
!! REGISTER ERROR: Y 24 26
[-] 16) FAIL
Initial State:
	Registers:
		A: 176
		P: 105
		PC: 63038
		S: 217
		X: 8
		Y: 24
	RAM:
		228: 98
		236: 96
		237: 87
		22368: 153
		63038: 97
		63039: 228
		63040: 9

Final (Expected) State:
	Registers:
		A: 176
		P: 105
		PC: 63040
		S: 217
		X: 8
		Y: 24
	RAM:
		228: 98
		236: 96
		237: 87
		22368: 153
		63038: 97
		63039: 228
		63040: 9

Emulator:
	Registers:
		A: 176
		P: 105
		PC: 63040
		S: 217
		X: 6
		Y: 26
	RAM:
		228: 98
		236: 96
		237: 87
		22368: 153
		63038: 97
		63039: 228
		63040: 9

@pjsoberoi
Copy link
Contributor Author

pjsoberoi commented Apr 24, 2024

I've triaged the bug down to setValue not setting the value correctly. I added printfs() to setValue() before and after the set:

/// A convenience method for setting a value directly on a varnode rather than
/// breaking out the components
/// \param vn is a pointer to the varnode to be written
/// \param cval is the value to write into the varnode
inline void MemoryState::setValue(const VarnodeData *vn,uintb cval)

{
  printf("setValue: %p %x %x %x\n", vn->space,vn->offset,vn->size,cval);
  setValue(vn->space,vn->offset,vn->size,cval);

  cval = getValue(vn->space,vn->offset,vn->size);
  printf("-getValue: %p %x %x %x\n", vn->space,vn->offset,vn->size,cval);
}
setValue: 0x7f54200052c0 3700 4 a
-getValue: 0x7f54200052c0 3700 4 a
getValue: 0x7f5420005000 9 4 9
getValue: 0x7f54200052c0 3700 4 a
setValue: 0x7f54200052c0 3800 1 1
-getValue: 0x7f54200052c0 3800 1 1
getValue: 0x7f54200052c0 3800 1 1
getValue: 0x7f5420005000 6 1 6
setValue: 0x7f54200052c0 3880 1 6
-getValue: 0x7f54200052c0 3880 1 6
getValue: 0x7f54200052c0 3880 1 6
setValue: 0x7f54200052c0 3900 4 6
-getValue: 0x7f54200052c0 3900 4 6
getValue: 0x7f54200052c0 3700 4 a
getValue: 0x7f54200052c0 3900 4 6
setValue: 0x7f54200052c0 3700 4 10
-getValue: 0x7f54200052c0 3700 4 1a
getValue: 0x7f54200052c0 3800 1 1
getValue: 0x7f5420005000 6 1 6
setValue: 0x7f5420005530 1 1 6
-getValue: 0x7f5420005530 1 1 6
getValue: 0x7f54200052c0 3700 4 1a

Specifically here:

setValue: 0x7f54200052c0 3700 4 10
-getValue: 0x7f54200052c0 3700 4 1a

setValue is called with 0x10 (16), but then immediately calling getValue returns 0x1a (26).

Update: Too sleepy to debug this tonight, but the issue is in memstate.cc:MemoryBank::setValue(uintb offset,int4 size,uintb val). Specifically in the last else block:

    else {
      val1 &= (~((uintb)0ull)) >> 8*size1;
      val1 |= val << skip;
      printf("insert5: %llx %llx\n", ind, val1);
      insert(ind,val1);

      val2 &= (~((uintb)0ull)) << 8*size2;
      val2 |= val >> 8*size1;
      printf("insert6: %llx %llx\n", ind+wordsize, val2);
      insert(ind+wordsize,val2);
    }
  }

Val1 is set to 0x1a instead of 0x10. Will need to review this code to figure out what's going on.

@pjsoberoi
Copy link
Contributor Author

Swapping the shifts seems to have fixed my issue:

  else {
    if (size2 == 0) {
      val1 &= ~(calc_mask(size1)<<skip);
      val1 |= val << skip;
      insert(ind,val1);
    }
    else {
       val1 &= (~((uintb)0)) << 8*size1;
        val1 |= val << skip;
       insert(ind,val1);
        val2 &= (~((uintb)0)) >> 8*size2;
       val2 |= val >> 8*size1;
       insert(ind+wordsize,val2);
    }

That being said I don't understand this code at all to be making changes to it.

@pjsoberoi
Copy link
Contributor Author

@caheckman , checking in. Anything I can do to help you debug this issue? Thanks in advance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature: Emulation Status: Triage Information is being gathered
Projects
None yet
Development

No branches or pull requests

5 participants