Remove the size limit for memory read and write #1799

Open
wants to merge 11 commits into
base: dev

secretnonempty

@secretnonempty secretnonempty commented Mar 7, 2023

This change eliminates the maximum size restriction for uc_mem_read and uc_mem_write, which is required to support applications, such as LLVM CFI, that map or unmap memory blocks with sizes equal to or greater than INT_MAX.

uc_mem_unmap and uc_mem_protect need to split a memory block if they operate on only a portion of it. The splitting procedure can be broken into the following steps:

  1. Read the entire block of memory.
  2. Unmap the whole memory block.
  3. Remap the memory ranges that should not be unmapped.
  4. Write the data back to memory.

If the size of the memory block is equal to or larger than INT_MAX, the first step fails, which in turn makes uc_mem_unmap or uc_mem_protect fail.
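
The remapping in step 3 comes down to range arithmetic. The following is a minimal sketch, not Unicorn's actual implementation: `range_t` and `remaining_ranges` are hypothetical names introduced here for illustration, and the sketch assumes the range to unmap lies entirely inside a single block. It computes which parts of the block survive a partial unmap.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: a half-open address range [begin, end). */
typedef struct {
    uint64_t begin;
    uint64_t end;
} range_t;

/*
 * Given a mapped block and the sub-range to unmap, compute the ranges
 * that would have to be remapped in step 3. Returns how many ranges
 * were written to out (0, 1, or 2).
 */
int remaining_ranges(range_t block, range_t cut, range_t out[2])
{
    int n = 0;

    /* Assumption of this sketch: cut lies entirely inside block. */
    assert(block.begin <= cut.begin && cut.begin <= cut.end &&
           cut.end <= block.end);

    if (cut.begin > block.begin) { /* part below the unmapped range */
        out[n].begin = block.begin;
        out[n].end = cut.begin;
        n++;
    }
    if (cut.end < block.end) { /* part above the unmapped range */
        out[n].begin = cut.end;
        out[n].end = block.end;
        n++;
    }
    return n;
}
```

For a 0x80000000-byte block at 0x1000000 and a one-page cut at 0x1001000, this yields two surviving ranges. Step 1 still has to read the whole block before any of this happens, which is where the size limit gets in the way.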

linker_cfi.cpp is a specific example that uses a memory block with a size of 2 GiB (0x80000000 bytes) on a 64-bit platform.

Removing the restriction would allow us to emulate these applications with less pain.

I chose MAX_RW_LENGTH purely on gut feeling; the macro zeroes out the least significant half of the bits of INT_MAX:

#define MAX_RW_LENGTH ((INT_MAX >> (8*sizeof(int)/2)) << (8*sizeof(int)/2))

Technically, using INT_MAX directly should be fine.
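
To make the value concrete: assuming a 32-bit int and 8-bit bytes, the macro evaluates to 0x7fff0000. The sketch below wraps it in a hypothetical helper, `clamp_rw_length`, which is not part of the patch and only illustrates how the cap would apply to a single request length.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define MAX_RW_LENGTH ((INT_MAX >> (8 * sizeof(int) / 2)) << (8 * sizeof(int) / 2))

/*
 * Hypothetical helper: limit one read/write length to MAX_RW_LENGTH.
 * With a 32-bit int the cap is 0x7fff0000.
 */
size_t clamp_rw_length(size_t len)
{
    size_t cap = (size_t)MAX_RW_LENGTH;

    assert(cap > 0); /* the cap must allow forward progress */
    return len < cap ? len : cap;
}
```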

@PhilippTakacs

Your change will make the situation worse: it silently truncates len to MAX_RW_LENGTH, so the length is still limited, but you are no longer informed about the failure.

I'm currently looking at the implementation of cpu_physical_memory_rw, because the comment claims that it takes an int as length, but it actually takes a hwaddr.

@PhilippTakacs

I looked at this a bit more and found #219 and #132, which explain the reasons. As far as I can see, this can be changed now, but you missed some relevant points, for example uc->read_mem.

@secretnonempty
Author

secretnonempty commented Mar 10, 2023

> Your change will make the situation worse: it silently truncates len to MAX_RW_LENGTH, so the length is still limited, but you are no longer informed about the failure.

Thank you for reviewing! I am confused by your comment: in my unit tests, this patch works as expected with memory blocks that are bigger than INT_MAX. Note that len is a temporary variable, so truncating it is completely okay: the length of the remaining data to read is calculated as size - count.

Here's the key snippet of code from uc_mem_read (with the patch applied):

    // memory area can overlap adjacent memory blocks
    while (count < size) {
        MemoryRegion *mr = memory_mapping(uc, address);
        if (mr) {
            len = (size_t)MIN(size - count, mr->end - address);
            len = (size_t)MIN(len, MAX_RW_LENGTH);
            if (uc->read_mem(&uc->address_space_memory, address, bytes, len) ==
                false) {
                break;
            }
            count += len;
            address += len;
            bytes += len;
        } else { // this address is not mapped in yet
            break;
        }
    }

We can see that if the memory block is bigger than INT_MAX, the loop reads it in multiple passes until an error occurs or count == size.
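
The same idea can be shown in isolation. This is a simplified sketch rather than the Unicorn code: `chunked_copy` and its `max_chunk` parameter are hypothetical stand-ins for the loop and for MAX_RW_LENGTH, and plain memcpy replaces uc->read_mem. It shows that clamping len per iteration does not truncate the overall transfer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the loop in uc_mem_read: copy size bytes in
 * pieces of at most max_chunk bytes. Returns the number of bytes copied.
 */
size_t chunked_copy(unsigned char *dst, const unsigned char *src,
                    size_t size, size_t max_chunk)
{
    size_t count = 0;

    assert(max_chunk > 0); /* a zero cap would never make progress */

    while (count < size) {
        size_t len = size - count; /* bytes still outstanding */

        if (len > max_chunk) {
            len = max_chunk; /* clamp this iteration only */
        }
        memcpy(dst + count, src + count, len);
        count += len; /* the next pass resumes where this one stopped */
    }
    return count;
}
```

Copying 10 bytes with a cap of 4 takes three passes (4, 4, and 2 bytes) and still returns 10.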

We can use the following code to test the patch:

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <string.h>
#include <unicorn/unicorn.h>

int main(int argc, char *argv[]) {
  uc_engine * uc;
  uint64_t mem_addr = 0x1000000;
  size_t mem_size = 0x9f000000;
  size_t i;
  char buf[0x100];
  int len;
  char *pmem = NULL;

  uc_err err = uc_open(UC_ARCH_ARM64, UC_MODE_ARM, &uc);
  assert(err == UC_ERR_OK);

  err = uc_mem_map(uc, mem_addr, mem_size, UC_PROT_ALL);
  assert(err == UC_ERR_OK);

  pmem = calloc(1, mem_size);
  assert(pmem != NULL);
  for (i = 0; i < mem_size; i += 0x1000) {
    len = snprintf(pmem + i, 0x1000, "addr_%#zx", i);
    assert(len > 0);
  }
  // Test writing a large block of memory
  err = uc_mem_write(uc, mem_addr, pmem, mem_size);
  assert(err == UC_ERR_OK);

  memset(pmem, 'a', mem_size);

  // Test reading a large block of memory
  err = uc_mem_read(uc, mem_addr, pmem, mem_size);
  assert(err == UC_ERR_OK);

  for (i = 0; i < mem_size; i += 0x1000) {
    len = snprintf(buf, sizeof(buf), "addr_%#zx", i);
    if (memcmp(pmem + i, buf, len) != 0) {
      fprintf(stderr, "data does not match, expecting string \"%s\"\n", buf);
      abort();
    }
    printf("%s matches\n", buf);
  }

  free(pmem);
  err = uc_mem_unmap(uc, mem_addr, mem_size);
  assert(err == UC_ERR_OK);
  uc_close(uc);

  puts("Okay, test passed!");
  return 0;
}

Please be aware that this program is intended to be compiled on a 64-bit platform. It first writes 0x9f000000 bytes to the emulated memory and then reads them back to verify data integrity. I picked the size at random; you can use any other size that is bigger than INT_MAX. You can compile it with the following command:

gcc -o main main.c -lunicorn

This verifies that the patch works as expected.

The patch has also been tested with unicorn-1.0.3.

@secretnonempty
Author

secretnonempty commented Mar 10, 2023

> I looked at this a bit more and found #219 and #132, which explain the reasons. As far as I can see, this can be changed now, but you missed some relevant points, for example uc->read_mem.

I am not sure what I have missed after reading the discussion; could you explain a bit more? In #219 and #132, I can clearly see why the limit was introduced. I guess you were referring to @fabsx00's comment #132 (comment), am I right?

@secretnonempty
Author

secretnonempty commented Mar 10, 2023

> I'm currently looking at the implementation of cpu_physical_memory_rw, because the comment claims that it takes an int as length, but it actually takes a hwaddr.

Hmm, you are right: upstream has used a different type for the len parameter of cpu_physical_memory_rw since 2019; the commit is "unify len and addr type for memory/address APIs". It looks like we can now simply remove the limit directly.

In hwaddr.h, we can see that

typedef uint64_t hwaddr;

So it appears we can refactor the uc_mem_* interfaces to align with this change.
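
As a small illustration of why the type matters, the sketch below checks which lengths fit in a 64-bit hwaddr but would not fit in an int. `exceeds_int_len` is a hypothetical helper written for this comment, not a Unicorn or QEMU function.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t hwaddr; /* same typedef as in hwaddr.h */

static_assert(sizeof(hwaddr) == 8, "hwaddr is expected to be 64 bits wide");

/* Hypothetical check: is this length too large for an int-based length? */
bool exceeds_int_len(hwaddr len)
{
    return len > (hwaddr)INT_MAX;
}
```

With a 32-bit int, both 0x80000000 (the linker_cfi.cpp case) and 0x9f000000 (the test program) exceed INT_MAX, yet both are representable as hwaddr.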

@secretnonempty
Author

secretnonempty commented Mar 10, 2023

I have updated the patch to take advantage of the QEMU change and would very much appreciate it if you could review it at your convenience. The latest patch works fine with the test program above.

Be aware that the original patch works with Unicorn 1.x, while the latest one does not.

@wtdcode
Member

wtdcode commented Mar 12, 2023

> I have updated the patch to take advantage of the QEMU change and would very much appreciate it if you could review it at your convenience. The latest patch works fine with the test program above.
>
> Be aware that the original patch works with Unicorn 1.x, while the latest one does not.

Your changes are correct overall, but this will involve more complex things, as we are implicitly changing both the API and the ABI. Some things you left out:

  1. Update the declarations in unicorn.h; I believe your compiler should have told you how to do it.
  2. Update the bindings. I could do this for you if you are not familiar with them, but I might be a bit slow.
  3. Add a few tests for your changes in test_mem. Note that you will also have to make sure no memory contents are truncated.
  4. Target the dev branch.

Also note that uc1 will only accept really severe security patches, and these breaking changes won't go to uc1, so don't worry about uc1.

Eliminate the maximum size restriction for uc_mem_read and uc_mem_write. This
change is required to support applications, such as LLVM CFI, that map or unmap
memory blocks with sizes equal to or greater than INT_MAX.
@secretnonempty
Author

secretnonempty commented Mar 14, 2023

> > I have updated the patch to take advantage of the QEMU change and would very much appreciate it if you could review it at your convenience. The latest patch works fine with the test program above.
> >
> > Be aware that the original patch works with Unicorn 1.x, while the latest one does not.
>
> Your changes are correct overall, but this will involve more complex things, as we are implicitly changing both the API and the ABI. Some things you left out:
>
>   1. Update the declarations in unicorn.h; I believe your compiler should have told you how to do it.
>   2. Update the bindings. I could do this for you if you are not familiar with them, but I might be a bit slow.
>   3. Add a few tests for your changes in test_mem. Note that you will also have to make sure no memory contents are truncated.
>   4. Target the dev branch.
>
> Also note that uc1 will only accept really severe security patches, and these breaking changes won't go to uc1, so don't worry about uc1.

Thank you. I think the commit is now ready for review. I have updated the language bindings.

@wtdcode wtdcode added enhancement and removed stale labels Jun 16, 2023