Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Syscalls linux32 #1170

Open
wants to merge 6 commits into
base: master
Choose a base branch
from
Open

Syscalls linux32 #1170

wants to merge 6 commits into from

Conversation

Te-k
Copy link
Contributor

@Te-k Te-k commented Mar 30, 2020

Hi,

I have implemented a few more syscalls for Linux 32 (and some generic for Linux 64b too).
I am able to emulate several Linux 32b shellcodes with it, like :

[DEBUG   ]: socket(AF_INET, SOCK_STREAM, 0)
[DEBUG   ]: -> 3
WARNING: address 0x1240000 is not mapped in virtual memory:
[DEBUG   ]: socket_connect(fd, [AF_INET, 1234, 10.2.2.14], 102)
[DEBUG   ]: -> 0
Done

What do you think ?

Comment on lines +186 to +188
def read(self, count):
return b""

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's not a good idea to have a default read here. Maybe we can raise an error with pure abstract function, in order to for the user subclass this in order to implements it's own read.
See for example

raise NotImplementedError("Abstract method")

The user can subclass its own LinuxEnvironement and set a brand new self.network

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea, but it seems hard to subclass, it means you have to implement a subclass of FileDescriptorSocket, Network and LinuxEnvironment, and make all this work together, right ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kind of. It will give something like:

class CustomFileDescriptorSocket(FileDescriptorSocket):
    def read(self, count):
        print("Turlututu")

class CustomNetworking(Networking):
    def socket(self, family, type_, protocol):
        fd = self.linux_env.next_fd()
        fdesc = CustomFileDescriptorSocket(fd, family, type_, protocol)
        self.linux_env.file_descriptors[fd] = fdesc
        return fd


class CustomLinuxEnvironment(LinuxEnvironment):
    def __init__(self):
        super(CustomLinuxEnvironment, self).__init__()
        self.network = CustomNetworking(self)

But maybe there is better: we could modify those classes to have a class variable which embed their needs. For example, for Networking:

class Networking(object):
    """Network abstraction"""

    fd_generator = FileDescriptorSocket
    def __init__(self, linux_env):
        self.linux_env = linux_env

    def socket(self, family, type_, protocol):
        fd = self.linux_env.next_fd()
        fdesc = self.fd_generator(fd, family, type_, protocol)
        self.linux_env.file_descriptors[fd] = fdesc
        return fd

So the "overhead" may just be:

class CustomNetworking(Networking):
    fd_generator = CustomFileDescriptorSocket

But I am not really sure if this is a suitable python pattern.
Or maybe Networking should take it's generator as init argument ?
It's a problem we already face in the SandBox object, which depends on os, arch, ...

@commial @p-l- I am interested if you have some feed on this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is the same problem than the multiple inheritance of Sandbox. IMHO, it looks like more what we've done in Jitcore with Cgen or SymbExecClass, which reflects your last proposal.

In my opinion, the question is "what we want to provides, and what customization should be reasonably easy to implements?".
I agree with the fact that it should be easy to modify what the socket returns, its state, etc. i'm not sure that the more global Networking part needs that kind of customization possibility.

A pattern we can use would be to provides a kind of "socket factory" (sorry for this word, but it is what it is) that the Network would use to creates its sockets.
It could be a function, taking as input the socket parameters and returning an instance with the socket "interface", ie. a subclass of the socket fd.
It could also be a class, taking as __init__ these parameters, and asked just after for successful creation or not (to keep the possibility to easiliy deny socket creation). I rather prefer the function solution, as it could be easier to return default implementation or several socket families implementation.

This "factory function" is then an attribute of the Networking class, and could be replaced with a dedicated function / property.

If this pattern become more frequent for the Linux kernel stub implementation, we could have a "config-like" class containing several factories functions, or hooks.

What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the socket factory would be an attribute of Network ?

Using it would be something like :

class CustomFileDescriptorSocket(FileDescriptorSocket):
    def read(self, count):
        print("Chapeau pointu")

env = environment.LinuxEnvironment_x86_32()
env.network.socket_class = CustomFileDescriptorSocket

Is that correct ?

while envp_addr != 0:
argv.append(jitter.get_c_str(envp_addr))
i += 4
argv_addr = jitter.vm.get_u32(jitter.cpu.EDX+i)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe it's envp_addr here instead of argv_addr?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep

envp = []
i = 0
while envp_addr != 0:
argv.append(jitter.get_c_str(envp_addr))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's envp instead of argv here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also yes

envp = []
i = 0
while envp_addr != 0:
argv.append(jitter.get_c_str(envp_addr))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same remarks here for argv

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Damn copy paste

while envp_addr != 0:
argv.append(jitter.get_c_str(envp_addr))
i += 8
argv_addr = jitter.vm.get_u64(jitter.cpu.EDX+i)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same remarks here for argv_addr

raise NotImplemented()


def sys_generic_chmod(jitter, linux_env):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could really apply the chmod on the file located in the file sandbox file_sb ?

status, = jitter.syscall_args_systemv(1)
log.debug("sys_exit(%i)", status)
jitter.run = False
jitter.pc = 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you don't need to set pc to 0 here

def sys_generic_setreuid(jitter, linux_env):
# Parse arguments
ruid, euid = jitter.syscall_args_systemv(2)
log.debug("sys_setreuid(%x, %x)", ruid, euid)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could use the current Linux env uid/euid/gid of the linux env here?

@@ -169,14 +176,22 @@ def sys_x86_32_socket(jitter, linux_env):
# socklen_t addrlen);
fd = jitter.vm.get_u32(jitter.cpu.ESP)
socklen = jitter.vm.get_u32(jitter.cpu.ESP+8)
# Not the exact size because shellcodes won't provide the full struct
sockaddr = jitter.vm.get_mem(jitter.vm.get_u32(jitter.cpu.ESP+4), 8)
try:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about using the socklen instead of a fixed length?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In some cases, a shellcode would not have a full sockaddr struct but just the needed fields, what I have done in the next commit is getting the socklen and if it fails, I only get the first 8 bytes. Is that an ok trick to have it work ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, maybe we should behave like the kernel does so we will be close to a real environment.
If the kernel is ok with semi structures, maybe your patch is ok.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added a first get_mem for the full structure and fallback to 8 bytes if the memory if not that large, it should be close to what the kernel is doing I guess

miasm/os_dep/linux/syscall.py Show resolved Hide resolved
@Te-k
Copy link
Contributor Author

Te-k commented Mar 31, 2020

All good points, thanks. I will fix these later this week.

@serpilliere
Copy link
Contributor

Thank you for you PR @Te-k !
If you really want to be sure we won't break anything in the future, maybe we could add a regression test of one of your shellcode (if you can share them, obvisouly), but put it in the https://github.com/cea-sec/miasm-extended-tests repository. Those tests are currently executed by the Miasm travis file.
The reason is simple:
Some times ago, we put a shellcode directly in the main repository, and the travis environment has flagged Miasm as malware and refused to run regression tests.
Maybe we should definitively not commit any shellcode/malware in the main repo, as it may be flagged as malware by PIP or distributions.

Another reason is to not add too many weight to the main repo.

@Te-k
Copy link
Contributor Author

Te-k commented Apr 7, 2020

I have made some fix based on your suggestions, two are still unresolved :

  • Whether or not to implement read on sockets
  • Should it create a socket on connect ? (not sure why it would)

Just one warning : I have added a change on uid and euid in sys_generic_setreuid and it does not check for privileges to do that, should I implement privileges here ?

Let me know what you think

@Te-k
Copy link
Contributor Author

Te-k commented Apr 7, 2020

And I have added a script in the examples to emulate Linux shellcodes, which is needed to add test cases to miasm-extended-tests

@Te-k
Copy link
Contributor Author

Te-k commented Apr 7, 2020

And here is the PR for the test cea-sec/miasm-extended-tests#1 along with the update of travis config file (I have not tested it but it should be simple enough to work)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants