
Some additions #13

Open
SeanHeelan opened this issue Jun 8, 2017 · 8 comments

@SeanHeelan

SeanHeelan commented Jun 8, 2017

This is awesome! Some minor additions (as a disclaimer, I worked on all of these projects, but I feel they're worth mentioning nonetheless: the first is, to the best of my knowledge, the first AEG system of its kind, while the latter two are the first reverse engineering/symex hybrid tools of their kind). I would make the changes myself but I'm fairly certain I would break everything ;)

Sep 2009: Thesis publication: "Automatic Generation of Control Flow Hijacking Exploits for Software Vulnerabilities" - Utilised symbolic execution to generate exploits. The implementation was titled AXGEN.
(Authors: S. Heelan, D. Kroening @ University of Oxford)

Dec 6th 2010: Immunity Debugger v1.8 released, embedding a symbolic execution engine into Immunity Debugger for use during reverse engineering (http://seclists.org/dailydave/2010/q4/23) (Authors: P. Sole, S. Heelan @ Immunity Inc.)

Dec 2nd 2013: ILLITHID (also by Immunity, DARPA funded), a standalone tool focused on finding software vulnerabilities combining compositional symbolic execution, function annotations and manual reverse engineering https://lists.immunityinc.com/pipermail/dailydave/2013-December/000539.html (Authors: S. Heelan, P. Sole, R. Huizer @ Immunity Inc)

@SeanHeelan
Author

Updated with author info

@enzet
Owner

enzet commented Jun 9, 2017

Thank you! It is great work! I don't know whether I should create an additional color group for exploit generation tools to separate them from dynamic symbolic execution tools.

As for Immunity tools, I have to do a little research. Is there any more information about them?

@enzet enzet added the add label Jun 9, 2017
@enzet enzet self-assigned this Jun 9, 2017
@SeanHeelan
Author

Unfortunately, at Immunity we weren't big on academic publications =/ Immunity Debugger, with the symbolic execution engine included, is publicly available and my slides which introduced it at Ruxcon 2010 can be found at [1] (Everything up to slide 82 is introductory material, so the ImmDbg + SymEx stuff starts there). In short, it was/is a symbolic execution engine embedded in Immunity Debugger which allowed one to solve a variety of different problems such as ROP chain construction, bug finding, taint tracking, figuring out inputs to reach a particular point and so on. Quite similar to angr/manticore in concept, but with the advantage of being integrated into a visual debugger and the disadvantage of having a lower level API than what they now have, and missing a few things like system call models.

ILLITHID was funded as part of DARPA's Cyber Fast Track program, intended to combine expert analysis with symbolic execution for the purposes of bug finding. At its core was a compositional symbolic execution engine (i.e. it analysed functions separately, summarised the results and then reused these results at callsites to previously analysed functions). It also allowed a user to provide logical annotations manually via a GUI. To the best of my knowledge, it was never released publicly (I no longer work at Immunity). A video demonstration can be found here [2], which shows it finding bugs and that sort of thing, but unfortunately I don't think any public presentations were made on its internals.

[1] http://www.immunitysec.com/downloads/sean_ruxcon2010.pdf
[2] https://vimeo.com/80811284

enzet added a commit that referenced this issue Jun 10, 2017
enzet added a commit that referenced this issue Jun 10, 2017
@enzet enzet added the research label Jun 16, 2017
@enzet
Owner

enzet commented Jun 29, 2017

Sorry for the delay. I had to do some research. I want to clarify some key points I missed from the presentations. I consider two techniques:

  • static symbolic execution (that is, converting the binary code of some program part into an SMT formula),
  • dynamic symbolic execution, or concolic execution (building the SMT formula for a given execution path).
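A toy illustration of the distinction between the two techniques (constraints are plain strings here, not real SMT terms; the example program and the `assert y != 7` query are hypothetical):

```python
# Toy program: if x > 5: y = x + 1 else: y = 0; check whether y == 7
# is reachable.

# Path-wise symbolic execution: one constraint set per explored path.
paths = [
    {"path_condition": "x > 5",    "y": "x + 1"},
    {"path_condition": "!(x > 5)", "y": "0"},
]
per_path_queries = [
    f"({p['path_condition']}) and ({p['y']} == 7)" for p in paths
]

# Whole-program ("static"/model-checking style): a single merged formula
# covering both paths via a disjunction.
merged_query = "((x > 5) and (x + 1 == 7)) or ((!(x > 5)) and (0 == 7))"

print(per_path_queries)
print(merged_query)
```

In the path-wise style each query is handed to the solver separately; in the merged style one formula encodes every path at once.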

Does Immunity Debugger implement only the first technique or both of them?

The same question goes for ILLITHID. Function analysis looks like static symbolic execution, but does it use dynamic analysis techniques as well?

Thank you!

@SeanHeelan
Author

They are both static, but they are different from each other.

I'll start with Immunity Debugger, as it is more straightforward. It performs path-wise static symbolic execution, in the sense that no information from concrete inputs is used, but exploration proceeds path by path, i.e. similar in concept to angr or KLEE rather than something like SAGE.

ILLITHID is unusual in that, in terms of how it creates symbolic models, it is closer to a model checking tool like CBMC than to a path-wise symbolic execution tool like KLEE. It creates symbolic models that represent multiple paths, rather than just a single path. It is almost entirely static, although it can leverage concrete information to resolve things like pointer aliasing. As I mentioned, it performs compositional analysis at the function level by converting each function to a symbolic representation and computing a safety property for the function. The safety property can then be checked at all callsites, and this process is performed recursively from the leaves of the call graph to its root.
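The compositional scheme described here can be sketched very roughly as follows (a hypothetical illustration, not ILLITHID's actual implementation; the `div` summary and the string-substitution mechanics are made up for the example):

```python
# Compositional, function-level analysis: summarise each function once,
# then instantiate the summary at every callsite, working from the
# leaves of the call graph upward.

# Per-function safety summaries (preconditions), computed once.
summaries = {
    "div": "denominator != 0",   # div(n, d) is safe iff d != 0
}

def check_callsite(callee, binding):
    """Instantiate the callee's summary with the caller's arguments."""
    obligation = summaries[callee]
    for formal, actual in binding.items():
        obligation = obligation.replace(formal, actual)
    return obligation

# A caller invoking div(10, x - 1) inherits this safety obligation,
# which is then checked in the caller's own context.
obligation = check_callsite("div", {"denominator": "x - 1"})
print(obligation)  # -> x - 1 != 0
```

The payoff is that each function body is analysed once, no matter how many callsites it has.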

In short, ILLITHID is primarily static but can use dynamic information to resolve some problems. It's definitely not a concolic execution tool however.

@enzet
Owner

enzet commented Jun 29, 2017

Thanks a lot. In terms of KLEE and SAGE it is really clear to me now. I'm looking for a more precise tool classification, so your explanation is very helpful.

@SeanHeelan
Author

SeanHeelan commented Jun 30, 2017

With some of these tools a precise classification becomes difficult I think. There are at least a few engines that have hybrid algorithms between traditional symbolic execution and input-guided symbolic execution. e.g. compare ZESTI to KLEE. Same core engine, but the former follows paths taken on concrete inputs and then expands those paths symbolically, while the latter does 'normal' symbolic execution. Is ZESTI concolic or symbolic? Internally, all it does is fit a new path prioritisation algorithm to enable following of the 'concrete' path first. The lines are blurred for a lot of real projects I think.

The terms 'dynamic symbolic execution', 'static symbolic execution', 'symbolic execution' and 'concolic execution' are also unfortunately used by different people to mean different things. Again, classifying ZESTI and KLEE under these headers is complicated, but now consider QCE [1]. QCE is also built on top of KLEE and attempts to intelligently determine when to merge paths. Is this static or dynamic? What if it determines that no merges should take place (in which case it will be semantically equivalent to regular KLEE)?

Anyway, just some thoughts. Personally, I'm more in favour of using a set of descriptive attributes to categorise program analysis systems (e.g. Is loop unrolling supported? Does it ever merge paths? Can it follow the path resulting from a concrete input? etc) as I don't think static SE vs dynamic SE or concolic vs symbolic provides sufficient clarity.

[1] https://www.cs.rhul.ac.uk/home/kinder/papers/pldi12.pdf
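The attribute-based classification suggested above might look something like this (a hypothetical sketch; the attribute values follow the characterisations given in this thread, not any authoritative survey):

```python
# Classify tools by descriptive attributes rather than a single
# static-vs-dynamic label. Values reflect the discussion above:
# ZESTI follows concrete-input paths, QCE may merge paths, KLEE
# does neither by default.
tools = {
    "KLEE":  {"merges_paths": False, "follows_concrete_input": False},
    "ZESTI": {"merges_paths": False, "follows_concrete_input": True},
    "QCE":   {"merges_paths": True,  "follows_concrete_input": False},
}

# Queries then pick out tools by behaviour, not by coarse category.
concrete_guided = [name for name, attrs in tools.items()
                   if attrs["follows_concrete_input"]]
print(concrete_guided)  # -> ['ZESTI']
```

A shared vocabulary of such attributes would let two tools be "the same" along one axis and differ along another, which is exactly where labels like "concolic" break down.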

@enzet
Owner

enzet commented Jul 5, 2017

I understand your point. The initial goal of this project was not only to draw a timeline but to understand the whole picture of the symbolic execution world. But there were so many tools and techniques that I hardly knew where to start.

I know that even a fundamental term like “concolic” is problematic. In some papers it is used to mean “constructing the path constraint alongside normal execution”. But sometimes it also includes the path alternation technique and is used to denote a particular path exploration approach. Even worse, many papers present hybrid approaches and combinations of different techniques.

I completely agree with you. The only way to have a strict classification is to use some kind of descriptive attributes. I already use tags to mark input languages and I’m going to use similar tags to describe the main purpose of tools: defect detection, exploit generation, malware detection, etc.

But, again, to see the whole picture, we still need an approximate classification to separate totally different approaches and to highlight the main streams and ideas. So, I want to define a couple of general groups to keep “similar” tools together. It doesn’t mean I believe we can define a strict classification. But I believe we can determine some basic ideas and try to classify other ideas as combinations or modifications of them.

Thanks for mentioning ZESTI and QCE. I have to add them.

@enzet enzet added the SE label Dec 27, 2017