Detect interactive ppt features #786

Open: wants to merge 14 commits into base: master

christian-intra2net
Contributor

Fix behaviour of olevba (i.e. ppt_parser) and oleobj for some real malware samples. Add detection for "new" (actually: very old) type of payload

@christian-intra2net
Contributor Author

Rebased onto branch #769, fixing unittests...

@christian-intra2net
Contributor Author

christian-intra2net commented Oct 10, 2022

...done. Ready for merge at your discretion.

(Note: first two commits are #769 so I could unittest)

@christian-intra2net
Contributor Author

Another note: we could add the sample from #784 to the unittests, since this branch should solve the problems there. However, we have no actual malware in our test set yet (although we could add it as an encrypted zip), and I am not sure whether we want that.

@christian-intra2net
Contributor Author

Found a bug in one of the commits. Fixed it and added unittests to avoid this in the future.
Cherry-picked one commit from PR #771 to help testing ("Add helper to temporarily...")

(new IDE complained about these)

Parse more record types from [MS-PPT] and some more from [MS-ODRAW],
show more info in __str__ for debugging and extending

Old PowerPoint files (.ppt) can contain links to webpages or programs that
are neither ActiveX nor VBA nor other tested types. They are saved in
regular ppt-specific records and allow PowerPoint to start arbitrary
commands upon clicking or hovering over some item. The timeout for "hovering"
is pretty short here, so it is very likely that users trigger this without
realizing it.

Add detection for these items to olevba
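To illustrate what such a record carries, here is a minimal sketch that decodes the action byte of an InteractiveInfoAtom payload. The field layout and action codes are written down as I understand them from [MS-PPT] and should be checked against the spec; `describe_interaction` and the `REPORTED_ACTIONS` set are hypothetical names for this example, not the code added to olevba in this PR.

```python
import struct

# Action codes of InteractiveInfoAtom as I understand [MS-PPT];
# an assumption to verify against the specification.
ACTIONS = {
    0x00: "none",
    0x01: "macro",
    0x02: "run program",
    0x03: "jump",
    0x04: "hyperlink",
    0x05: "OLE",
    0x06: "media",
    0x07: "custom show",
}

# Hypothetical choice of which actions a scanner would want to flag.
REPORTED_ACTIONS = {0x01, 0x02, 0x04}


def describe_interaction(payload: bytes):
    """Return (action name, should_report) for a 16-byte atom payload.

    Assumed layout (little-endian, record header already stripped):
    soundIdRef (4 bytes), exHyperlinkIdRef (4 bytes), action (1 byte), ...
    """
    _sound_id, _hyperlink_id, action = struct.unpack_from("<IIB", payload)
    name = ACTIONS.get(action, "unknown ({})".format(action))
    return name, action in REPORTED_ACTIONS
```

The target of a "run program" or "hyperlink" action is stored elsewhere (referenced through the hyperlink id), so a real detector has to resolve that reference as well.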
Expose existing param in record_base.test which is used to help extend and
debug record-based streams (currently only ppt_record_parser).

(1) Do not parse all sub-records in a container when constructing it

(2) Allow for stray bytes at end of container data
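Both points can be sketched roughly as follows, assuming the usual 8-byte record header (recVer/recInstance, recType, recLen, all little-endian). `Record` and `iter_records` are hypothetical names for this illustration and not the classes in ppt_record_parser.

```python
import struct
from typing import Iterator, NamedTuple

# recVer (low 4 bits) + recInstance (high 12 bits), recType, recLen
HEADER = struct.Struct("<HHI")


class Record(NamedTuple):
    version: int
    instance: int
    rec_type: int
    data: bytes

    @property
    def is_container(self) -> bool:
        # Containers are marked by recVer == 0xF
        return self.version == 0xF

    def sub_records(self) -> Iterator["Record"]:
        # (1) Sub-records are parsed only when somebody asks for them,
        # not when the container itself is constructed.
        return iter_records(self.data)


def iter_records(data: bytes) -> Iterator[Record]:
    """Lazily yield the records found in data."""
    pos = 0
    while len(data) - pos >= HEADER.size:
        ver_inst, rec_type, rec_len = HEADER.unpack_from(data, pos)
        start = pos + HEADER.size
        end = start + rec_len
        if end > len(data):
            break  # truncated record: treat the rest as stray bytes
        yield Record(ver_inst & 0xF, ver_inst >> 4, rec_type, data[start:end])
        pos = end
    # (2) Anything left here is too short for a header: stray bytes, ignored.
```

Because `iter_records` is a generator, constructing a container costs nothing beyond slicing its payload; trailing bytes that cannot form a record simply end the iteration instead of raising.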
Had too harsh a requirement for ppt files: that they only contain root
streams and no sub-streams. Not sure whether this should theoretically be
true, but in any case it does not hold for real-world samples.

This is left over from my initial attempt at parsing ppt documents. Never
worked properly.

This can easily be fooled, as shown by some malware sample. So, do it the
pythonic way: try treating it like a zip file and deal with the exceptions
if it is not.
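The "ask forgiveness" approach could look like the sketch below, using only the standard library. `open_as_zip` is a hypothetical helper name for this example, and the test input that merely starts with `PK` is my own illustration of data that would fool a naive up-front check.

```python
import io
import zipfile


def open_as_zip(data: bytes):
    """Return a ZipFile for data, or None if data is not a usable zip."""
    try:
        return zipfile.ZipFile(io.BytesIO(data))
    except zipfile.BadZipFile:
        # Not a zip container after all; the caller falls back to other parsers.
        return None
```

A caller would then branch on the result (`if archive is None: ...`) rather than on a separate check that can disagree with what `zipfile` actually accepts.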
Do not keep potentially huge blobs in memory; they are only needed for
debugging.

Self-made sample that triggers the default browser with a URL upon clicking a
shape, and that calls calc.exe upon hovering over another shape.

When testing json-output we need to run samples through the "main"
functions of modules, not just their "process_file" functions that would
accept the extracted and decrypted data from the existing helper
function "loop_over_files". They need a filename as input, so add a helper
to create a temp dir and extract & decrypt samples to that temporarily.
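Such a helper might be shaped like the context manager below. This is only a sketch under my own assumptions: `extracted_sample` and its parameters are hypothetical and not the helper added in this PR, and Python's `zipfile` can only decrypt legacy ZipCrypto archives.

```python
import contextlib
import os
import tempfile
import zipfile


@contextlib.contextmanager
def extracted_sample(zip_path, member, password=None):
    """Extract one member of a (possibly encrypted) zip into a temp dir.

    Yields the path of the extracted file, which keeps the member's
    original name. The temp dir is removed when the block exits.
    """
    with tempfile.TemporaryDirectory(prefix="sample_") as tmp_dir:
        with zipfile.ZipFile(zip_path) as archive:
            # password must be bytes; it is ignored for unencrypted members
            extracted = archive.extract(member, path=tmp_dir, pwd=password)
        yield extracted
```

A test can then pass the yielded path to a module's "main" function as if it were a command-line argument, and nothing is left on disk afterwards.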
When unzipping into temp dir, we often need to know the original sample
name.

In another branch I missed a bug that occurred in one of our test samples.
Avoid this by running all tools over all data