Parsing embedded JSON #64

Open
geekscrapy opened this issue Jan 27, 2019 · 13 comments
Labels
enhancement New feature or request

Comments

@geekscrapy

Hi,

I have the following JSON format:

{"dst_host": "159.65.224.130", "dst_port": 23, "honeycred": false, "local_time": "2019-01-23 11:57:11.834296", "logdata": {"PASSWORD": "1111", "USERNAME": "root"}, "logtype": 6001, "node_id": "opencanary-1", "src_host": "41.139.253.2", "src_port": 36653}

How would I get agrind to also parse the logdata field?

TIA!

@rcoh
Owner

rcoh commented Jan 27, 2019

Nested JSON isn't well supported right now, but try agrind '* | json | json from logdata':

echo '{"dst_host": "159.65.224.130", "dst_port": 23, "honeycred": false, "local_time": "2019-01-23 11:57:11.834296", "logdata": {"PASSWORD": "1111", "USERNAME": "root"}, "logtype": 6001, "node_id": "opencanary-1", "src_host": "41.139.253.2", "src_port": 36653}' | agrind '* | json | json from logdata | fields PASSWORD, USERNAME'
[PASSWORD=1111]            [USERNAME=root]

You can use the fields operator to drop the other fields if you want.

rcoh added a commit that referenced this issue Jan 27, 2019
@geekscrapy
Author

geekscrapy commented Jan 27, 2019

Awesome, thanks!

I appreciate it's not trivial, but my main use case would be reading EVTX (Windows event logs) converted to JSON, where there may or may not be multiple levels of nested JSON. It would be killer if that were supported.

@rcoh
Owner

rcoh commented Jan 27, 2019

Yeah, makes sense. I assume you don't know the keys ahead of time? What would the ideal workflow be for you? Something like:

* | json | count by logdata.user?

rcoh closed this as completed in #65 Jan 27, 2019
rcoh reopened this Jan 27, 2019
@rcoh
Owner

rcoh commented Jan 27, 2019

Keeping the issue open to discuss longer term improvements

@geekscrapy
Author

Yeah, the keys depend on the event ID (and there are thousands of event IDs...).

Btw you probably know, but that example JSON I gave earlier is a random log I had floating around, it's not a Windows EVTX. I could provide you one if you wanted (or you can just copy one from a Windows machine).

The original format of EVTX is binary XML, which is usually extracted using the python-evtx library linked below. That XML is then usually converted to a Python dict, and from there to JSON. So it's a bit of a pain, but if you could shortcut the conversion process, that'd be a massive win. There are very few tools that allow analysis of EVTX on Linux. But maybe there is a good reason for this and I'm missing it 😂

williballenthin/python-evtx#47
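
For readers unfamiliar with that pipeline, here is a rough sketch of the XML to dict to JSON step using only the Python standard library. It is not python-evtx's code and not part of agrind. The sample XML is a hypothetical, simplified event (real records carry an XML namespace and many more fields), and `element_to_dict` and `xml_to_json` are made-up helper names.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical, simplified event XML used only for illustration.
SAMPLE_XML = """
<Event>
  <System>
    <EventID>4624</EventID>
    <Computer>host-1</Computer>
  </System>
  <EventData>
    <Data Name="TargetUserName">alice</Data>
    <Data Name="LogonType">3</Data>
  </EventData>
</Event>
"""


def element_to_dict(elem):
    """Convert an element to nested dicts; attributes get an '@' prefix, text goes under '#text'."""
    text = (elem.text or "").strip()
    children = list(elem)
    # A leaf element without attributes collapses to its text.
    if not children and not elem.attrib:
        return text
    node = {"@" + name: value for name, value in elem.attrib.items()}
    if text:
        node["#text"] = text
    for child in children:
        value = element_to_dict(child)
        if child.tag in node:
            # Repeated tags (such as several <Data> elements) become a list.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    return node


def xml_to_json(xml_text):
    """Parse one XML document and return it as a single-line JSON string."""
    root = ET.fromstring(xml_text.strip())
    return json.dumps({root.tag: element_to_dict(root)}, sort_keys=True)


if __name__ == "__main__":
    print(xml_to_json(SAMPLE_XML))
```

The result is the kind of multi-level JSON described above: the keys under EventData vary per event, which is why they can't be listed ahead of time.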

@rcoh
Owner

rcoh commented Jan 28, 2019 via email

@geekscrapy
Author

No that's fine, I was more hoping for either nested XML or JSON. We only really need one of those and it'll be covered.

Adding an option to redirect seems a little more complicated than it needs to be I think. cat works fine 😀

@rcoh
Owner

rcoh commented Jan 28, 2019 via email

rcoh added the enhancement New feature or request label Jun 3, 2019
@rcoh
Owner

rcoh commented Jun 24, 2019

I think a good solution for this could be splatting JSON objects into one row per key/value pair, like https://github.com/tomnomnom/gron. @geekscrapy I'm curious whether that would produce usable output for EVTX JSON.
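
To make the idea concrete, here is a minimal Python sketch of gron-style flattening, applied to a shortened version of the record from the top of this issue. It is only an illustration: it is not how gron or agrind is implemented, `flatten` is a made-up helper name, and it simplifies gron's behavior (only leaf values are emitted, and keys are not quoted or escaped).

```python
import json

# Shortened version of the record posted at the top of this issue.
RECORD = '{"dst_host": "159.65.224.130", "logdata": {"PASSWORD": "1111", "USERNAME": "root"}}'


def flatten(value, prefix="json"):
    """Yield (path, leaf) pairs, one per scalar value in a nested JSON value.

    Empty objects and arrays produce no rows in this sketch.
    """
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, f"{prefix}.{key}")
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from flatten(child, f"{prefix}[{index}]")
    else:
        yield prefix, value


if __name__ == "__main__":
    # One row per key/value pair, whatever the nesting depth.
    for path, leaf in flatten(json.loads(RECORD)):
        print(f"{path} = {json.dumps(leaf)}")
```

Because every row carries its full path, no keys need to be known ahead of time, which is the property that matters for the EVTX case.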

@geekscrapy
Author

Hey! This may work (disclaimer: I've not played with gron, but it sounds like it should work)

@rcoh
Owner

rcoh commented Jun 30, 2019

With #73, adding a splat operator should be all that's required to get decent support for arbitrary nested structures.

@Arch-vile

I tried running the example you gave earlier (dropped some fields here for clarity):
echo '{"dst_host": "159.65.224.130", "logdata": {"PASSWORD": "1111", "USERNAME": "root"} }' | agrind '* | json | json from logdata | fields PASSWORD, USERNAME'

But end up with:
error: Expected string, found other

agrind --version
ag 0.18.0

@rcoh
Owner

rcoh commented Nov 11, 2021

ah, that was probably written before proper nested field support was added.

You can now refer to logdata.PASSWORD and logdata.USERNAME directly. If you want to restructure things, you can do it like this:
* | json | logdata.PASSWORD as password | logdata.USERNAME as username | fields username, password:

echo '{"dst_host": "159.65.224.130", "logdata": {"PASSWORD": "1111", "USERNAME": "root"} }' | agrind '* | json | logdata.PASSWORD as password | logdata.USERNAME as username | fields username, password'
[password=1111]            [username=root]
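
For anyone who wants to see what that query does conceptually, here is a small Python sketch that resolves dotted paths and renames them, using the same input. This is an illustration only, not agrind's implementation; `get_path` and `project` are made-up helper names.

```python
import json

# Same input as the echo command above.
RECORD = '{"dst_host": "159.65.224.130", "logdata": {"PASSWORD": "1111", "USERNAME": "root"} }'


def get_path(record, path):
    """Follow a dotted path such as 'logdata.PASSWORD'; return None if any step is missing."""
    current = record
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def project(record, aliases):
    """Build a flat row that maps each alias to the value at its dotted path."""
    return {alias: get_path(record, path) for alias, path in aliases.items()}


if __name__ == "__main__":
    row = project(
        json.loads(RECORD),
        {"username": "logdata.USERNAME", "password": "logdata.PASSWORD"},
    )
    print(row)
```

Returning None for a missing path is a choice made for this sketch; it says nothing about how agrind treats records that lack the field.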

3 participants