Limit the number of attachments when parsing emails #88

Open
jorangreef opened this issue Aug 28, 2018 · 6 comments

@jorangreef

This may be a non-issue.

I tried to install salmon to test this for myself but ran into No module named docutils.core.

Does salmon limit the number of attachments allowed when parsing emails?

Please see: https://snyk.io/blog/how-to-crash-an-email-server-with-a-single-email/

@moggers87
Owner

I'm not sure how you ended up with that import error; Salmon doesn't import docutils at any point.

To answer your second question: no, I don't think it does. Although Salmon is far more lazy about loading things than it used to be, it still relies on Python's built in email package, which loads everything up front.

@moggers87
Owner

moggers87 commented Aug 28, 2018

I've played around with a test email that has about 17 million lines of --0. The result is around 25 seconds of parsing via email.message_from_file (Salmon uses threads, so we don't need to worry about blocking, probably) and very little RAM usage - which I found very surprising until I checked to see how long it would take to walk all those MIME parts:

>>> len([i for i in mails[0].walk()])
3

Which I guess is fine as you're meant to make sure boundaries are unique, right?
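
For reference, a rough sketch of how a test like this could be reproduced; the headers and exact line count below are assumptions, not the original test file:

import email
import io
import time

# Assumed headers - the original test email isn't shown in this thread.
header = (
    "From: attacker@example.com\n"
    "To: victim@example.com\n"
    "Subject: boundary flood\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="0"\n'
    "\n"
)
body = "--0\n" * 17_000_000  # roughly 68 MB of boundary-delimiter lines

start = time.time()
msg = email.message_from_file(io.StringIO(header + body))
print("parse time: %.1fs" % (time.time() - start))
print("parts:", len(list(msg.walk())))  # how many MIME parts actually came out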

@jorangreef
Author

Thanks for testing this @moggers87!

The result is around 25 seconds of parsing via email.message_from_file (Salmon uses threads, so we don't need to worry about blocking, probably)

The 25 seconds of parsing could be significant if the system has a limited number of CPU cores (say 4 or 8). With a handful of emails, sent within several seconds of each other, an attacker could use up all available CPU cores, while staying under the radar of any rate limiting system.

you're meant to make sure boundaries are unique

I might be mistaken, but I think the --0 boundary (or any similar one-character boundary) is technically a valid, unique MIME boundary in terms of the RFCs. The uniqueness requirement across the entire message only applies when multiparts are nested (a multipart within a multipart, as opposed to a flat array of parts):

RFC 2046 5.1. Multipart Media Type
The boundary delimiter MUST NOT appear inside any of the encapsulated
parts, on a line by itself or as the prefix of any line.  This implies that it is
crucial that the composing agent be able to choose and specify a
unique boundary parameter value that does not contain the boundary
parameter value of an enclosing multipart as a prefix.

So I think it's fine and expected to have multiple attachments at the same layer separated by the same boundary, it just can't be reused by encapsulated nested multiparts. Also, we wouldn't expect a malicious MIME message to play by the RFCs!
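
To make that concrete, here's a made-up (and, as far as I can tell, perfectly valid) message where the one-character boundary 0 separates two sibling parts at the same level:

import email

raw = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="0"\n'
    "\n"
    "--0\n"
    "Content-Type: text/plain\n"
    "\n"
    "first part\n"
    "--0\n"
    "Content-Type: text/plain\n"
    "\n"
    "second part\n"
    "--0--\n"
)

msg = email.message_from_string(raw)
# walk() yields the multipart container plus its two sibling parts.
print([part.get_content_type() for part in msg.walk()])
# ['multipart/mixed', 'text/plain', 'text/plain']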

it still relies on Python's built in email package

Would you be willing to patch this upstream to add a maximum limit of 10,000 attachments?

My Python is non-existent; I think the No module named docutils.core error was just a Python/pip environment issue on my Mac.

@moggers87
Owner

Would you be willing to patch this upstream to add a maximum limit of 10,000 attachments?

Neither of us have any evidence that 10,000 attachments is a problem. 25 seconds of parsing was for an email that was 69MB long and had 17 million lines of "--0".

@jorangreef
Author

Neither of us have any evidence that 10,000 attachments is a problem.

Sure, I mean 10,000 attachments shouldn't be a problem. It's just an upper bound (a maximum limit which should be safe) and I think your upstream parser will handle that fine.

The problem is with not validating potentially hostile user data containing 4 million or 17 million empty attachments. A limit on the maximum number of attachments allowed should be in place to prevent an attacker from burning even a few seconds of CPU time, let alone 25 seconds.

For instance, @ronomon/mime (a parser I have worked on) can detect and reject a hostile 20MB or 69MB payload like this in 70ms, regardless of payload size. It's just a simple limit. Seems like it's worth having, but if you disagree, feel free to close the issue!
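
For what it's worth, here is a minimal sketch of the kind of limit I have in mind: a cheap pre-parse check on the raw payload (the 10,000 cap and the rejection behaviour are just placeholders, not a proposed patch):

MAX_PARTS = 10_000  # placeholder upper bound, generous for legitimate mail

def too_many_parts(raw_bytes, limit=MAX_PARTS):
    # Count lines that look like MIME boundary delimiters ("--" at the start
    # of a line) before doing any real parsing. This over-counts, since "--"
    # at the start of an ordinary text line also matches, but a legitimate
    # message should never come anywhere near the limit.
    count = 0
    for line in raw_bytes.splitlines():
        if line.startswith(b"--"):
            count += 1
            if count > limit:
                return True
    return False

# Usage sketch: reject before parsing instead of burning CPU in the parser.
# if too_many_parts(raw):
#     raise ValueError("message rejected: too many MIME parts")
# msg = email.message_from_bytes(raw)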

@moggers87
Owner

I think you missed my point there.
