Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

undefined behavior in gcc-compatible detected by ubsan #327

Open
aep opened this issue Jul 14, 2020 · 11 comments
Open

undefined behavior in gcc-compatible detected by ubsan #327

aep opened this issue Jul 14, 2020 · 11 comments

Comments

@aep
Copy link

aep commented Jul 14, 2020

ubsan is unhappy about a bunch of things. i'm unsure if these are due to incorrect usage? The C output is a bit difficult to read.

3rdparty/hacl-star/dist/gcc-compatible/Hacl_Chacha20Poly1305_32.c:32:23: runtime error: applying zero offset to null pointer
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior 3rdparty/hacl-star/dist/gcc-compatible/Hacl_Chacha20Poly1305_32.c:32:23 in 
3rdparty/hacl-star/dist/gcc-compatible/Hacl_Chacha20Poly1305_32.c:280:15: runtime error: null pointer passed as argument 2, which is declared to never be null
@beurdouche
Copy link
Contributor

The code is voluntarily not defensive, there is a precondition that is that you should not be passing a null pointer to that function. I think this should be better documented though.

@aep
Copy link
Author

aep commented Feb 22, 2021

i wasnt even are there is documentation for the C api. could you point me to it?

@beurdouche
Copy link
Contributor

https://hacl-star.github.io/HaclAEAD.html

@msprotz
Copy link
Contributor

msprotz commented Feb 22, 2021

@beurdouche I took a look and I couldn't see the non-null precondition here: did you find it somewhere? https://github.com/project-everest/hacl-star/blob/master/code/chacha20poly1305/Hacl.Impl.Chacha20Poly1305.fst#L151

I should have given an update on this: we took a long hard look at this issue back when you reported it, and actually found there is an issue: our model of C, Low*, allows the user to successfully perform NULL+0, on the basis that NULL is a zero-length array and that +0 is pointer arithmetic that is within bounds. Unfortunately, as you accurately found out, this is illegal in C.

Alas, this occurs everywhere an input may have length 0: it is idiomatic, even reasonable, for C callers to pass NULL when they want to have, say, no additional authenticated data. And it is not reasonable to ask C callers to heap-allocate a dummy just to call our functions. (Note: zero-length stack allocations are UB, so callers can't even do that as a dummy for the aad.)

Writing a defensive API would amount to checking whether aad has length zero, heap-allocating a dummy, calling the HACL functions, freeing the dummy, returning the result. This is clearly too much.

The "right" solution is to fix Low* and then fix HACL*. I had attempted to automate the Low* fix as much as I could on branch protz_null with meta-programming but that didn't work out as well as we hoped, and we went for something more low-key in buffer_null. This is all in the F* repository. On the HACL* side, Marina and I fixed a bunch of stuff (branches polubelova_null, protz_null) but this is such an extensive change that we ran out of manpower to bring it completion. Hence why this issue is still open.

I think there's still the hashes that ought to be fixed, and perhaps a couple more algorithms.

In the grand scheme of things, even though all UBs are bad, this one might be on the milder side. But still needs fixing, of course.

@beurdouche if you have cycles, getting Marina's branch up to speed and assessing what remains to be done would be quite high-impact. I think we were quite close, but unfortunately we all got distracted with other things.

Thanks,

Jonathan

@aep
Copy link
Author

aep commented Feb 22, 2021

thanks for the detailed response. i'd agree that it is idiomatic C to pass NULL for empty strings.
In fact that might have been the reason i so far couldn't figure out when this is happening. probably some buffer implementation internally just doesn't allocate anything if the content is length zero.

But i'm not sure if this specific api needs to be defensive. C is full of pitfalls like this, so anyone using it already knows to check the comments above a function or the manual. Adding a warning to the manual is fine.

Ideally of course, the function would be able to accept null and just do nothing. I guess that's what you said with the ongoing effort. Sounds great!

@msprotz
Copy link
Contributor

msprotz commented Feb 22, 2021

I agree that adding a disclaimer for now pointing to this issue would be good. Apologies once again for not responding: I was leading a big effort to try to make this happen, and I was hoping to just respond "it's fixed", but unfortunately it didn't work out as expected.

Internally, what typically happens is that there's something along the lines of (pseudo-code):

for (int i = 0; i < input_length/block_size; ++i) {
  uint8_t *block = input + block_size * i;
  // do stuff
}
uint8_t *incomplete_block = input + i * block_size;
// process final data

the second to last line is what triggers the UB. That's because our code is too generic, and now we need to add fast-paths everywhere for when the input length is zero, which is actually more work.

@polubelova probably remembers a little bit of this and perhaps give an update on what was mostly fixed and what could still use some work...

@msprotz
Copy link
Contributor

msprotz commented Mar 9, 2021

I found a copy of the message I had written on Slack -- apparently I had saved it somewhere knowing it would eventually disappear. I'm pasting a copy for reference to make sure we have everything in this discussion.

Some soundness issues related to the treatment and modeling of the NULL

pointer in Low*.

Background context.

We model the NULL pointer as an always-live buffer of size 0. I realize
that this is counter-intuitive. The intuition behind this is that live
captures the validity of a pointer, i.e. is this a valid object to pass
around and that I may receive. In that sense, NULL is a valid pointer,
and is the standard way to represent a pointer of size zero in C without
performing a dummy allocation. This is actually quite idiomatic: when
you call AEAD.encrypt with aad_len = 0, any reasonable C caller will
pass NULL for the aad pointer. So, if you are writing Low* code that
wants to receive pointers from C, you have to assume that if length == 0 is allowed for your array, then it's likely going to be NULL.

Consequence: since NULL will inevitably be passed around for any array
whose length may be zero, you cannot really write a precondition that
rules out NULL pointers:

  • this precondition is likely going to be violated by C callers
  • even if you write a defensive API that aborts if any argument is NULL,
    then C callers have no good way to use your API:
    • a C caller can call malloc(0): this is legal C, but the return
      value is implementation-defined and may be NULL on some
      implementations of malloc, so back to square one
    • a C caller cannot do a dummy stack allocation of size zero (illegal)
    • the only way, then, is for the caller to do a dummy allocation of
      size n > 0 and pass that as a "dummy" value, remembering to free it,
      or allocating it at the top-level -- hardly something that our users
      would be happy about
  • this precondition is not satisfiable by Low* clients, meaning that
    miTLS, EverQUIC and whatnot will not be able to call your code (in
    Low* you cannot allocate an array of size 0 other than by using
    LowStar.Buffer.null).

This means that the notion of valid pointer in Low* must encompass the
NULL pointer, and that's why the NULL pointer is live. End of rationale.

The issue.

Short version: having the NULL pointer wandering around your code causes
pain and misery, namely undefined behavior all over the place.

What came out as a surprise today (bug report:
#327) is that pointer
addition on the NULL pointer (i.e. NULL + 0) is undefined behavior.
(See thread:
https://everestexpedition.slack.com/archives/C4237009M/p1594737963341700).

What was previously known is that (see
https://www.imperialviolet.org/2016/06/26/nonnull.html) memcpy and
friends are UB if any of their arguments is NULL.

Since KreMLin generates C code that performs pointer addition, and uses
memcpy and memset, we need to enforce in Low* that none of the arguments
are NULL. Sadly, this was not done for historical reasons (the first
version of the NULL pointer was an afterthought on top of FStar.Buffer
("Low*v1", circa 2016), called C.Nullity, and when LowStar.Buffer was
authored with a built-in notion of Null, we didn't review the
consequences of allowing NULL on the signatures of various
LowStar.Buffer functions).

Proposed solution (Low*).

In branch protz_null of F*, any function foo from
LowStar.Monotonic.Buffer that may generate a pointer addition, a memcpy or
a memset in C has now been renamed to foo_non_null with a precondition
that none of the arguments be null. (This affects six functions.)

To avoid breaking clients, foo is now defined as an
inline_for_extraction wrapper around foo_non_null that checks whether
the arguments are null or not, with the same signature as before.

Proposed solution (clients, e.g. HACL*)

Naturally, this is going to generate quite a few ugly code-gen changes.
I'm currently in the process of getting a build for HACL* and commiting
the changes in dist/gcc-compatible so that others can review the diff.

What I suggest is to extend Lib.Buffer in HACL with a similar strategy.
A new sub_non_null function is for when you can statically derive that
the buffer is non-null: it suffices, for instances, to know that is has
a non-zero length. The existing sub remains the same.

The same thing can be done for other operations.

Individual algorithms can be fixed by either switching to
sub_non_null, or writing an early-exit, e.g.

if aad_len = 0ul then
  ()
else
  ... can use sub_non_null here ...

Thanks to @beurdouche, @R1kM for insightful discussions, and to
@tahina-pro and @aseemr for help fixing up LowStar.Monotonic.Buffer.

@beurdouche
Copy link
Contributor

Note, that because passing a null aad is quite frequent, I protected the call in #413 so that if the length is zero you won't have to pass a non-null pointer anymore. As Jonathan mentioned above, we are considering more global solutions to avoid the users to be confused by these issues.

@franziskuskiefer
Copy link
Member

I think there's still the hashes that ought to be fixed, and perhaps a couple more algorithms.

Yes, hashes are effected as well. I'll leave it to @msprotz whether this should get a hotfix as well.

https://github.com/project-everest/hacl-star/blob/b887e2edf70fecaeca0376efa3fe38c6234020bc/dist/gcc-compatible/Hacl_Hash_SHA2.c#L827

@msprotz
Copy link
Contributor

msprotz commented Dec 1, 2021

Yeah, I guess a hotfix for hashes would be good, too, actually.

Do you have an exhaustive list of all the places where this specific UB happens?

@franziskuskiefer
Copy link
Member

Do you have an exhaustive list of all the places where this specific UB happens?

Unfortunately not. Any place that might get a null pointer as input and tries to do something with it 😉
In the case of SHA2 B.sub is the culprit, i.e. https://github.com/FStarLang/FStar/blob/53cf588061b637a25105ad2ed55c4284c0ddab38/ulib/LowStar.Monotonic.Buffer.fst#L1279. This is different from the other occurrences that got a hot fix.
The easiest way to find all of them is probably throwing null pointer at all the APIs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants