
Wild Idea: Cryptography-safe integer operations and GC/Optimization-disabling "critical sections" #439

Open
kyanha opened this issue Feb 13, 2023 · 12 comments


@kyanha

kyanha commented Feb 13, 2023

Cryptography is a funny beast, in that it must be implemented correctly -- not just "the math gets done to provide the correct result", but "the math must be done in a way that prevents side-channel information leakage". (cf. timing attacks against RSA when comparisons are not constant-time, and the Montgomery ladder, which prevents observers from identifying whether a particular bit of an RSA private exponent is 0 or 1 during square-and-multiply exponentiation, among many other examples.)

There are currently few implementations of Common Lisp which allow cryptographic primitives to be implemented in such a way as to mitigate and resist side-channel attacks, and none in the general case. (see the warning in https://github.com/sharplispers/ironclad#readme.)

So, I would like to suggest that CCL add "cryptographic side channel resistance" to its "Wild Ideas" list.

This would generally include:

  1. A constant-time EQUAL primitive (no short-circuiting on "lengths differ" or "a byte doesn't match"), even if not in the cl namespace
  2. A constant-time/constant-power * for integers, even if not in the cl namespace
  3. The ability to selectively disable optimization for particular sections of code which, to be side-channel resistant, must not be optimized
  4. The ability to selectively disable garbage collection while those side-channel-resistant sections of code are running
  5. Other things which I'm not enough of a cryptographic-mathematics expert to specify off the top of my head
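Item 1 is essentially the pattern used by libsodium's sodium_memcmp and OpenSSL's CRYPTO_memcmp. A minimal sketch in C (ct_memeq is a hypothetical name; a Lisp version would follow the same shape over octet vectors):

```c
#include <stddef.h>
#include <stdint.h>

/* Constant-time comparison: examines every byte regardless of where
 * (or whether) a mismatch occurs, so the running time depends only on
 * the length, never on the contents.  Returns 0 when the buffers are
 * equal, nonzero otherwise. */
static int ct_memeq(const uint8_t *a, const uint8_t *b, size_t len)
{
    uint8_t diff = 0;
    for (size_t i = 0; i < len; i++)
        diff |= a[i] ^ b[i];   /* accumulate differences; never branch */
    return diff;               /* 0 iff all bytes matched */
}
```

The OR-accumulator is the key trick: there is no data-dependent branch for a timing measurement or branch predictor to observe.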

I understand that this is not a current priority, and that is why I'm suggesting it be a "Wild Idea".

@kyanha kyanha changed the title Wild Idea: Cryptography-safe integer operations and GC/Optimization disabling "critical sections" Wild Idea: Cryptography-safe integer operations and GC/Optimization-disabling "critical sections" Feb 15, 2023
@moon-chilled

Why would it matter if the gc runs?

@kyanha

kyanha commented Aug 15, 2023

If the GC runs, it can change the operating characteristics of the data the code is running against. Everything in cryptography needs to be as constant-time and constant-characteristic as possible to reduce side-channel attacks such as power analysis, bus analysis, and even Rowhammer-type attacks. Minimizing the number of places in memory where potentially sensitive cryptographic keys and temporary operational data can be found also helps efforts to isolate those portions of memory from core dumps.

Maintaining the operating characteristics is especially important on NUMA (Non-Uniform Memory Access) architectures, since moving the data to a faster memory space would be an easy optimization.

@moon-chilled

You've named a lot of security things, but not explained how they relate or presented a concrete attack.

'Constant time' doesn't mean that crypto routines always have to take the same amount of time. Rather, it means that the time taken by the crypto routine doesn't change when you change some sensitive data (such as the content of a message being encrypted). Under this definition, it seems to me that no reasonable gc would make a crypto routine not constant time, if it ran in the middle of it.

The only gc-related concern I can think of is securely erasing data in the face of a copying collector. Suppose sensitive information is initially placed at memory address A, and then the collector copies it to address B; under a naive design, if the application then wants to erase the data, it might erase the copy at B, leaving the copy at A where it is (until something else is allocated there, likely). This particular concern is less of an issue for lisp, being a nominally memory-safe language (the only way to access the data at A is with an illegal memory access), but is probably still worth coming up with a solution to (say, a 'secure array' type, which guarantees that, if the gc ever copies it, it will erase the original copy summarily).

@phoe

phoe commented Aug 15, 2023

(say, a 'secure array' type, which guarantees that, if the gc ever copies it, it will erase the original copy summarily).

https://github.com/sionescu/static-vectors - these are never moved around by the implementation. No copies, no problem.

@moon-chilled

That works, but overconstrains the implementation for this problem.

@moon-chilled

Having a secure array type would also let you tie in the guarantee of not dropping 'dead' stores, so you don't need a separate mechanism for that.
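The dead-store concern is concrete: a compiler may delete a final memset on a buffer that is never read again. A minimal sketch in C of the usual workaround, writing through a volatile pointer (secure_wipe is a hypothetical name; explicit_bzero on glibc/BSD and memset_s from C11 Annex K give the same guarantee):

```c
#include <stddef.h>

/* Wiping a secret with plain memset can be optimized away as a "dead
 * store" when the buffer is never read afterwards.  Writing through a
 * volatile-qualified pointer forces the compiler to emit every store. */
static void secure_wipe(void *p, size_t len)
{
    volatile unsigned char *vp = p;
    while (len--)
        *vp++ = 0;
}
```

A GC-aware "secure array" would need the runtime to apply the same treatment to any copy the collector makes.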

@kyanha

kyanha commented Aug 23, 2023

You've named a lot of security things, but not explained how they relate or presented a concrete attack.

The fundamental concept is that -any- deviation has the potential to lead to an attack vector.

'Constant time' doesn't mean that crypto routines always have to take the same amount of time. Rather, it means that the time taken by the crypto routine doesn't change when you change some sensitive data (such as the content of a message being encrypted). Under this definition, it seems to me that no reasonable gc would make a crypto routine not constant time, if it ran in the middle of it.

If the behavior of the GC depends on the secret information (for example, if the number of temporary allocations varies with the content of the key or message and causes the GC to run longer), that is an information vector akin to Bleichenbacher's attack. The best way to prevent it is to eliminate the class of problem entirely.

The only gc-related concern I can think of is securely erasing data in the face of a copying collector. Suppose sensitive information is initially placed at memory address A, and then the collector copies it to address B; under a naive design, if the application then wants to erase the data, it might erase the copy at B, leaving the copy at A where it is (until something else is allocated there, likely). This particular concern is less of an issue for lisp, being a nominally memory-safe language (the only way to access the data at A is with an illegal memory access), but is probably still worth coming up with a solution to (say, a 'secure array' type, which guarantees that, if the gc ever copies it, it will erase the original copy summarily).

I'm certain you've heard of core dumps. Just because lisp itself is nominally memory-safe doesn't mean that the underlying characteristics of the operating system necessarily make it safe enough to rely on. Again, the best way to mitigate the possibility is to eliminate the class of problem entirely.

There is no actual trustworthy lisp implementation of cryptography. This is partly due to the fact that the people working with the language implementations don't understand just how pernicious these subtle and arcane issues are, and how they can possibly be leveraged against the implementations.

We need constant-time multiplication and exponentiation. We need constant-time byte-array comparison. We need constant-time addition. We need secure, constant-allocation, overwritable memory. These are primitives that languages other than lisp provide, and that the operating systems CCL runs on provide the capacity for. Why don't we have them available in lisp? Because language implementors think we don't actually need them, and try to argue themselves out of having to provide them.

Please stop trying to argue the language out of having to provide what's being asked for.

@moon-chilled

If the behavior of the gc changes based on when it actually performs the collection in a manner that's related to the secret information, for example if there's a number of temporary allocations based on the content of the key or message that cause the gc to run longer, that would be an information vector

If! Except that no gc actually behaves that way and there is no reason to believe that it would be profitable for one to behave in that way, so it makes far more sense to simply say that it won't.

There is no actual trustworthy lisp implementation of cryptography. This is partly due to the fact that the people working with the language implementations don't understand just how pernicious these subtle and arcane issues are

It's not just cryptography. Common Lisp is also completely broken for, for example, any remotely serious use of concurrency or floating point, for the very simple reason that it is a language that was last revised 30 years ago, when many aspects of the world were different from today; and today, it has few serious users.

Please stop trying to argue the language out of having to provide what's being asked for

No. That is an absurd ask. You made a proposal, and I am debating the merits of that proposal, because I think it could be better.

@kyanha

kyanha commented Aug 23, 2023

If! Except that no gc actually behaves that way and there is no reason to believe that it would be profitable for one to behave in that way, so it makes far more sense to simply say that it won't.

A GC operates as the underlying allocations cause it to operate. I have seen (in other languages, runtimes, and core libraries) some incredibly boneheaded decisions for boxing, allocations, and dead objects. I'd really like there to exist at least one lisp implementation which is capable of freezing the underlying optimizer and garbage collector so that no matter how poorly the running code is written in terms of allocations, it can -always- have the same performance characteristics during the actual critical operations.

It's not just cryptography. Common lisp is also completely broken for, for example, any remotely serious uses of concurrency or floating point. For the very simple reason that it is a language that was last revised 30 years ago when many aspects of the world were different from today; and today, it has few serious users.

And yet it has arbitrarily-sized integers, which would generally lend themselves to cryptography if there were a way to make the performance characteristics constant.

The process of learning lisp is a particularly enlightening one, and the prospect of applying that enlightenment to cryptography (and cryptanalysis) is especially attractive. Having a constant-characteristic basis for it would reduce false-positive analysis/attack results, which would mean that lisp could once again be used for research in the ways that Python currently is.

No. That is an absurd ask. You made a proposal, and I am debating the merits of that proposal, because I think it could be better.

So make your own counter-proposal?

Right now, by your own admission, lisp is basically a joke. As you mentioned, it has few serious users. It's not used everywhere it could (or, in my own opinion, should) be. Nobody views it as a serious platform, and one of the very real reasons I can't view it or use it as such is the set of issues I've mentioned.

Why don't you seem to want it to be actually useful as a serious environment, even for research?

@moon-chilled

So make your own counter-proposal?

There is not much to counter-propose. I agree with most of the things you mentioned in your original message (though I haven't canvassed it thoroughly looking for missing or superfluous features or anything); I just think that the proposal to be able to turn off gc, if adopted, would cause a great deal of harm and no good.

A GC operates as the underlying allocations cause it to operate. I have seen (in other languages, runtimes, and core libraries) some incredibly boneheaded decisions for boxing, allocations, and dead objects. I'd really like there to exist at least one lisp implementation which is capable of freezing the underlying optimizer and garbage collector so that no matter how poorly the running code is written in terms of allocations, it can -always- have the same performance characteristics during the actual critical operations.

Again you speak in vapid platitudes rather than specifics. I don't think there is anything more constructive for me to say, and intend to disengage from this exchange until or unless that changes.

@se-mz

se-mz commented Sep 17, 2023

If the behavior of the gc changes based on when it actually performs the collection in a manner that's related to the secret information, for example if there's a number of temporary allocations based on the content of the key or message that cause the gc to run longer, that would be an information vector akin to Bleichenbacher's attack.

Allocating a secret-dependent amount of memory is already almost guaranteed to leak since allocation times/methods will vary; accessing memory in a secret-dependent manner is also almost guaranteed to leak due to cache effects. This is before any GC comes into the picture and affects e.g. C all the same. Apologies for reviving an old discussion with such a nitpick, but I've been interested in this topic for a while and am also wondering why a GC would be a problem here.

@kyanha

kyanha commented Sep 29, 2023

In cryptography, the only safe way to proceed is to do things in ways that do not depend on the data you are working with, which means pessimization (the opposite of optimization). Constant-time comparisons: compare all the bytes, even when you know early that the comparison will fail. Large-integer exponentiation using the square-and-multiply algorithm: compute both candidate results into different memory locations, then use the one indicated by the key bit. The power draw must be constant, which means maximal. The running time must be constant, which means maximal. Even the memory usage must be constant within the implementation, which is pessimal for the implementation.
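A toy sketch in C of this compute-both-then-select idea, over single 64-bit words rather than bignums (ct_modexp is a hypothetical name; note that this only removes secret-dependent branching, and hardware multiply/divide are not guaranteed constant-time on every CPU):

```c
#include <stdint.h>

/* Fixed-sequence modular exponentiation (toy word sizes; real RSA
 * needs bignums).  Every iteration performs both the square and the
 * multiply, then keeps the multiply result only when the exponent bit
 * is set, selected with a mask instead of a branch.  The instruction
 * sequence is therefore independent of the secret exponent. */
static uint64_t ct_modexp(uint64_t base, uint64_t exp, uint64_t mod)
{
    uint64_t acc = 1 % mod;
    for (int i = 63; i >= 0; i--) {
        uint64_t bit  = (exp >> i) & 1;
        uint64_t sq   = (acc * acc) % mod;   /* always square   */
        uint64_t mul  = (sq * base) % mod;   /* always multiply */
        uint64_t mask = (uint64_t)0 - bit;   /* all-ones or all-zeros */
        acc = (mul & mask) | (sq & ~mask);   /* branchless select */
    }
    return acc;
}
```

The `(uint64_t)0 - bit` mask is the standard branchless-select idiom; a naive implementation would instead write `if (bit) acc = mul;`, which is exactly the data-dependent branch that timing and power analysis exploit.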

Constant time also has another benefit: you can know for certain how long an operation will take, since you are already at the maximum. If you are designing a time-sensitive protocol, you simply assume the pessimum for the primitive.

Asking for the GC to be disabled is a way to ensure that the cryptographic primitives have the constant-time property, so that they can be used in time-sensitive protocols.

(It's worth noting that newer garbage-collected runtimes, such as the .NET Common Language Runtime version 4.7 and later, permit the running application to suppress GC activity in critical regions: https://learn.microsoft.com/en-us/dotnet/api/system.gc?view=net-7.0 In their case, they speak of performance-critical sections of application code where UI responsiveness is incredibly important, or where GC interruption could be disastrous, as in stock-trading algorithms. In time-sensitive protocols, it is just as reasonable to consider the protocol a form of user interface, with the 'user' being the protocol peer.)

While it may not be a component of cryptographic mathematics, it is an incredibly important part of cryptographic utility.

And since it would bring CCL into a more modern way of thinking about garbage collection, I don't think it's too much to ask; it is a legitimate request regardless. Make no mistake, though: while "disable GC" could be broken out into its own request, a modern cryptographic platform does require this capability.

(And, y'know, being able to know for certain that a primitive will not be subject to garbage collection pressure also provides for an excellent way to benchmark. But that's neither here nor there.)

cache effects

While it would be nice to reduce cache-timing effects (which might be done by reading bytes from enough pages to flush the L2 caches of modern Intel processors, for example), Intel is not the only processor architecture that CCL can be compiled for or run on. I tend to think that the job of providing a constant-time primitive to flush caches should be delegated to the underlying compiler and standard library, with a single source of truth established for it. (I also tend to think this would be the C compiler and the C standard library, either through FFI or directly via a primitive exposed by the implementation. This is, by the way, something that mitigations against Spectre-type attacks do, and the place where such mitigations are first implemented.)
