-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chacha20
is missing .zeroize()
for the SIMD backends
#336
Comments
I've begun working on the second option... however... with the 1.60 MSRV, |
We're about to start making breaking changes to all of the crates in this repo. I think it's fine for your PRs to assume that and an MSRV bump. |
I've made some progress with the second option, but even before adding What the branch that has decent performance does:
Some drawbacks of the
If you have any ideas for improving ProposalMayhaps we could use With a
The pub(crate) unsafe fn rng_inner<R>(state: &mut [u32; STATE_WORDS], mut dest_ptr: *mut u8, num_blocks: usize)
where
R: Rounds,
{
let mut backend = Backend::<R>::new(state);
// replace with some generic buffer initialization?
let mut results: [[__m256i; 4]; N] = [[_mm256_setzero_si256(); 4]; N];
backend.write_ks_blocks(dest_ptr, num_blocks, &mut results);
// replace this with a generic method?
state[12] = _mm256_extract_epi32(backend.ctr[0], 0) as u32;
#[cfg(feature = "zeroize")]
{
backend.zeroize();
results.zeroize();
}
} I feel like this could be okay. I could go ahead and bench this with |
I benched a new branch ( |
Had a bunch of benchmarks here, but TL; DR: option 2 is more desirable now that it is working |
Alright. Sorry for my complaining. I just didn't like that the first attempt at Option 2 resulted in Cipher's 1.6 cpb performance, even though it used pretty much the exact same code as before. I've been working on |
I wrote about it previously in different issues, but I think that in the case of "flat" types (i.e. types which do not reference "outside" memory) we can use the following implementation: impl Drop for Foo {
fn drop(&mut self) {
let n = core::mem::size_of::<Self>();
unsafe {
core::ptr::write_bytes(self, 0, n);
// blackbox `asm!`
}
}
} |
Would that be suitable for a And what of the |
|
Your ZeroizeOnDrop implementation seems to be far superior to a regular implementation of ZeroizeOnDrop. I've gone ahead and cleaned up my working branch a little and got it to pass some tests, but I'm not sure if I would be able to add that ZeroizeOnDrop impl on my own. I could either try to merge that branch with #333 or I could make a separate PR |
Even though I made code for this issue, it seems that it would be a wasted effort to rework the backends given that typical constructor methods result in a stack-allocated structure, such as any constructor that ends with: Self {
...
} With constructors returning that and trying to run There is a way to ensure that all allocated data stays on the heap using a type and macro such as #[cfg(feature = "alloc")]
type CfgBoxed<T> = Box<T>;
#[cfg(not(feature = "alloc"))]
type CfgBoxed<T> = T;
/// Defines a new instance of a data structure that is conditionally on the heap, based on whether the `alloc` feature is enabled.
#[macro_export]
#[cfg(feature = "alloc")]
macro_rules! cfg_new_boxed {
($data:expr) => {
$crate::Box::new($data)
};
}
/// Defines a new instance of a data structure that is conditionally on the heap, based on whether the `alloc` feature is enabled.
#[macro_export]
#[cfg(not(feature = "alloc"))]
macro_rules! cfg_new_boxed {
($data:expr) => {
$data
};
}
// and then use it in a constructor like so
pub struct Test {
a: u64,
b: [u32; 16]
}
impl Test {
pub fn new(value: &u64) -> CfgBoxed<Self> {
let mut result = cfg_new_boxed!(Self { a: 0, b: [0u32; 16] } );
result.a = *value;
result
}
} While this could work, every crypto crate would "need" to implement these types of constructor methods... but it would kind of be a waste of time because Part of the reason I wanted to consider this route is because the I'm fine if we close this issue—not all code is meant to make it to production. But if y'all somehow would still like code from the old branch I can see about working it into a new branch based on the current repo. |
The most straightforward way to impl it would be to add That wouldn't wipe all of the state that's left over on the stack, but that's not something we generally do for any of our cryptographic implementations.
Yep |
As I wrote above, I think a better solution will be to use the |
Yeah, I've been meaning to redo the |
|
I started to implement this, but in the process I reconsidered it and now agree with the @nstilt1 comment above. The backends live only on stack and no different from any other data spilled on stack. As we discussed in the traits issue, we do not provide any guarantees for spilled data (though we try to minimize spillage amount if possible) and zeroization (especially with its current implementation in So I think we can close this issue as "not planned". |
I've noticed that #333 is missing
zeroize
for the SIMD backends, and that thezeroize
crate seems to support SIMD registers. There are 2 ways that I can identify for incorporatingzeroize
. Both methods, however, would require the MSRV to be increased to1.60
.Method 1
The first method is kind of easy, as it requires a relatively small amount of code, but it is a little inefficient. Basically,
.zeroize()
could be called on the SIMD results arrays, as well as the state arrays after generating results.Pros:
Cons:
PAR_BLOCKS
blocks of output, it will need to zeroize the SIMD registersMethod 2
This would involve a little bit of a reimplementation of some features that
chacha20
previously had (persistingCore
structs viaautodetect.rs
andbackend.rs
). The persistingCore
structs can provide a few benefits:.zeroize()
could only be applied once to the SIMD registers, instead of every time theCore
generates resultsunsafe fn generate(&mut self, dest_ptr: *mut u8, num_blocks: usize)
to achieve a performance that is comparable with.apply_keystream()
on AVX2... unless.apply_keystream()
's performance also increases by 5-7%. The RNG could still benefit from using a pointer though.Cons:
Here's a link to
v0.8.1
for reference. I will need it if I will be adding the functionality back:https://github.com/RustCrypto/stream-ciphers/blob/338c078d731692fba3b8256e45de2c3e334d46d8/chacha20/src/backend.rs
The text was updated successfully, but these errors were encountered: