Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a footer in FileEncoder and check for it in MemDecoder #124686

Merged
merged 2 commits into from May 22, 2024

Conversation

saethlin
Copy link
Member

@saethlin saethlin commented May 4, 2024

We have a few reports of ICEs due to decoding failures, where the fault does not lie with the compiler. The goal of this PR is to add some very lightweight and on-by-default validation to the compiler's outputs. If validation fails, we emit a fatal error for rmeta files in general that mentions the path that didn't load, and for incremental compilation artifacts we emit a verbose warning that tries to explain the situation and treat the artifacts as outdated.

The validation currently implemented here is very crude, and yet I think we have 11 ICE reports currently open (you can find them by searching issues for 1002111927320821928687967599834759150) which this simple validation would have detected. The structure of the code changes here should permit the addition of further validation code, such as a checksum, if it is merited. I would like to have code to detect corruption such as reported in #124719, but I'm not yet sure how to do that efficiently, and this PR is already a good size.

The ICE reports I have in mind that this PR would have smoothed over are:
#125160
#124469
#123352
#123376 1
#99763
#93900.


Footnotes

  1. This one might be a compiler bug, but even if it is I think the workflow described is pushing the envelope of what we can support. This issue is one of the reasons this warning still asks people to file an issue.

@rustbot rustbot added A-query-system Area: The rustc query system (https://rustc-dev-guide.rust-lang.org/query.html) S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels May 4, 2024
@saethlin saethlin force-pushed the rust-file-footer branch 2 times, most recently from 9697415 to 36f52a4 Compare May 5, 2024 21:29
@saethlin
Copy link
Member Author

saethlin commented May 6, 2024

@bors try @rust-timer queue
r? compiler

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 6, 2024
@saethlin saethlin marked this pull request as ready for review May 6, 2024 03:43
@bors
Copy link
Contributor

bors commented May 6, 2024

⌛ Trying commit 36f52a4 with merge 52bca7c...

bors added a commit to rust-lang-ci/rust that referenced this pull request May 6, 2024
Add a footer in FileEncoder and check for it in MemDecoder

We have a few reports of ICEs due to decoding failures, where the fault does not lie with the compiler. The goal of this PR is to add some very lightweight and on-by-default validation to the compiler's outputs. If validation fails, we emit a fatal error for rmeta files in general that mentions the path that didn't load, and for incremental compilation artifacts we emit a verbose warning that tries to explain the situation and treat the artifacts as outdated.

The validation currently implemented here is very crude, and yet I think we have 11 ICE reports currently open (you can find them by searching issues for `1002111927320821928687967599834759150` which this simple validation would have detected. The structure of the code changes here should permit the addition of further validation code, such as a checksum, if it is merited. I would like to have code to detect corruption such as reported in rust-lang#124719, but I'm not yet sure how to do that efficiently, and this PR is already a good size.

The ICE reports I have in mind that this PR would have smoothed over are:
rust-lang#124469
rust-lang#123352
rust-lang#123376 [^1]
rust-lang#99763
rust-lang#93900.

---

[^1]: This one might be a compiler bug, but even if it is I think the workflow described is pushing the envelope of what we can support
@bors
Copy link
Contributor

bors commented May 6, 2024

☀️ Try build successful - checks-actions
Build commit: 52bca7c (52bca7c0d2cd48856809616ffeab8aeb82ff04be)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (52bca7c): comparison URL.

Overall result: no relevant changes - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

This benchmark run did not return any relevant results for this metric.

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
3.1% [2.5%, 3.8%] 2
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-2.5% [-2.9%, -2.1%] 2
All ❌✅ (primary) - - 0

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
0.0% [0.0%, 0.0%] 6
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) - - 0

Bootstrap: 676.339s -> 676.639s (0.04%)
Artifact size: 315.76 MiB -> 316.00 MiB (0.08%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 6, 2024
Copy link
Member

@fmease fmease left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay! The changes look good overall, although I'm only roughly familiar with rustc_{serialize,metadata} and might've missed some things like invariants being broken that I don't know of. I've left some minor suggestions and questions.

r=me with nits & questions addressed unless you'd like to see someone else do another round of reviews.

compiler/rustc_driver_impl/src/lib.rs Outdated Show resolved Hide resolved
compiler/rustc_driver_impl/src/session_diagnostics.rs Outdated Show resolved Hide resolved
@@ -53,6 +52,16 @@ impl std::ops::Deref for MetadataBlob {
}
}

impl MetadataBlob {
pub fn new(slice: OwnedSlice) -> Option<Self> {
if MemDecoder::new(&*slice, 0).is_some() { Some(Self(slice)) } else { None }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if MemDecoder::new(&*slice, 0).is_some() { Some(Self(slice)) } else { None }
MemDecoder::new(&*slice, 0).map(|_| Self(slice))

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't work, because slice stays borrowed until the end of the statement, so we can't move out of it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I attempted the version you're suggesting here at first, and forgot about it until I tried to apply your suggestion 😂

compiler/rustc_metadata/src/rmeta/decoder.rs Outdated Show resolved Hide resolved
compiler/rustc_metadata/src/rmeta/decoder.rs Show resolved Hide resolved
compiler/rustc_serialize/src/opaque.rs Outdated Show resolved Hide resolved
@@ -25,13 +26,15 @@ macro_rules! impl_test_unsigned_leb128 {
let n = $write_fn_name(&mut buf, x);
stream.extend(&buf[..n]);
}
let stream_end = stream.len();
stream.extend(b"rust-end-file");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
stream.extend(b"rust-end-file");
stream.extend(FOOTER);

with FOOTER being made crate-pub?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

error[E0603]: constant `FOOTER` is private
  --> compiler/rustc_serialize/tests/leb128.rs:3:30
   |
3  | use rustc_serialize::opaque::FOOTER;
   |                              ^^^^^^ private constant
   |
note: the constant `FOOTER` is defined here
  --> /home/ben/rust/compiler/rustc_serialize/src/opaque.rs:20:1
   |
20 | pub(crate) const FOOTER: &[u8] = b"rust-end-file";
   | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Integration tests are a separate crate 🙃
I'll make it fully pub.

compiler/rustc_serialize/tests/leb128.rs Outdated Show resolved Hide resolved
@@ -17,6 +17,8 @@ use crate::int_overflow::DebugStrictAdd;

pub type FileEncodeResult = Result<usize, (PathBuf, io::Error)>;

const FOOTER: &[u8] = b"rust-end-file";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like naming this footer is a bit confusing since we already have the concept of file footers.

Copy link
Member Author

@saethlin saethlin May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the name to something like "magic bytes, but at the end". Maybe that's more accurate?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha, that does describe it pretty accurately.

debug!("position: {:?}", d.position());
let node_count = IntEncodedWithFixedSize::decode(d).0 as usize;
let edge_count = IntEncodedWithFixedSize::decode(d).0 as usize;
let graph_size = IntEncodedWithFixedSize::decode(d).0 as usize;
(node_count, edge_count, graph_size)
(node_count, edge_count)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the motivation for dropping the graph_size check?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The graph_size check was intended to detect this sort of issue. I wasn't sure what errors we'd get out of it, and experience has shown that nearly all reported ICEs are due to truncated files. So detecting truncation more deliberately which is what this PR does, seems more sensible.

@fmease fmease added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 18, 2024
@saethlin
Copy link
Member Author

@bors r=fmease

@bors
Copy link
Contributor

bors commented May 22, 2024

📌 Commit c3a6062 has been approved by fmease

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels May 22, 2024
@bors
Copy link
Contributor

bors commented May 22, 2024

⌛ Testing commit c3a6062 with merge 22f5bdc...

@bors
Copy link
Contributor

bors commented May 22, 2024

☀️ Test successful - checks-actions
Approved by: fmease
Pushing 22f5bdc to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label May 22, 2024
@bors bors merged commit 22f5bdc into rust-lang:master May 22, 2024
7 checks passed
@rustbot rustbot added this to the 1.80.0 milestone May 22, 2024
@saethlin saethlin deleted the rust-file-footer branch May 22, 2024 19:35
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (22f5bdc): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results (primary -4.0%, secondary -1.7%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-4.0% [-5.9%, -2.0%] 2
Improvements ✅
(secondary)
-1.7% [-1.7%, -1.7%] 1
All ❌✅ (primary) -4.0% [-5.9%, -2.0%] 2

Cycles

Results (secondary 2.9%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
2.9% [2.9%, 2.9%] 1
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) - - 0

Binary size

Results (secondary 0.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
0.0% [0.0%, 0.0%] 6
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) - - 0

Bootstrap: 671.339s -> 670.997s (-0.05%)
Artifact size: 315.69 MiB -> 315.70 MiB (0.00%)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-query-system Area: The rustc query system (https://rustc-dev-guide.rust-lang.org/query.html) merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ICE: cargo check panics on toolchain stable-x86_64-unknown-linux-gnu
5 participants