-
Notifications
You must be signed in to change notification settings - Fork 35.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
guix: remove bzip2 from deps #29895
guix: remove bzip2 from deps #29895
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code CoverageFor detailed information about the code coverage, see the test coverage report. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
i'm divided on this conceptually, i think with the reputation it has now, another successful attack on xz specifically is unlikely (not more likely than any other library, at least). And i'm not sure we need to go out of our way to avoid it given how much of the FOSS ecosystem relies on it. One thing i am slightly afraid of is that in avoiding xz we become more sloppy about dependencies in other ways. Not saying that that is happening here. But e.g. not being able to use the primary source for a project anymore like for xkbcommon is something to think about. But maybe the github-generated tarball is actually better? Unlike the bootstrapped one it shouldn't. contain any extra files besides what's in the git repo, and we can check this. After all, the xz backdoor used one of these autotools "bonus" files to hide in.
That said i do like the elegance of having only one kind of input format for the guix builder. It's generally good to drop dependencies from guix. |
Not sure about the benefit to make extra steps to use a possibly non-standard, possibly easier to backdoor, mirrored tarball. An alternative might be to completely skip the step of tarballs and instead use the upstream source control directly? |
Yea. xz was a way to start a discussion. It could turn out that we actually consolidate everything to xz, and actually drop bzip2 for example.
If anything, I'd hope it's less easy to backdoor, because it's just the source code, and not all the bootstrapped garbage we don't need. This is using the tag from the official upstream source control. |
Mind that we do verify the sha256 hash. The person introducing it has to check, once, that the tarball matches exactly what is in git (and reviewers can verify this). i don't think there's a win in making every guix buid do a git (or other scm) checkout of all dependencies. That leaves more of a network attack surface too and makes it harder to mirror safely, not easier. |
I think it could make sense to specifically mention this on bumps, going forward. Maybe there could even be a new target written in depends to do the check? |
Oh, I see. So the motivation is to drop, possibly non-determinisically created, "bootstrapped garbage"? Concept +1 on this. I am just not sure on dropping |
7cd2049
to
205c1f5
Compare
Something a bit more concrete. We could consolidate all packages in depends to .tar.gz (which is also what we use for releases), and drop
We could also start doing this now, but it may be somewhat cleaner to do when packages are using CMake, so if reviewers want we could look at switching any of the open CMake depends PRs to downloading unbootstrapped tarballs. |
Concept ACK on consolidating archive formats. |
Again, I don't understand what the goal is. I doubt that removing xz as a dependency is doable. For example, gcc seems to be fetched as a xz archive (https://git.savannah.gnu.org/cgit/guix.git/tree/gnu/packages/gcc.scm#n803)? Even if it wasn't, I am sure there are other guix packages that depend on xz. |
The goal is to remove things we don't use. If we don't need bzip2, then I don't see a reason to keep it in our build release environment. Same reasoning for any other dep. Also, just because some thing may be used somewhere in the Guix bootstrap chain, isn't a reason to explicitly include it in our release env unnecessarily. |
Concept ACK. From your commits I gather that most stuff we use, provides Only QT and libxkbcommon require fetching code from Github. If they ever move away from Github to a platform that doesn't automatically generate
This could make sense, but the fact that this source is almost always Github makes me a bit uneasy. OTOH that's why we have
Also concept ACK. |
205c1f5
to
b8e084b
Compare
The current set of the changes look fine. However, to actually review all 10 changed packages, it would be nice to have a script that prints whether the tarball has any differences from the upstream source control it claims to come from. |
Changed to drop the xz changes, and just switch tar.bzip2 tarballs to using tar.gz (same as release tarballs), so we can drop bzip2 as a Guix dep.
Sure, we can have a script. Note that the current tarballs will already differ in that sense. |
i expect the This, along with seeing if the guix binary output is the same (apart from version markers) before and after this PR. |
Always some surprises... #!/usr/bin/env python3
from os import path
import os
from subprocess import run
import bz2
import gzip
GIT='git'
TMPDIR='tmp_checkout'
BRANCH='remove_xz_guix'
BASE_COMMIT='c05c214f2e9cfd6070a3c7680bfa09358fd9d97a'
def compare_streams(f, g, bsize=1024*1024):
while True:
b1 = f.read(bsize)
b2 = g.read(bsize)
if b1 != b2:
return False
if len(b1) == 0 and len(b2) == 0: # Both at EOF
break
return True
run([GIT, 'clone', '--depth', '20', '--single-branch', '-b', BRANCH, 'https://github.com/fanquake/bitcoin.git', TMPDIR], check=True)
os.chdir(path.join(TMPDIR, 'depends'))
# download and checksum packages from PR
run([GIT, 'checkout', BRANCH], check=True)
run(['make', 'download'], check=True)
# download and checksum packages from base commit
run([GIT, 'checkout', BASE_COMMIT], check=True)
run(['make', 'download'], check=True)
# check compressed data
SOURCES=[
'boost_1_81_0.tar',
'qrencode-4.1.1.tar',
'libXau-1.0.9.tar',
'xproto-7.0.31.tar',
'fontconfig-2.12.6.tar',
'xcb-util-0.4.0.tar',
'xcb-util-renderutil-0.3.9.tar',
'xcb-util-keysyms-0.4.0.tar',
'xcb-util-image-0.4.0.tar',
'xcb-util-wm-0.4.1.tar'
]
fail = False
for source in SOURCES:
print(f'Checking {source}')
with bz2.open(path.join('sources', source) + '.bz2', 'rb') as f, \
gzip.open(path.join('sources', source) + '.gz', 'rb') as g:
if not compare_streams(f, g):
print(f'Mismatch for {source}')
fail = True
if not fail:
print('Passed checks')
else:
print('There were failures') Output:
Will check the qrencode one... |
@laanwj did you mean I also get a difference on qrencode when running this on macOS 13.6.6 with Python 3.12.1. |
Contents are the same, there is a slight difference in the metadata:
A two-second difference in the times for four of the directories!
Well it's weird but i guess we can let this slide, for this time 😄
Fixed, thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Compared the binaries. bitcoin-qt
has three differences:
_Z17FormatFullVersionB5cxx11v
- version string in
.rodata
section - checksum in
.debug_info
section
This is fully expected (no weird changes from dropping bzip2 from the build env). ACK b8e084b
This moves packages in depends that use
.tar.bzip2
to.tar.gz
(which is what we use for our own release tarballs). Doing so means we can dropbzip2
from our Guix release env. You can observe that Guix building master without it would currently fail:FORCE_DIRTY_WORKTREE=1 ./contrib/guix/guix-build
Guix Build: