Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

content-app will fully stream a file to the user before doing any checksum verification #5012

Open
dralley opened this issue Feb 1, 2024 · 0 comments

Comments

@dralley
Copy link
Contributor

dralley commented Feb 1, 2024

Version

Any

Describe the bug

In the event that a file pointed to by a remote artifact is changed but left in place (i.e. doesn't 404) and no longer represents the original file (different checksums), the content app will still stream the file in its entirety back to the client before pulp verifies the checksum, which it does when saving the file to artifact storage.

An error message will be printed when this fails, but at this point, the wrong file has already been delivered.

Correct behavior in this situation is actually a bit ambiguous because the chance to download the original copy of the file is probably gone now that it has been replaced at the upstream source. And there's no HTTP status code for "checksum validation failed", so the best we could do might be HTTP status code 404 or 410.

To Reproduce

  1. Create a local repository from where pulp will sync. Ensure you will be able to make changes on the repository content.

Example of a small repository, easy to create on localhost:

mkdir /var/www/html/pub/repo           
cd /var/www/html/pub/repo
curl -O https://repos.fedorapeople.org/pulp/pulp/demo_repos/zoo/bear-4.1-1.noarch.rpm
createrepo .
  1. create a repository that will sync from http://localhost/pub/repo/ and sync it:

  2. Change the content of the rpmfile bear-4.1-1.noarch.rpm from the upstream (on localhost) repository. Ensure the checksum of the file changed!!

$ sha256sum /var/www/html/pub/repo/bear-4.1-1.noarch.rpm 
7a831f9f90bf4d21027572cb503d20b702de8e8785b02c0397445c2e481d81b3  /var/www/html/pub/repo/bear-4.1-1.noarch.rpm    <=== original checksum 

$ echo "CORRUPTED" >> /var/www/html/pub/repo/bear-4.1-1.noarch.rpm                                           <=== forcing a corruption of the file

$ sha256sum /var/www/html/pub/repo/bear-4.1-1.noarch.rpm 
1469d4fc47073fffbecf8e68419adf804de3c9c3645719779778fd0d634fca0e  /var/www/html/pub/repo/bear-4.1-1.noarch.rpm   <==== corrupted checksum
  1. Try downloading the file from the repository created in pulp:
$ curl -O $path_to_pulp_repo/Packages/b/bear-4.1-1.noarch.rpm
$ echo $?     <=== note that command returns no error

$ sha256sum bear-4.1-1.noarch.rpm         <====== checksum is something else

Actual results:

Client receives a bad file and no errors.

Pulpcore-content logs errors when saving the artifact:

Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]: Giving up download_wrapper(...) after 5 tries (pulpcore.exceptions.validation.DigestValidationError: A file located at the url http://localhost/pub/repo/bear-4.1-1.noarch.rpm failed validation due to checksum. Expected '7a831f9f90bf4d21027572cb503d20b702de8e8785b02c0397445c2e481d81b3', Actual '1469d4fc47073fffbecf8e68419adf804de3c9c3645719779778fd0d634fca0e')
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]: pulp [None]: backoff:ERROR: Giving up download_wrapper(...) after 5 tries (pulpcore.exceptions.validation.DigestValidationError: A file located at the url http://localhost/pub/repo/bear-4.1-1.noarch.rpm failed validation due to checksum. Expected '7a831f9f90bf4d21027572cb503d20b702de8e8785b02c0397445c2e481d81b3', Actual '1469d4fc47073fffbecf8e68419adf804de3c9c3645719779778fd0d634fca0e')
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]: [2024-01-31 16:10:08 +0000] [1141] [ERROR] Error handling request
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]: Traceback (most recent call last):
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]:   File "/usr/lib64/python3.9/site-packages/aiohttp/web_protocol.py", line 435, in _handle_request
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]:     resp = await request_handler(request)
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]:   File "/usr/lib64/python3.9/site-packages/aiohttp/web_app.py", line 504, in _handle
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]:     resp = await handler(request)
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]:   File "/usr/lib64/python3.9/site-packages/aiohttp/web_middlewares.py", line 117, in impl
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]:     return await handler(request)
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]:   File "/usr/lib/python3.9/site-packages/pulpcore/content/authentication.py", line 41, in authenticate
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]:     return await handler(request)
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]:   File "/usr/lib/python3.9/site-packages/pulpcore/cache/cache.py", line 339, in cached_function
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]:     response = await self.make_entry(
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]:   File "/usr/lib/python3.9/site-packages/pulpcore/cache/cache.py", line 378, in make_entry
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]:     response = await handler(*args, **kwargs)
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]:   File "/usr/lib/python3.9/site-packages/pulpcore/content/handler.py", line 237, in stream_content
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]:     return await self._match_and_stream(path, request)
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]:   File "/usr/lib/python3.9/site-packages/pulpcore/content/handler.py", line 544, in _match_and_stream
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]:     return await self._stream_content_artifact(
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]:   File "/usr/lib/python3.9/site-packages/pulpcore/content/handler.py", line 693, in _stream_content_artifact
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]:     response = await self._stream_remote_artifact(request, response, remote_artifact)
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]:   File "/usr/lib/python3.9/site-packages/pulpcore/content/handler.py", line 943, in _stream_remote_artifact
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]:     download_result = await downloader.run()
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]:   File "/usr/lib/python3.9/site-packages/pulpcore/download/http.py", line 273, in run
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]:     return await download_wrapper()
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]:   File "/usr/lib/python3.9/site-packages/backoff/_async.py", line 151, in retry
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]:     ret = await target(*args, **kwargs)
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]:   File "/usr/lib/python3.9/site-packages/pulpcore/download/http.py", line 258, in download_wrapper
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]:     return await self._run(extra_data=extra_data)
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]:   File "/usr/lib/python3.9/site-packages/pulp_rpm/app/downloaders.py", line 118, in _run
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]:     to_return = await self._handle_response(response)
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]:   File "/usr/lib/python3.9/site-packages/pulpcore/download/http.py", line 209, in _handle_response
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]:     await self.finalize()
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]:   File "/usr/lib/python3.9/site-packages/pulpcore/content/handler.py", line 934, in finalize
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]:     await original_finalize()
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]:   File "/usr/lib/python3.9/site-packages/pulpcore/download/base.py", line 171, in finalize
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]:     self.validate_digests()
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]:   File "/usr/lib/python3.9/site-packages/pulpcore/download/base.py", line 223, in validate_digests
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]:     raise DigestValidationError(actual_digest, expected_digest, url=self.url)
Jan 31 11:10:08 satellite.example.com pulpcore-content[1141]: pulpcore.exceptions.validation.DigestValidationError: A file located at the url http://localhost/pub/repo/bear-4.1-1.noarch.rpm failed validation due to checksum. Expected '7a831f9f90bf4d21027572cb503d20b702de8e8785b02c0397445c2e481d81b3', Actual '1469d4fc47073fffbecf8e68419adf804de3c9c3645719779778fd0d634fca0e'

Expected behavior

Well, presumably if the upstream file has changed out from under us, then the chance to download the original is gone, so probably an HTTP status code 404 or 410?

Additional context
https://bugzilla.redhat.com/show_bug.cgi?id=2262110

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants