Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support range requests #7160

Merged
merged 3 commits into from May 16, 2024
Merged

Support range requests #7160

merged 3 commits into from May 16, 2024

Conversation

micahflee
Copy link
Contributor

@micahflee micahflee commented May 9, 2024

Status

Ready for review

Description of Changes

Fixes #3907.

This adds a range request test, and also makes it so the two routes that we need to support range requests include the Accept-Ranges: bytes header. The routes are:

  • /sources/<source_uuid>/submissions/<submission_uuid>/download
  • /sources/<source_uuid>/replies/<reply_uuid>/download

Miraculously, implementing range requests themselves doesn't seem to be necessary because ... the test passes. It's already built-in to the version of Flask we're using, apparently.

And just for some background, here's the download submission route:

@api.route("/sources/<source_uuid>/submissions/<submission_uuid>/download", methods=["GET"])
def download_submission(source_uuid: str, submission_uuid: str) -> flask.Response:
    get_or_404(Source, source_uuid, column=Source.uuid)
    submission = get_or_404(Submission, submission_uuid, column=Submission.uuid)
    return utils.serve_file_with_etag(submission)

This responds with the file in utils.serve_file_with_etag. Here's that function:

def serve_file_with_etag(db_obj: Union[Reply, Submission]) -> flask.Response:
    file_path = Storage.get_default().path(db_obj.source.filesystem_id, db_obj.filename)
    response = send_file(
        file_path, mimetype="application/pgp-encrypted", as_attachment=True, etag=False
    )  # Disable Flask default ETag

    if not db_obj.checksum:
        add_checksum_for_file(db.session, db_obj, file_path)

    response.direct_passthrough = False
    response.headers["Etag"] = db_obj.checksum
    response.headers["Accept-Ranges"] = "bytes"
    return response

All the API is doing is using Flask's send_file function to load the encrypted file from disk and send it to the http client. There's no compression or anything else happening here, so range support should be easy.

Testing

You can confirm it works by running the test and seeing it pass:

./securedrop/bin/dev-shell bin/run-test --no-cov tests/test_journalist_api.py::test_download_submission_range

But I didn't quite believe it, so I manually tested it too like this:

To test using curl, first authenticate to the API, of course changing one_time_code:

curl \
-X POST \
-H "Content-Type: application/json" \
-d '{"username": "journalist", "passphrase": "correct horse battery staple profanity oil chewy", "one_time_code": "887767"}' \
http://localhost:8081/api/v1/token

The response will be something like this:

{
  "expiration": "2024-05-10T00:57:42.102593Z", 
  "journalist_first_name": null, 
  "journalist_last_name": null, 
  "journalist_uuid": "54688953-a6a4-407b-b916-e64a4d3cf849", 
  "token": "Im83ZGtPLWFUd3gxM1ZtOE1OMEdWcFZEeGlEQVRUWUQ1TjJ5WnNZTTA5NGci.Zj1U5Q.8zYmrlSpjzNX63SVsZZhOeJ8Sq0"
}

Set the token for subsequent requests, changing the token to whatever you got:

TOKEN=Im83ZGtPLWFUd3gxM1ZtOE1OMEdWcFZEeGlEQVRUWUQ1TjJ5WnNZTTA5NGci.Zj1U5Q.8zYmrlSpjzNX63SVsZZhOeJ8Sq0

Get the download URL for the last source's only submission (this uses jq so you might need to install it):

SUBMISSION_URL=$(curl -H "Authorization: Token $TOKEN" http://localhost:8081/api/v1/sources | jq -r '.sources[-1].submissions_url')
DOWNLOAD_URL=$(curl -H "Authorization: Token $TOKEN" http://localhost:8081$SUBMISSION_URL | jq -r '.submissions[0].download_url')

Download it:

curl -v \
-H "Authorization: Token $TOKEN" \
-o logo.png.gpg \
http://localhost:8081$DOWNLOAD_URL

Now download just first 100 bytes:

curl -v \
-H "Authorization: Token $TOKEN" \
-H "Range: bytes=0-99" \
-o logo.png.gpg.part-0-99 \
http://localhost:8081$DOWNLOAD_URL

Now grab the first 100 bytes from the full file:

dd if=logo.png.gpg of=logo.png.gpg.dd-0-99 bs=1 count=100

And compare:

$ ls -l logo.png.gpg.*
-rw-r--r-- 1 user user 100 May  9 16:11 logo.png.gpg.dd-0-99
-rw-r--r-- 1 user user 100 May  9 16:11 logo.png.gpg.part-0-99
$ sha256sum logo.png.gpg.*
b98d6f101320b7fbc88d2fb59c03762f847763654c136df81017ffe177173181  logo.png.gpg.dd-0-99
b98d6f101320b7fbc88d2fb59c03762f847763654c136df81017ffe177173181  logo.png.gpg.part-0-99

@micahflee micahflee marked this pull request as ready for review May 10, 2024 16:07
@micahflee micahflee requested a review from a team as a code owner May 10, 2024 16:07
@micahflee
Copy link
Contributor Author

I'm not sure why the CircleCI lint and deb-tests are failing, so any help on figuring that out is appreciated.

@zenmonkeykstop
Copy link
Contributor

I'm not sure why the CircleCI lint and deb-tests are failing, so any help on figuring that out is appreciated.

It's not anything you've changed, probably an updated runner on their side. Bumped the docker api version in the circleci config to see if that clears it.

Copy link
Contributor

@zenmonkeykstop zenmonkeykstop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test plan checks out! Easy win for sure.

@zenmonkeykstop zenmonkeykstop merged commit b48231d into develop May 16, 2024
17 checks passed
@zenmonkeykstop zenmonkeykstop deleted the 3907-range-requests branch May 16, 2024 20:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Support range requests (RFC 7233) on journalist interface
2 participants