RequestRangeNotSatisfiable when using a blob's file-handler with csv reader #403

Closed
bgirschig opened this issue Mar 31, 2021 · 3 comments
Labels
  • api: storage (Issues related to the googleapis/python-storage API.)
  • type: question (Request for information or clarification. Not an issue.)

@bgirschig

Environment details

  • OS: 5.7.17-1rodete5-amd64
  • Python: 3.8.7
  • pip: 20.1.1
  • google-cloud-storage: 1.37.0

Steps to reproduce

  1. Open a blob for reading
  2. Pass the resulting file object to Python's built-in csv.reader
  3. A RequestRangeNotSatisfiable exception is raised

Code example

from os import path
import csv
from io import StringIO
from google.cloud import storage
import sys

print(f"gcs client version: {storage.__version__}")
print(f"python version: {sys.version}")

gcs = storage.Client()
blob = gcs.get_bucket("my_bucket").get_blob("path/to/my_csv_file.csv")

print(f"blob exists: {blob.exists()}")

with blob.open() as f:
  try:
    data = [row for row in csv.reader(f)]
    print(f"first method worked: {len(data)} csv lines loaded")
  except Exception as e:
    print(f"first method failed: {type(e).__name__}")

with blob.open() as f:
  try:
    text_data = f.read()
    buffer = StringIO(text_data)
    data = [row for row in csv.reader(buffer)]
    print(f"second method worked: {len(data)} csv lines loaded")
  except Exception as e:
    print(f"second method failed: {type(e).__name__}")

# ======================= output =======================
# gcs client version: 1.37.0
# python version: 3.8.7 (default, Dec 22 2020, 10:37:26) 
# [GCC 10.2.0]
# blob exists: True
# first method failed: RequestRangeNotSatisfiable
# second method worked: 83521 csv lines loaded

Stack trace

Traceback (most recent call last):
  File "bug_report.py", line 17, in <module>
    data = [row for row in csv.reader(f)]
  File "bug_report.py", line 17, in <listcomp>
    data = [row for row in csv.reader(f)]
  File "/home/bgirschig/Documents/projects/draw_to_art/drawtoart_clean/env/lib/python3.8/site-packages/google/cloud/storage/fileio.py", line 112, in read1
    return self.read(size)
  File "/home/bgirschig/Documents/projects/draw_to_art/drawtoart_clean/env/lib/python3.8/site-packages/google/cloud/storage/fileio.py", line 96, in read
    result += self._blob.download_as_bytes(
  File "/home/bgirschig/Documents/projects/draw_to_art/drawtoart_clean/env/lib/python3.8/site-packages/google/cloud/storage/blob.py", line 1296, in download_as_bytes
    client.download_blob_to_file(
  File "/home/bgirschig/Documents/projects/draw_to_art/drawtoart_clean/env/lib/python3.8/site-packages/google/cloud/storage/client.py", line 731, in download_blob_to_file
    _raise_from_invalid_response(exc)
  File "/home/bgirschig/Documents/projects/draw_to_art/drawtoart_clean/env/lib/python3.8/site-packages/google/cloud/storage/blob.py", line 4061, in _raise_from_invalid_response
    raise exceptions.from_http_status(response.status_code, message, response=response)
google.api_core.exceptions.RequestRangeNotSatisfiable: 416 GET https://storage.googleapis.com/download/storage/v1/b/cilex-drawtoart/o/datasets%2F83.5K%2Fids.csv?generation=1617032864116691&alt=media: Request range not satisfiable: ('Request failed with status code', 416, 'Expected one of', <HTTPStatus.OK: 200>, <HTTPStatus.PARTIAL_CONTENT: 206>)

Thanks!

product-auto-label bot added the api: storage label Mar 31, 2021
shaffeeullah added the type: question label Mar 31, 2021
@andrewsg
Contributor

andrewsg commented Apr 2, 2021

We've found that this happens when read() attempts to request bytes from beyond the end of the file. It should be fixed in #400 (already merged), which will be released on Monday. Thanks a lot for the detailed report.
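
To make the cause described above concrete, here is a minimal sketch that simulates it without touching GCS. Everything in it is hypothetical and for illustration only: fetch_range stands in for a ranged download, RangeNotSatisfiable stands in for the HTTP 416 error, and the guarded reader assumes the object size is known up front. It is not the code from #400.

```python
class RangeNotSatisfiable(Exception):
    """Hypothetical stand-in for an HTTP 416 response."""


def fetch_range(data, start, end):
    """Simulate a ranged GET over in-memory bytes (end is inclusive)."""
    if start >= len(data):
        # A range that starts at or past the end cannot be satisfied.
        raise RangeNotSatisfiable(
            f"bytes={start}-{end} is outside a {len(data)}-byte object"
        )
    return data[start:end + 1]


def read_all_naive(data, chunk_size):
    """Request chunks until an empty one comes back.

    After the last byte is consumed, the loop still issues one more
    request, and that request starts past the end of the object.
    """
    pos, out = 0, b""
    while True:
        chunk = fetch_range(data, pos, pos + chunk_size - 1)
        if not chunk:
            break
        out += chunk
        pos += len(chunk)
    return out


def read_all_guarded(data, chunk_size):
    """Stop at the known size instead of asking for bytes past the end."""
    size = len(data)  # assumption: the size is available before reading
    pos, out = 0, b""
    while pos < size:
        end = min(pos + chunk_size, size) - 1
        chunk = fetch_range(data, pos, end)
        out += chunk
        pos += len(chunk)
    return out
```

In this simulation read_all_naive fails on its final request, while read_all_guarded returns the full content because it never requests a range that starts at or beyond the object size.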

andrewsg closed this as completed Apr 2, 2021
@bgirschig
Author

That's great news, thanks!

@andrewsg
Contributor

andrewsg commented Apr 5, 2021

This has been released.
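
For anyone checking whether their environment already has the release, a rough sketch follows. The thread does not state which version number carries the fix, so this only tests whether the installed google-cloud-storage is newer than 1.37.0, the version in the original report. The helper names are made up for this example, and the version parsing is deliberately simplistic.

```python
import re
from importlib import metadata

REPORTED_VERSION = "1.37.0"  # version from the report's environment details


def release_tuple(version):
    """Turn '1.37.0' into (1, 37, 0), keeping only leading digits per part."""
    numbers = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def is_newer_than_reported(installed, reported=REPORTED_VERSION):
    """True if the installed version sorts after the reported one."""
    return release_tuple(installed) > release_tuple(reported)


def installed_storage_version():
    """Return the installed google-cloud-storage version, or None if absent."""
    try:
        return metadata.version("google-cloud-storage")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_storage_version()
    if found is None:
        print("google-cloud-storage is not installed")
    elif is_newer_than_reported(found):
        print(f"{found} is newer than {REPORTED_VERSION}")
    else:
        print(f"{found} is not newer than {REPORTED_VERSION}; consider upgrading")
```

A result of "newer than 1.37.0" is only a hint, since the exact fixed version is not given here; rerunning the first method from the report is the more direct check.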
