chapter 04: df07.py - Unable to open file: gs://BUCKETNAME/flights/staging/ch04timecorr.1656567385.996847/pipeline.pb. #151

Open
Acturio opened this issue Jun 30, 2022 · 3 comments

Acturio commented Jun 30, 2022

Hi! I get the following log when I try to run df07.py.

./df07.py --project PROJECT --bucket BUCKETNAME --region us-central1
Correcting timestamps and writing to BigQuery dataset
/home/act_arturo_b/.local/lib/python3.9/site-packages/apache_beam/io/gcp/bigquery.py:2527: BeamDeprecationWarning: options is deprecated since First stable release. References to .options will not be supported
temp_location = pcoll.pipeline.options.view_as(
/home/act_arturo_b/.local/lib/python3.9/site-packages/apache_beam/io/gcp/bigquery_file_loads.py:1129: BeamDeprecationWarning: options is deprecated since First stable release. References to .options will not be supported
temp_location = p.options.view_as(GoogleCloudOptions).temp_location
warning: sdist: standard file not found: should have one of README, README.rst, README.txt, README.md

ERROR:apache_beam.runners.dataflow.dataflow_runner:Console URL: https://console.cloud.google.com/dataflow/jobs//2022-06-29_22_36_30-1790320629162913076?project=
Traceback (most recent call last):
File "/home/act_arturo_b/data-science-on-gcp/04_streaming/transform/./df07.py", line 202, in <module>
run(project=args['project'], bucket=args['bucket'], region=args['region'])
File "/home/act_arturo_b/data-science-on-gcp/04_streaming/transform/./df07.py", line 177, in run
(events
File "/home/act_arturo_b/.local/lib/python3.9/site-packages/apache_beam/pipeline.py", line 598, in __exit__
self.result.wait_until_finish()
File "/home/act_arturo_b/.local/lib/python3.9/site-packages/apache_beam/runners/dataflow/dataflow_runner.py", line 1673, in wait_until_finish
raise DataflowRuntimeException(
apache_beam.runners.dataflow.dataflow_runner.DataflowRuntimeException: Dataflow pipeline failed. State: FAILED, Error:
Unable to open file: gs://BUCKETNAME/flights/staging/ch04timecorr.1656567385.996847/pipeline.pb.

Any suggestions will be appreciated. Thank you.

lakshmanok (Contributor) commented Jun 30, 2022 via email

Acturio (Author) commented Jun 30, 2022

I appreciate the quick response. This is the latest log:

act_arturo_b@cloudshell:~/data-science-on-gcp/04_streaming/transform (ds-on-gcp-353305)$ ./df07.py --project ds-on-gcp-353305 --bucket ${BUCKETNAME} --region us-central1
Correcting timestamps and writing to BigQuery dataset
/home/act_arturo_b/.local/lib/python3.9/site-packages/apache_beam/io/gcp/bigquery.py:2527: BeamDeprecationWarning: options is deprecated since First stable release. References to .options will not be supported
temp_location = pcoll.pipeline.options.view_as(
/home/act_arturo_b/.local/lib/python3.9/site-packages/apache_beam/io/gcp/bigquery_file_loads.py:1129: BeamDeprecationWarning: options is deprecated since First stable release. References to .options will not be supported
temp_location = p.options.view_as(GoogleCloudOptions).temp_location
warning: sdist: standard file not found: should have one of README, README.rst, README.txt, README.md

ERROR:apache_beam.runners.dataflow.dataflow_runner:Console URL: https://console.cloud.google.com/dataflow/jobs//2022-06-29_23_27_32-11374214288357084698?project=
Traceback (most recent call last):
File "/home/act_arturo_b/data-science-on-gcp/04_streaming/transform/./df07.py", line 202, in <module>
run(project=args['project'], bucket=args['bucket'], region=args['region'])
File "/home/act_arturo_b/data-science-on-gcp/04_streaming/transform/./df07.py", line 177, in run
(events
File "/home/act_arturo_b/.local/lib/python3.9/site-packages/apache_beam/pipeline.py", line 598, in __exit__
self.result.wait_until_finish()
File "/home/act_arturo_b/.local/lib/python3.9/site-packages/apache_beam/runners/dataflow/dataflow_runner.py", line 1673, in wait_until_finish
raise DataflowRuntimeException(
apache_beam.runners.dataflow.dataflow_runner.DataflowRuntimeException: Dataflow pipeline failed. State: FAILED, Error:
Unable to open file: gs://ds-on-gcp-353305-dsongcp/flights/staging/ch04timecorr.1656570447.957722/pipeline.pb.

The problem is the same.

Any help will be appreciated.
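
A minimal sketch for narrowing this down, assuming the error means the staged pipeline.pb object is missing or unreadable (nothing in this thread confirms the cause). It only extracts the gs:// path from the error text and splits it into a bucket and an object name, so each can be checked separately. The helper names `staging_path_from_error` and `split_gcs_uri` are hypothetical and are not part of df07.py or Apache Beam.

```python
import re
from urllib.parse import urlparse


def staging_path_from_error(message):
    """Return the gs:// path named in an 'Unable to open file' message, or None."""
    match = re.search(r"Unable to open file:\s*(gs://\S+)", message)
    if match is None:
        return None
    # The log line ends the sentence with a period; drop it from the path.
    return match.group(1).rstrip(".")


def split_gcs_uri(uri):
    """Split a gs:// URI into (bucket, object_name)."""
    parsed = urlparse(uri)
    if parsed.scheme != "gs" or not parsed.netloc:
        raise ValueError(f"not a gs:// URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


# Error line copied from the log above.
error_line = (
    "Unable to open file: gs://ds-on-gcp-353305-dsongcp/flights/staging/"
    "ch04timecorr.1656570447.957722/pipeline.pb."
)

path = staging_path_from_error(error_line)
bucket, object_name = split_gcs_uri(path)
print("bucket:", bucket)
print("object:", object_name)
```

The resulting bucket and object name can then be checked by hand, for example by listing the staging folder with `gsutil ls` to see whether the bucket exists and whether the object was uploaded.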

lakshmanok (Contributor) commented Oct 11, 2022 via email
