Dataflow bucket creation #3216

Open
elharo opened this issue Jul 9, 2018 · 4 comments

Comments

@elharo
Contributor

elharo commented Jul 9, 2018

Problem using a newly created bucket while running the test plan. Investigating...

Exception in thread "main" java.lang.RuntimeException: Failed to construct instance from factory method DataflowRunner#fromOptions(interface org.apache.beam.sdk.options.PipelineOptions)
	at org.apache.beam.sdk.util.InstanceBuilder.buildFromMethod(InstanceBuilder.java:233)
	at org.apache.beam.sdk.util.InstanceBuilder.build(InstanceBuilder.java:162)
	at org.apache.beam.sdk.PipelineRunner.fromOptions(PipelineRunner.java:55)
	at org.apache.beam.sdk.Pipeline.create(Pipeline.java:150)
	at com.example.StarterPipeline.main(StarterPipeline.java:50)
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.beam.sdk.util.InstanceBuilder.buildFromMethod(InstanceBuilder.java:222)
	... 4 more
Caused by: java.lang.IllegalArgumentException: Missing object or bucket in path: 'gs://bar45679/', did you mean: 'gs://some-bucket/bar45679'?
	at org.apache.beam.repackaged.beam_sdks_java_extensions_google_cloud_platform_core.com.google.common.base.Preconditions.checkArgument(Preconditions.java:383)
	at org.apache.beam.sdk.extensions.gcp.storage.GcsPathValidator.verifyPath(GcsPathValidator.java:77)
	at org.apache.beam.sdk.extensions.gcp.storage.GcsPathValidator.validateOutputFilePrefixSupported(GcsPathValidator.java:60)
	at org.apache.beam.runners.dataflow.DataflowRunner.fromOptions(DataflowRunner.java:245)
	... 9 more
@elharo elharo self-assigned this Jul 9, 2018
@elharo
Contributor Author

elharo commented Jul 9, 2018

Wondering if something changed recently in bucket validation:

  @Override
  public String verifyPath(String path) {
    GcsPath gcsPath = getGcsPath(path);
    checkArgument(gcsPath.isAbsolute(), "Must provide absolute paths for Dataflow");
    checkArgument(!gcsPath.getObject().isEmpty(),
        "Missing object or bucket in path: '%s', did you mean: 'gs://some-bucket/%s'?",
        gcsPath, gcsPath.getBucket());
    checkArgument(!gcsPath.getObject().contains("//"),
        "Dataflow Service does not allow objects with consecutive slashes");
    return gcsPath.toResourceName();
  }

@elharo
Contributor Author

elharo commented Jul 9, 2018

Nope, that hasn't changed in a while. Maybe GcsPath?

@chanseokoh
Contributor

You should always specify a subfolder, for example gs://bucket/some-folder/. There was an issue about this in our repo that was closed as WAI (working as intended).

@elharo
Contributor Author

elharo commented Jul 9, 2018

OK, the UI here is confusing then. We should disable the run button and probably put up an error decorator or message until the user has typed in a full path.
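
A rough sketch of the kind of check the UI could run before enabling the run button. This is only an illustration: `StagingLocationCheck`, `validate`, and `isRunnable` are hypothetical names, not existing classes or methods in this plugin or in Beam, and the regular expression is an assumption that approximates the `GcsPathValidator.verifyPath` rules quoted above (non-empty object, no consecutive slashes) rather than reusing Beam's `GcsPath` parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical UI-side helper; not part of Beam or of this plugin. */
final class StagingLocationCheck {

  // Assumed shape: gs://<bucket>[/<object>]. Group 1 is the bucket,
  // group 2 is everything after the first slash following the bucket.
  private static final Pattern GCS_URI = Pattern.compile("gs://([^/]+)(?:/(.*))?");

  private StagingLocationCheck() {}

  /**
   * Returns null when the path looks usable, otherwise a message that an
   * error decorator could display next to the text field.
   */
  static String validate(String path) {
    if (path == null) {
      return "Enter a path such as gs://bucket/some-folder/";
    }
    Matcher matcher = GCS_URI.matcher(path.trim());
    if (!matcher.matches()) {
      return "Path must look like gs://bucket/some-folder/";
    }
    String object = matcher.group(2);
    // "gs://bucket" and "gs://bucket/" both have an empty object part,
    // which is the case that fails in the stack trace above.
    if (object == null || object.isEmpty()) {
      return "Add a folder after the bucket, for example gs://"
          + matcher.group(1) + "/some-folder/";
    }
    if (object.contains("//")) {
      return "Path must not contain consecutive slashes";
    }
    return null;
  }

  /** True when the run button could be enabled for this path. */
  static boolean isRunnable(String path) {
    return validate(path) == null;
  }
}
```

With this sketch, `isRunnable("gs://bar45679/")` is false and `isRunnable("gs://bar45679/staging/")` is true, so the button would stay disabled for the bare bucket that triggered the exception.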

@elharo elharo removed their assignment Jul 16, 2018