CommandException: No URLs matched when passing URLs to rm from stdin #490

Open
ipince opened this issue Dec 21, 2017 · 17 comments

@ipince

ipince commented Dec 21, 2017

I'm trying to use gsutil rm -I and pass a list of URLs to delete through stdin.

For an existing directory in, say, gs://test-bucket/test-dir, these are some commands I've tried:

# verify directory exists
$ gsutil ls -d gs://test-bucket/test-dir
gs://test-bucket/test-dir/

$ echo "gs://test-bucket/test-dir" | gsutil -m rm -r -I
CommandException: No URLs matched

$ echo gs://test-bucket/test-dir | gsutil -m rm -r -I
CommandException: No URLs matched

$ gsutil -m rm -r -I <<< "gs://test-bucket/test-dir"
CommandException: No URLs matched

$ gsutil ls -d gs://test-bucket/test-dir | gsutil -m rm -r -I
CommandException: No URLs matched

Am I missing something here?

@houglum
Collaborator

houglum commented Jan 2, 2018

Nope, I don't think you're missing anything -- I can reproduce this as well. Running the gsutil rm command above with the -DD flag shows that gsutil isn't even making an API call to check for the object in question.

Looking in name_expansion.py, we're creating a PluralityCheckableIterator, which wraps a NameExpansionIterator, which wraps another PluralityCheckableIterator object that wraps the generator that's supposed to read lines from stdin (phew). Anyway, I threw a few debugging print statements into the _PopulateHead() method in plurality_checkable_iterator.py, and found that the underlying generator is throwing a StopIteration exception. Not quite sure why yet -- I'll continue to investigate soon.

Notes to self:

  • Go back and do a binary search through recent commits and see at what point this started happening.
  • Why didn't tests catch this?
  • This same approach works for the cp command, but doesn't work for rm.

@blixt

blixt commented Feb 8, 2018

I ran into this issue too today (gsutil version 4.28). Are there any known workarounds?

@houglum
Collaborator

houglum commented Feb 10, 2018

Off the top of my head, I can only think of one: You could write a thin wrapper script to pass the arguments to gsutil yourself (using something like xargs), making sure each gsutil invocation gets a batch of objects small enough to stay under your system's ARG_MAX limit (see `getconf ARG_MAX`).
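
As a rough illustration of the wrapper idea above, here is a minimal Python sketch. It assumes the URLs arrive one per line and that `gsutil` is on the PATH. The helper names `batched` and `delete_in_batches` and the default batch size of 500 are hypothetical choices for illustration, not part of gsutil; the right batch size depends on URL lengths and your system's argument-length limit.

```python
import subprocess
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def batched(urls: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most batch_size non-empty, stripped URLs."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    stripped = (u.strip() for u in urls)
    non_empty = (u for u in stripped if u)
    while True:
        batch = list(islice(non_empty, batch_size))
        if not batch:
            return
        yield batch


def delete_in_batches(
    urls: Iterable[str],
    batch_size: int = 500,  # arbitrary illustrative default
    run: Callable[..., object] = subprocess.run,
) -> int:
    """Invoke `gsutil -m rm -r` once per batch, passing URLs as arguments
    rather than through stdin. Returns the number of invocations made."""
    calls = 0
    for batch in batched(urls, batch_size):
        run(["gsutil", "-m", "rm", "-r", *batch], check=True)
        calls += 1
    return calls
```

A script could call `delete_in_batches(sys.stdin)` to consume a piped list of URLs. Because the URLs are passed on the command line, this avoids the `-I` code path entirely.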

@GoogleCloudPlatform GoogleCloudPlatform deleted a comment from 1tron1 Feb 16, 2018
@mystredesign

I'm seeing the same thing too...
CommandException: No URLs matched

@hardikmodha

I'm seeing the same exception while copying a tar file to my bucket. CommandException: No URLs matched

@yaseenkhanmohmand

Seeing the same problem. Any suggested solutions?

@ramaniak

ramaniak commented Feb 8, 2019

not sure if there is a resolution:
I am running --
gsutil version: 4.34

and get this

CommandException: Destination URL must name a directory, bucket, or bucket
subdirectory for the multiple source form of the cp command.

when trying to execute:

gsutil cp -n -R gs://hail-common/vep/vep/GRCh37/loftee_data /vep/loftee_data_grch37

Please let me know if I should post this elsewhere.
thanks

@ElnazBigdeli

I had the same problem, I have changed the name of the file and it worked :).

@tal-franji-immunai

Any news on this one?

@mvxt

mvxt commented Feb 18, 2021

Same issue, on gsutil v4.59. Trying to remove the bucket and getting the same error even though the bucket clearly exists when looking at it on the console.

@dilipped
Collaborator

Sorry for the delay in response. Our team is currently occupied with other priorities and does not have the bandwidth to address this issue at the moment. However, I did some investigation for future reference.

This seems to be happening because url_strs gets iterated twice: once here, if recursion is requested,

if self.recursion_requested:

and again when it gets passed to the NameExpansionIterator:

url_strs,

So essentially, we are trying to iterate over the iterator twice and hence on the second instance, we get an empty iterator.
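
The effect described above can be reproduced in isolation. The following is a minimal sketch, not gsutil's code: `stdin_like_iterator` is a hypothetical stand-in for a single-pass iterator over stdin lines, and the two list comprehensions stand in for the two places that consume `url_strs`.

```python
from typing import Iterator, List


def stdin_like_iterator(lines: List[str]) -> Iterator[str]:
    """Hypothetical stand-in for a single-pass iterator over stdin lines."""
    for line in lines:
        yield line.rstrip("\n")


url_strs = stdin_like_iterator(["gs://test-bucket/test-dir\n"])

# First pass: something like the recursion check walks every URL.
first_pass = [url for url in url_strs]

# Second pass: the name expansion step receives the same iterator,
# which is now exhausted and yields nothing.
second_pass = [url for url in url_strs]

print(first_pass)   # ['gs://test-bucket/test-dir']
print(second_pass)  # []
```

An empty second pass is consistent with the "No URLs matched" error and with the StopIteration observed earlier in the thread.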

The easy fix would be to convert the iterator to a list, i.e., changing

url_strs = StdinIterator()
to

url_strs = [url for url in StdinIterator()]

But this can affect users who have a really long list coming from stdin, or users who are already using this feature in a pipeline and not really using -r with -I. Note that the bug itself will only affect you if you are using -r and -I together.
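
One way to limit that trade-off, sketched here purely as an assumption and not as gsutil's actual fix, is to buffer the input only when two passes are really needed. The helper name `prepare_url_strs` is hypothetical; `recursion_requested` mirrors the attribute name quoted above.

```python
from typing import Iterable


def prepare_url_strs(url_strs: Iterable[str], recursion_requested: bool) -> Iterable[str]:
    """Return an iterable that is safe for the number of passes required."""
    if recursion_requested:
        # Two passes are coming (recursion check, then name expansion),
        # so hold every URL in memory.
        return list(url_strs)
    # Only one pass is needed: keep streaming without buffering.
    return url_strs


# With recursion, the result can be walked twice.
buffered = prepare_url_strs(iter(["gs://b/x", "gs://b/y"]), recursion_requested=True)
print(list(buffered) == list(buffered))  # True

# Without recursion, the original iterator is passed through untouched.
stream = iter(["gs://b/x"])
print(prepare_url_strs(stream, recursion_requested=False) is stream)  # True
```

Users who pipe long lists without -r would keep the streaming behavior, while the -r case pays the memory cost only where it is required.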

The ideal fix would be to remove the recursion special case and instead handle the bucket deletion based on the NameExpansionIterator result itself.

A workaround would be something that is suggested here #490 (comment)

Alternatively, you can avoid using recursion (-r option) and pass in the list

gsutil ls gs://my_bucket/** | gsutil -m rm -I

Note that the above command will empty the bucket but will not remove it; you will have to run a separate command to remove the bucket itself.
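
For scripted use, the two steps could be combined roughly as follows. This is a sketch under assumptions: the function names are hypothetical, `gsutil rb` is used as the separate bucket-removal command, the whole listing is buffered in memory, and failures such as listing an already empty bucket are not handled.

```python
import subprocess
from typing import List


def empty_bucket_commands(bucket_url: str) -> List[List[str]]:
    """Build the argv lists for listing, deleting, and removing a bucket."""
    bucket_url = bucket_url.rstrip("/")
    return [
        ["gsutil", "ls", f"{bucket_url}/**"],  # list every object
        ["gsutil", "-m", "rm", "-I"],          # delete URLs read from stdin (no -r)
        ["gsutil", "rb", bucket_url],          # remove the now-empty bucket
    ]


def empty_and_remove_bucket(bucket_url: str) -> None:
    """Run the three commands in order, piping the listing into rm -I."""
    list_cmd, rm_cmd, rb_cmd = empty_bucket_commands(bucket_url)
    listing = subprocess.run(list_cmd, check=True, capture_output=True, text=True)
    subprocess.run(rm_cmd, input=listing.stdout, check=True, text=True)
    subprocess.run(rb_cmd, check=True)
```

Calling `empty_and_remove_bucket("gs://my_bucket")` would run the same pipeline as the shell command above, followed by the bucket removal.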

@lghasemzadeh

lghasemzadeh commented Nov 2, 2021

> I had the same problem, I have changed the name of the file and it worked :).

Hello @ElnazBigdeli,
I want to upload an image from my local machine to Google Cloud, and I get the same error.
Did you rename the file you wanted to upload, the bucket name, or the client secrets file?
It seems there are some rules for naming files in Google Cloud; do you have a link regarding that?
Thank you.

@yanislavzagorov

Any fix for this issue yet? A bit odd that this has been a known issue since (at least) 2018.

@urmich

urmich commented Nov 23, 2021

Having the same issue. Having a workaround at least would be nice...

@ZdsAlpha

ZdsAlpha commented Sep 6, 2022

Having the same issue. It's not yet resolved?

@Vladmel1234

Same issue with gsutil cat, gsutil version 5.11.

@Rylab

Rylab commented May 25, 2023

Just chiming in that this is still an issue for me, and still quite annoying when dealing with programmatically modifying even moderate size data sets in gcloud gs buckets.
