feature request: add backblaze b2 storage backend #512

Closed
ibotty opened this issue May 9, 2016 · 32 comments

Comments

@ibotty

ibotty commented May 9, 2016

See https://www.backblaze.com/b2/docs and the Go client/bindings at https://github.com/kothar/go-backblaze

@fd0
Member

fd0 commented May 9, 2016

Thanks for the suggestion.

@kurin
Contributor

kurin commented May 16, 2016

I'd like to plug my B2 lib, which implements higher-level abstractions and error correction: https://github.com/kurin/blazer

In particular, it automatically switches to B2's "large file" API when the file is larger than a certain threshold, allowing uploads of > 5GB.
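A rough sketch of what threshold-based switching could look like, not blazer's actual code. The threshold and part size below are assumptions chosen for illustration, and `plan_upload` is a hypothetical helper; only the B2 API call names are real.

```python
import math
from typing import List

# Assumed values for illustration only; the thread does not state
# blazer's real threshold or part size.
LARGE_FILE_THRESHOLD = 100 * 1024 * 1024  # bytes (assumption)
PART_SIZE = 100 * 1024 * 1024             # bytes per part (assumption)


def plan_upload(size: int) -> List[str]:
    """Return the B2 API calls a client might issue to upload one file."""
    if size <= LARGE_FILE_THRESHOLD:
        # Small file: a single-shot upload.
        return ["b2_get_upload_url", "b2_upload_file"]
    # Large file: start, upload each part, then finish.
    parts = math.ceil(size / PART_SIZE)
    calls = ["b2_start_large_file", "b2_get_upload_part_url"]
    calls += ["b2_upload_part"] * parts
    calls.append("b2_finish_large_file")
    return calls


print(plan_upload(10 * 1024 * 1024))
print(plan_upload(250 * 1024 * 1024))
```

The caller only passes a size; which API family is used stays an internal decision of the library.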

@fd0
Member

fd0 commented May 17, 2016

Thanks for the hint. It seems you have some experience with Backblaze. Do they have a test service we can run our CI tests against?

How do you test your library? Against the production systems with valid credentials?

@rubenv
Contributor

rubenv commented May 17, 2016

It should be noted that while B2 is dirt cheap in terms of storage, it isn't necessarily so in terms of retrieval and operations. Doing large, frequent backups with restic's current access pattern (where it pulls the indexes/trees from the repo) might be costly.

Of course, none of that will matter once there's a local cache for that data. In that case B2 might just be the most attractive option available for remote storage.

@jayme-github

https://www.backblaze.com/b2/cloud-storage-pricing.html says 1GB of egress is free per 24 hours. Shouldn't that be plenty for the index data?

@kurin
Contributor

kurin commented May 17, 2016

Backblaze doesn't have a test instance. I'm implementing one myself to test my lib against, since my integration tests have some limits they can't reasonably reach (e.g. credential auth times out after 24 hours). I'll probably get that mostly feature complete in the next few weeks.

@arithmetric

I was interested in this feature too, and put together PR #694 to implement B2 backend support using @kurin's blazer package. Let me know in the PR if you have any feedback.

@fd0
Member

fd0 commented Mar 14, 2017

A potential b2 backend still needs some work and we need to find a strategy for testing. I've created an account with Backblaze and submitted a request for a sponsored account we can use to run the tests.

Even to access the free tier of B2, this account needs a phone number to activate two-factor authentication, and I'm not (yet) willing to give them my phone number. ;)

Let's continue the discussion here.

@kurin
Contributor

kurin commented Mar 14, 2017

fwiw the blazer integration tests are on the order of several hundred MB per test (since I need to push >100MB to b2 to test all the complex stuff) and I don't really notice much in my bill. You can also set hard caps on requests, per RPC type, so that subsequent requests fail. The cap is daily.

@fd0
Member

fd0 commented Mar 15, 2017

Thanks for the information @kurin.

@fd0
Member

fd0 commented Mar 16, 2017

FYI: Backblaze won't sponsor a free test account. Maybe I'll experiment a bit with their free tier...

@mholt
Contributor

mholt commented May 5, 2017

Why is a free test account needed when 1 GB of download per day is free? (Sorry, I must have missed it in the comments above.)

@fd0
Member

fd0 commented May 6, 2017

I don't think it's necessary any more. I'm currently reworking the backend CI tests so that they use much less data than before (at least for the backends that are tested against a live third-party service), so we can probably just use the free tier. I need to find time to look into this; it's on my list (right after the swift backend).

@fd0
Member

fd0 commented May 28, 2017

I've added a PR with a B2 backend based on @kurin's lib: #978

@fd0
Member

fd0 commented May 29, 2017

Anybody interested in testing it? I'd love to have some feedback!

fd0 closed this as completed in #978 on Jun 3, 2017

@Phaeilo
Contributor

Phaeilo commented Jun 4, 2017

I just uploaded 14.738 GiB (51459 items) to B2. This took about 90 minutes. According to the Backblaze "Reports" page this caused about 2700 b2_get_upload_url, 2700 b2_upload_file and 2700 b2_list_file_names API calls. Any chance the number of b2_list_file_names API calls can be optimized to reduce cost?

Edit: Creating a second snapshot of the same, unchanged directory took 22 minutes. It resulted in about 8500 b2_download_file_by_name calls and an additional 8600 b2_list_file_names calls.

Edit2: Created a third snapshot just now using --force. It took 4.5 minutes to complete. The API usage counters of B2 were only increased by 10 to 50 counts for various functions.

@wpbrown

wpbrown commented Jun 5, 2017

I just uploaded 2 different repos to different buckets. One bucket is 29.5GB and the other 47.5MB. The upload calls actually look about right. The high number of downloads is odd because I've never tried to restore. The 6K list file names calls suggest something is going very wrong.

[image: stats]

I tried creating a second snapshot of the small repo with just a couple text files changed and it generated a lot of activity. I ran it again, this time with the debug log and no changes to the repo...

The source dir has 866 files and 47MB. After 4 snapshots this bucket/repo only contains 21 files. It should only have needed to call b2_list_file_names once. And I'm still learning about restic, but I'm not sure why it would need to download more than the index? It downloaded from all the data files numerous times.

Here is a grep for X-Blazer-Method in the log for the backup run where there were no changes:
X-Blazer-Method: b2_authorize_account
X-Blazer-Method: b2_list_buckets
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_get_file_info
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_get_upload_url
X-Blazer-Method: b2_upload_file
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_get_upload_url
X-Blazer-Method: b2_upload_file
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_get_upload_url
X-Blazer-Method: b2_upload_file
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_download_file_by_name
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_get_upload_url
X-Blazer-Method: b2_upload_file
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_get_upload_url
X-Blazer-Method: b2_upload_file
X-Blazer-Method: b2_list_file_names
X-Blazer-Method: b2_delete_file_version
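A log like the one above is easier to compare as per-method totals. The following is a small sketch that tallies `X-Blazer-Method` lines from any iterable of log lines; `count_methods` is a hypothetical helper, and the sample holds only the first few lines of the log above rather than the full run.

```python
from collections import Counter

PREFIX = "X-Blazer-Method: "


def count_methods(lines):
    """Count how often each B2 method appears in restic debug log lines."""
    counts = Counter()
    for line in lines:
        idx = line.find(PREFIX)
        if idx == -1:
            continue  # not a method line
        counts[line[idx + len(PREFIX):].strip()] += 1
    return counts


# First lines of the log above, used here as sample input.
sample = [
    "X-Blazer-Method: b2_authorize_account",
    "X-Blazer-Method: b2_list_buckets",
    "X-Blazer-Method: b2_list_file_names",
    "X-Blazer-Method: b2_get_file_info",
    "X-Blazer-Method: b2_list_file_names",
    "X-Blazer-Method: b2_list_file_names",
    "X-Blazer-Method: b2_download_file_by_name",
]

for method, n in sorted(count_methods(sample).items()):
    print(f"{n:7d} {method}")
```

To process a real log, pass an open file object instead of `sample`.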

@kurin
Contributor

kurin commented Jun 5, 2017

In blazer, downloading an object, whether in whole or in part, will call b2_list_file_names first to populate an internal object with server-side data, akin to stat. I don't remember offhand, but I suspect this is to get, among other things, the object size, in order to support ranged reads.

I'll explore whether this check can be elided.
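To illustrate why the stat-like lookup doubles the request count, here is a toy model that only counts calls. `CountingClient`, `read_with_stat` and `read_direct` are hypothetical names, not blazer's API, and no network traffic is involved.

```python
from collections import Counter


class CountingClient:
    """Hypothetical stand-in for a B2 client; it only counts API calls."""

    def __init__(self, sizes):
        self.sizes = sizes  # object name -> size in bytes
        self.calls = Counter()

    def list_file_names(self, name):
        self.calls["b2_list_file_names"] += 1
        return {"name": name, "size": self.sizes[name]}

    def download_file_by_name(self, name, offset, length):
        self.calls["b2_download_file_by_name"] += 1
        available = max(0, self.sizes[name] - offset)
        return b"\0" * min(length, available)


def read_with_stat(client, name, offset, length):
    """Look up the object's attributes first, then do the ranged read."""
    attrs = client.list_file_names(name)
    length = min(length, max(0, attrs["size"] - offset))
    return client.download_file_by_name(name, offset, length)


def read_direct(client, name, offset, length):
    """Ranged read without the preceding attribute lookup."""
    return client.download_file_by_name(name, offset, length)


with_stat = CountingClient({"data/pack": 4096})
direct = CountingClient({"data/pack": 4096})
for offset in range(0, 4096, 1024):
    read_with_stat(with_stat, "data/pack", offset, 1024)
    read_direct(direct, "data/pack", offset, 1024)

print(dict(with_stat.calls))
print(dict(direct.calls))
```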

@kurin
Contributor

kurin commented Jun 5, 2017

I've pushed a change to blazer that removes the call to b2_list_file_names on every (new) file read; update and see if this helps.

@fd0
Member

fd0 commented Jun 5, 2017

Oh, awesome! I'll prepare a PR :)

@fd0
Member

fd0 commented Jun 5, 2017

See #997

@kurin
Contributor

kurin commented Jun 6, 2017

I've had a chance to test this out, and it looks more reasonable on downloads. For a ~250MB restore I get:

$ grep X-Blazer-Method /tmp/restic-debug.log  | sort | uniq -c
      1 X-Blazer-Method: b2_authorize_account
      2 X-Blazer-Method: b2_delete_file_version
   8166 X-Blazer-Method: b2_download_file_by_name
      1 X-Blazer-Method: b2_get_file_info
      2 X-Blazer-Method: b2_get_upload_url
      1 X-Blazer-Method: b2_list_buckets
     10 X-Blazer-Method: b2_list_file_names
      2 X-Blazer-Method: b2_upload_file

of which only a handful are extraneous:

$ grep '416 Requested' /tmp/restic-debug.log | wc -l
9

It looks like these 8k reads come from ~60 data files. I don't know the internals of restic, but I bet any improvement in the number of download_file_by_name calls will have to come from some kind of prefetch logic that can read multiple chunks at once.
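As a sketch of the kind of batching that could cut the request count (not restic's design, which this thread does not describe), adjacent or nearly adjacent byte ranges in the same file can be merged into one ranged read. `coalesce` and the `max_gap` parameter are hypothetical.

```python
from collections import defaultdict


def coalesce(requests, max_gap=0):
    """Merge (name, offset, length) reads per file.

    Two reads are merged when the second starts no more than max_gap
    bytes after the first one ends.
    """
    by_file = defaultdict(list)
    for name, offset, length in requests:
        by_file[name].append((offset, offset + length))

    merged = []
    for name in sorted(by_file):
        spans = sorted(by_file[name])
        start, end = spans[0]
        for s, e in spans[1:]:
            if s <= end + max_gap:
                end = max(end, e)  # extend the current span
            else:
                merged.append((name, start, end - start))
                start, end = s, e
        merged.append((name, start, end - start))
    return merged


reads = [("pack1", 0, 10), ("pack1", 10, 5), ("pack1", 100, 5), ("pack2", 0, 4)]
print(coalesce(reads))
print(coalesce(reads, max_gap=85))
```

A larger `max_gap` trades extra downloaded bytes for fewer requests, which may or may not pay off depending on how transactions and egress are billed.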

@kurin
Contributor

kurin commented Jun 6, 2017

Uploads will still issue one list_file_names call per upload call, because restic calls Attrs for each B2 object before saving to ensure it doesn't already exist. I don't see a way around this check, but I can skip list_file_names and just call get_file_info, which is class B instead of class C and 10% as expensive.

@kurin
Contributor

kurin commented Jun 6, 2017

Okay, I remember why I did it this way. b2_get_file_info, the call we want to make, requires the (B2 internal) fileId, which we can't derive from the name except by making another API call. The docs say:

The ID of the file, as returned by b2_upload_file, b2_hide_file, b2_list_file_names, or b2_list_file_versions.

@fd0
Member

fd0 commented Jun 6, 2017

I'll think about dropping the Attrs call for saving a new file in certain cases (e.g. for data files), but that'll take some time. The B2 backend will benefit automatically from it.

@kurin
Contributor

kurin commented Jun 6, 2017

I got a workaround in by downloading a single byte :/

download_file_by_name is cheaper than list_file_names.

I'm testing it now but it should work.
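The single-byte workaround can be pictured with an in-memory stand-in. This is a sketch of the idea only: `FakeBucket`, `download_range` and `exists` are hypothetical names, and the real change in blazer may look quite different.

```python
from collections import Counter


class FakeBucket:
    """In-memory stand-in for a bucket that counts the calls it receives."""

    def __init__(self):
        self.objects = {}
        self.calls = Counter()

    def upload(self, name, data):
        self.calls["b2_upload_file"] += 1
        self.objects[name] = data

    def download_range(self, name, offset, length):
        self.calls["b2_download_file_by_name"] += 1
        if name not in self.objects:
            raise KeyError(name)  # stands in for a "not found" response
        return self.objects[name][offset:offset + length]


def exists(bucket, name):
    """Check for an object by requesting its first byte instead of listing."""
    try:
        bucket.download_range(name, 0, 1)
    except KeyError:
        return False
    return True


bucket = FakeBucket()
bucket.upload("data/abc", b"payload")
print(exists(bucket, "data/abc"), exists(bucket, "data/missing"))
print(dict(bucket.calls))
```

The model ignores how the real service answers a one-byte range request for a zero-length object, so that case would need separate handling.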

@kurin
Contributor

kurin commented Jun 6, 2017

$ grep X-Blazer-Method /tmp/restic-debug.log  | sort | uniq -c
      1 X-Blazer-Method: b2_authorize_account
      1 X-Blazer-Method: b2_delete_file_version
     73 X-Blazer-Method: b2_download_file_by_name
      1 X-Blazer-Method: b2_get_file_info
     63 X-Blazer-Method: b2_get_upload_url
      1 X-Blazer-Method: b2_list_buckets
      6 X-Blazer-Method: b2_list_file_names
     63 X-Blazer-Method: b2_upload_file

@fd0
Member

fd0 commented Jun 6, 2017

Hmhm, that's ugly. 😝

Is there maybe a way to upload a file on B2 with O_EXCL, i.e. the call fails if the file already exists?

@kurin
Contributor

kurin commented Jun 6, 2017

Nope. New uploads "hide" older uploads. If you upload "foo" to "x", and then "bar" to "x", and then read "x" you'll get "bar". If you then delete "x", and read "x", you'll get "foo".
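The hiding behaviour described above can be modelled as a stack of versions per name. This is a toy model with hypothetical names (`VersionedBucket`, `remove_completely`), not the B2 API; it only shows that one delete uncovers the previous upload, so a name is gone only once every version has been deleted.

```python
class VersionedBucket:
    """Toy model: each name maps to a list of versions, oldest first."""

    def __init__(self):
        self._versions = {}

    def upload(self, name, data):
        # A new upload hides, but does not replace, older versions.
        self._versions.setdefault(name, []).append(data)

    def read(self, name):
        versions = self._versions.get(name)
        if not versions:
            raise KeyError(name)
        return versions[-1]

    def delete(self, name):
        """Delete only the newest version, uncovering the previous one."""
        versions = self._versions.get(name)
        if not versions:
            raise KeyError(name)
        versions.pop()
        if not versions:
            del self._versions[name]

    def version_count(self, name):
        return len(self._versions.get(name, []))


def remove_completely(bucket, name):
    """Delete every version so that the name no longer resolves."""
    while bucket.version_count(name) > 0:
        bucket.delete(name)


b = VersionedBucket()
b.upload("x", "foo")
b.upload("x", "bar")
print(b.read("x"))
b.delete("x")
print(b.read("x"))
remove_completely(b, "x")
print(b.version_count("x"))
```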

@fd0
Member

fd0 commented Jun 6, 2017

Hm, with the restic repository that'll work out just fine. I'd just need to add special code for removing a file, so that it removes all versions. I'll think about it.

Can we move this to a new issue? Discussing it in the PR feels wrong.

@kurin
Contributor

kurin commented Jun 6, 2017

Sure.

fd0 mentioned this issue on Jun 6, 2017

@fd0
Member

fd0 commented Jun 6, 2017

So, the discussion is moved to #1000 (yay)
