Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Finish implementation of Spaces API #140

Open
21 of 27 tasks
amoeba opened this issue Oct 18, 2017 · 57 comments
Open
21 of 27 tasks

Finish implementation of Spaces API #140

amoeba opened this issue Oct 18, 2017 · 57 comments
Labels

Comments

@amoeba
Copy link
Contributor

amoeba commented Oct 18, 2017

#138 added support for part of the Spaces API and we decided in #136 to start a fresh issue to help divvy up the tasks of completing the API. Below is a checklist with items transcribed from the Spaces API docs:

  • Bucket Operations
    • Create a Bucket
    • List All Buckets
    • List a Bucket's Contents
    • Delete a Bucket
    • Get a Bucket's Location
    • Get a Bucket's ACL
    • Set a Bucket's ACL
    • Get a Bucket's CORS
    • Set a Bucket's CORS
    • Delete a Bucket's CORS (new)
    • Get a Bucket's Lifecycle Rules (new)
    • Configure a Bucket's Lifecycle Rules (new)
    • Delete a Bucket's Lifecycle Rules (new)
  • Object Operations
    • Get an Object
    • Get Information About an Object
    • Upload an Object (PUT)
    • Copy an Object
    • Get an Object's ACLs
    • Set an Object ACLs
    • Delete an Object
    • Begin a Multi-part Upload
    • Upload a Part
    • List Parts
    • Complete a Multi-part Upload
    • Cancel a Multi-part Upload

Let me know if the organization of this list doesn't make sense or isn't useful for tackling this.

@amoeba
Copy link
Contributor Author

amoeba commented Oct 23, 2017

@sckott I more or less copied the list from the API docs. Does having analogsea implement all of these seem appropriate?

@Thercast Did you have any preference for which methods you wanted to tackle first? I should be able to take a look at this some evening this week and actually be helpful.

@sckott
Copy link
Collaborator

sckott commented Oct 23, 2017

I think it makes sense to include all of them

@amoeba
Copy link
Contributor Author

amoeba commented Nov 6, 2017

I sat down to get a lay of this and everything looks very reasonable. I wrote a get_object function to match the Spaces API method GET ${BUCKET}.${REGION}.digitaloceanspaces.com/${OBJECT_KEY} using aws.s3::get_object and immediately noticed that aws.s3 didn't reasonably handle error cases.

See this example with an invalid bucket name, purposely run:

> rawToChar(space_get_object("welcome.html", "test-analogseas"))
[1] "<?xml version=\"1.0\" encoding=\"UTF-8\"?><Error><Code>NoSuchBucket</Code><BucketName>test-analogseas</BucketName><RequestId>tx000000000000002f5f59e-0059ffb7c0-3f970-nyc3a</RequestId><HostId>3f970-nyc3a-nyc</HostId></Error>"

It returns the raw response and throws away the underlying HTTP request's status code so there appears to be no way to know if this is the file I have stored at welcome.html or an error.

Looking at the source I can see why. Looking at the open Issues on the repo, I see that the function isn't really written to handle errors see this issue.

If it's the case that the aws.s3 package doesn't handle errors, I wonder if we (1) care and (2) would rather try to stay with using aws.s3 or use crul to get proper error handling.

Thoughts?

@rpodcast
Copy link
Contributor

rpodcast commented Nov 7, 2017

@amoeba thanks for your work on this! I've been AFK due to illness but I want to jump back in to help out. I'll add notes on which features I would like to tackle this week.

Regarding the error handling with aws.s3: I struggled with how to handle this in my first implementation. We have a lot of functionality already built for us within aws.s3, and having that convenience for this early implementation of spaces is hard to pass up. But showing just the raw response when the request errors out is a real eye sore compared to how the other analogsea functions elegantly handle errors via crul. From my perspective, I could see the long term path being a re-write of the key functionality using crul to be consistent with the rest of the package and having one less dependency, but I will defer to @sckott for his take.

@sckott
Copy link
Collaborator

sckott commented Nov 7, 2017

Thanks for raising the issue with erroring. I definitely value failing well, so I'd strongly prefer to make sure users get useful errors - and DigitalOcean, at least with the Droplet API (and I assume with the spaces API) fails well and correctly making it easy to give users appropriate errors

We currently still use httr in this package. Main thing keeping httr around is that it can do oauth whereas crul cannot, but i think an argument can be made to get rid of oauth if we think most people don't use it but instead use PATs

For now I think it's okay to stick with aws.s3, but eventually i think we should rework to make sure spaces fxns error well

@amoeba
Copy link
Contributor Author

amoeba commented Nov 7, 2017

Sounds good, thanks for weighing in @sckott. If/when the aws.s3 package changes how they handle error cases, this package benefits automatically, so I'm down with filling in this portion of the package with aws.s3 as much as possible.

@Thercast Sorry to hear you were out! Looking forward to your notes.

@sckott sckott added the spaces label Nov 11, 2017
@sckott
Copy link
Collaborator

sckott commented Nov 30, 2017

any thoughts on this? anything I can help with?

@amoeba
Copy link
Contributor Author

amoeba commented Dec 4, 2017

I'm still down to help PR this addition but wanted to let @Thercast take lead since it was his idea. Hopefully he'll pop in here and we can get this done soon?

@sckott
Copy link
Collaborator

sckott commented Dec 18, 2017

@Thercast any thoughts? if you're too busy, anything @amoeba or I can move forward on?

@rpodcast
Copy link
Contributor

Hi @sckott and @amoeba, sorry for the delay (an urgent shiny app came up at work and then the kids got sick) but I would like to tackle the listing of bucket contents and deletion of a bucket (these tasks should help shake off the cobwebs of being away for a bit). @amoeba I'd be glad if you'd like to start on the fundamental operations for objects in a space (i.e. downloading, uploading, deleting). Does that sound like a good plan?

@amoeba
Copy link
Contributor Author

amoeba commented Dec 27, 2017

Sounds good to me!

@amoeba
Copy link
Contributor Author

amoeba commented Jan 9, 2018

First half of the Object calls implemented in #146. Went pretty well. Regarding the calls I didn't implement,

  • Delete an Object

    I don't know why my code isn't working (yet). It looks like I'm seeing an issue with passing of the base_url argument which is only affecting this function. It looks like it should work just fine.

  • Multi-part operations

    aws.s3 implements what appears to be the same multi-part uploading API that Spaces does but they do it as an argument multipart on put_object which handles the Begin/Upload/Complete calls. Should each of those Spaces API calls to have their own functions or should we just calls this done?

General questions:

  • Function names: What do we think?
  • Testing: Do we want mocks here so we have at least some basic regression tests in place? I didn't se mocks in other places so skipped this for now
  • Docs: I tried to make them fit stylistically. Thoughts?
  • Should some of these calls get split into one or more files?

I'll look into the delete operation issue above later this week.

@sckott
Copy link
Collaborator

sckott commented Jan 9, 2018

Regarding the PR, I assume you're adding more to it? Or do you want it to be handled as is?

  • do you have thoughts on function names? I think it would be better to change the structure to e.g. spaces_thing_verb, so for example change spaces_get_object to spaces_object_get so that when users are searching through function names (e.g., tab completion to find functions) you get all the functions for spaces_object*, or for spaces_acl*
  • tests: yeah, tests are sorely lacking here. been meaning to 1) add a few live tests of droplet creation/etc., but 2) mostly test against mocked/cached data. I'll open a new issue for this to discuss it there
  • docs:
    • the repeated parameters in each spaces fxn could go into man-roxygen/ as a template file to save lines
    • wrap booleans in \code{}
    • maybe for examples, e.g., for each man file, include entire workflow from creating a space to doing whatever the task is (e.g. spaces_get_object)
  • Yes, I think breaking up spaces fxns into many files is good

@amoeba
Copy link
Contributor Author

amoeba commented Jan 11, 2018

I should've made that more clear, sorry: I'm planning on adding more commits but needed the intermediate review (thanks!). Once I'm done, do you mind a messy PR or would you rather I squash the PR into a single commit once I'm happy?

Re: Your comments:

I think it would be better to change the structure to e.g. spaces_thing_verb

Sold!

tests

Sounds great.

docs:
the repeated parameters in each spaces fxn could go into man-roxygen/ as a template file to save lines

Great, this is new to me so I'll look that up and take a stab at it.

wrap booleans in \code{}

Good suggestion!

maybe for examples, ...

I like this a lot. Will do.

Yes, I think breaking up spaces fxns into many files is good

Great, will do.

@amoeba
Copy link
Contributor Author

amoeba commented Jan 11, 2018

Okay, figured out the Roxygen template functionality to DRY up the common @params.

  • What do you think about my above question about the multi-part functionality?
  • Since ACL's are returned as text but are XML documents, what do you think about detecting the xml2 package, and returning an xml_document if it's installed. If it's not installed, sending a message() telling the user they can install the xml2 package to get this functionality, and then returning the character response? Plus an arg or option to disable?

@amoeba
Copy link
Contributor Author

amoeba commented Jan 11, 2018

Figured out the problem with deleting objects. Totally just due to a bug in aws.s3, which I've Issue+PR'd cloudyr/aws.s3#189

@sckott
Copy link
Collaborator

sckott commented Jan 15, 2018

not sure on the multipart thing. to make separate functions, how many more fxns does that add?

for xml, like idea of putting xml2 in Suggests in DESCRIPTION file, then like you say we parse to an xml_document if installed and if not then just the character string.

@amoeba
Copy link
Contributor Author

amoeba commented Jan 15, 2018

not sure on the multipart thing. to make separate functions, how many more fxns does that add

It'd be five new functions which have a 1:1 mapping with their Spaces API calls. I think the main reason to do this is so we can claim this package implements the Spaces API with 1:1 mappings between API routes and R functions. I don't think this would be commonly used by this package's users.

For the user to upload a multi-part file, they would have to write their own routine to start, upload each piece, and complete the upload (which seems less-than-fun and error-prone). The functionality that would be provided by the five separate functions is almost entirely wrapped up in aws.s3::put_bucket(..., multipart=TRUE, ...) except for the "Cancel a Multi-part Upload" call.

I'd lobby for not doing these calls now and adding docs to spaces_object_put showing how to use the multipart argument. When/if this package switches to crul instead of aws.s3, that would be a good time to write these individual API calls out as functions.

@amoeba
Copy link
Contributor Author

amoeba commented Jan 16, 2018

Okay, xml2 support added to spaces_acl_put and spaces_acl_get in 19a98b0

If you're okay with not implementing the multi-part API with 1:1 functions, then the Object API is done.

@amoeba
Copy link
Contributor Author

amoeba commented Jan 16, 2018

Oh, and a note on how I did that: I noticed that aws.s3 Imports xml2 and, since analogsea Imports it too, xml2 is basically a hard-dependency for analogsea anyway. I left the requireNamespace guards in just in case analogsea ever moves away from Importing/using aws.s3.

@sckott
Copy link
Collaborator

sckott commented Jan 18, 2018

Agree to just put multipart as an argument.

Sounds good on the xml2 import thing.

then the Object API is done.

so PR #146 is ready to go from your side?

@amoeba
Copy link
Contributor Author

amoeba commented Jan 19, 2018

Almost. I decided to verify the aws.s3 multipart functionality actually works with Spaces and it doesn't seem to. I'll do some testing and get back to you. Otherwise, yes.

@amoeba
Copy link
Contributor Author

amoeba commented Jan 19, 2018

Found two blocking bugs in the aws.s3 package preventing me from implementing this package's multipart uploads with it. Might take a long time to get it patched upstream. Any preference on what to do?

@sckott
Copy link
Collaborator

sckott commented Jan 19, 2018

hugh - i guess move ahead here and add to docs that multipart doesn't work now, but will work later?

@amoeba
Copy link
Contributor Author

amoeba commented Jan 19, 2018

Okay, #146 is good to go! Thanks @sckott

@amoeba
Copy link
Contributor Author

amoeba commented Feb 8, 2018

Filed an upstream issue for Space creation not working cloudyr/aws.s3#199

@amoeba
Copy link
Contributor Author

amoeba commented Mar 19, 2018

aws.s3 is getting some fixes as we speak [1] so I'll plan to re-visit some of the existing issues with the Spaces implementation that are waiting on upstream changes this week.

[1] cloudyr/aws.s3#189

@amoeba
Copy link
Contributor Author

amoeba commented Apr 16, 2018

Sat down with this a bit yesterday and today and ran into more upstream issues. @leeper added support for non-AWS S3-compatible storages but I'm getting signature mismatch errors no matter what I try now: cloudyr/aws.s3#189. I think I'll wait to see what @leeper says in case this is a quick fix on either my end or aws.s3's.

@sckott
Copy link
Collaborator

sckott commented May 29, 2018

Okay, will also take a look soon

@amoeba
Copy link
Contributor Author

amoeba commented May 29, 2018

Thanks! I haven't been able to devote enough debugging time. I'm at a point where my next step is to take aws.s3 out of the equation to see if there's just an issue there or if aws.signature needs to be generalized.

@sckott
Copy link
Collaborator

sckott commented May 29, 2018

okay

@sckott
Copy link
Collaborator

sckott commented Jun 12, 2018

hey @amoeba - thinking about this a bit, I played around with trying to fix bucket creation, but go nowhere. Okay with you if we move on with next cran submission with what spaces stuff works for now? With functions that don't work yet, we could just not export them, or export and document as not working yet, and add a stop() inside them with message?

seems like at least these don't work for me:

  • space_create
  • spaces_object_delete

@amoeba
Copy link
Contributor Author

amoeba commented Jun 12, 2018

That sounds good to me. Thanks a million for trying to fix the issue.

Releasing a partial implementation of the Spaces API seems useful in its own right if those are the only two functions that aren't working so let's do that. I'll test to see which functions work to see if my list matches yours.

@sckott wrote:

With functions that don't work yet, we could just not export them, or export and document as not working yet, and add a stop() inside them with message?

Do you have a preference here? I'd lean towards not exporting but I'd like to do the least surprising thing.

I'll get to this tonight or tomorrow.

@sckott
Copy link
Collaborator

sckott commented Jun 12, 2018

not exporting is probably best, and good to document that they'll come along later

@amoeba
Copy link
Contributor Author

amoeba commented Jun 12, 2018

I get the same list of non-working functions. I'll start with the changes now.

Other notes that just came up:

  • When I ran this I got a non-invisible logical as a return
> spaces_object_put("~/Downloads/test.txt", space = "analogsea-test")
[1] TRUE

This function just returns exactly what aws.s3::put_object returns. When it fails, I think it always throws an exception so could I just wrap it in an invisible()?

  • spaces_object_copy does a similar thing:
> spaces_object_copy("test.txt", "test2.txt", "analogsea-test", "analogsea-test")
$LastModified
[1] "2018-06-12T04:45:49.357Z"

$ETag
[1] "00f7d50ab4278a7899d7499481c9603a"

attr(,"x-amz-request-id")
[1] "tx0000000000000076436a6-005b1f4ffd-8abd1d-nyc3a"
attr(,"content-type")
[1] "application/xml"
attr(,"content-length")
[1] "183"
attr(,"date")
[1] "Tue, 12 Jun 2018 04:45:49 GMT"
attr(,"strict-transport-security")
[1] "max-age=15552000; includeSubDomains; preload"

Would it also be better to invisble() this (and also eventually provide a proper class for the response?). This also applies to spaces_object_head(),

@amoeba
Copy link
Contributor Author

amoeba commented Jun 12, 2018

I just checked droplet_create() and it returns non-invisibly so maybe I should just follow that.

amoeba added a commit to amoeba/analogsea that referenced this issue Jun 12, 2018
As per pachadotdev#140:

- Remove export tags from space_create and spaces_object_delete
- Add notes to the roxygen docstrings
- Add `stops` with a hopefully helpful error message that the fns are not implemented yet
- Re-document package
@sckott
Copy link
Collaborator

sckott commented Jun 12, 2018

Cool, thanks for the PR. If nothing else on this for now. i'll push to cran in the next few days. i'll see if any problems pop up in checking before then

@amoeba
Copy link
Contributor Author

amoeba commented Jun 12, 2018

Great, I'm satisfied for now if you are. Thanks for all the help.

@sckott
Copy link
Collaborator

sckott commented Jul 10, 2018

still not pushed yet, sorry, getting there soon hopefully

@amoeba
Copy link
Contributor Author

amoeba commented Jul 10, 2018

NP!

@amoeba
Copy link
Contributor Author

amoeba commented Aug 4, 2018

Okay, so @thosjleeper worked some magic on aws.s3 and everything works now. PR incoming. The checklist at the top of this Issue is accurate AFAIK.

@sckott sckott modified the milestones: v0.7, v0.8 Mar 21, 2019
@sckott
Copy link
Collaborator

sckott commented Jan 15, 2020

@amoeba we need to push a new version soon (this milestone) - should we close this issue or keep it open and move to next milestone?

@amoeba
Copy link
Contributor Author

amoeba commented Jan 15, 2020

Hey @sckott, I'm cool keeping this open and finishing it up. I can make time for it before the month is over. Is that soon enough? I can also do #147 in the time frame.

@sckott
Copy link
Collaborator

sckott commented Jan 15, 2020

Sounds good, I think we should submit by 28 Jan - just in case need to resubmit a few times - don't want to risk missing 31 Jan deadline

@amoeba
Copy link
Contributor Author

amoeba commented Jan 15, 2020

Okay, I can do that.

@amoeba
Copy link
Contributor Author

amoeba commented Jan 27, 2020

I sat down for a few hours tonight to finish up the remaining Spaces functions and ended up getting stumped with what looks like more aws.s3 bugs. Note there are a few more Spaces methods added since the initial go around. I unfortunately don't have any more time before the end of the month to try wrapping things up (so that all Spaces functionality works).

After getting stumped, I spent a bit of time stripping out the aws.s3 parts and trying to make requests from scratch and haven't quite figured it all out, though I must be close.

What do you think makes the most sense as a next step? Sorry I haven't been able to devote more time.

@sckott
Copy link
Collaborator

sckott commented Jan 27, 2020

Thanks for the update. I think the additional Spaces work can wait until next version. I'd like to submit by Wed. Do you think the work of replacing aws.s3 can be done by then? If not, I can move the pkg to Suggests and rework so that it's used conditionally.

@sckott
Copy link
Collaborator

sckott commented Jan 27, 2020

oh right, you already did that work -> #189

@sckott
Copy link
Collaborator

sckott commented Jan 29, 2020

After getting stumped, I spent a bit of time stripping out the aws.s3 parts and trying to make requests from scratch and haven't quite figured it all out, though I must be close.

@amoeba do you think you can finish this today? or should I go ahead and submit?

@amoeba
Copy link
Contributor Author

amoeba commented Jan 29, 2020

No, sorry. Please go ahead and submit. I'll have to carve out some time this quarter.

@sckott
Copy link
Collaborator

sckott commented Jan 29, 2020

okay, will do

@sckott sckott removed this from the v0.8 milestone Jan 29, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants