Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor suggestion: Migrating package from {httr} to {httr2} #13

Open
elipousson opened this issue Apr 11, 2023 · 6 comments
Open

Refactor suggestion: Migrating package from {httr} to {httr2} #13

elipousson opened this issue Apr 11, 2023 · 6 comments
Labels
enhancement New feature or request

Comments

@elipousson
Copy link

elipousson commented Apr 11, 2023

I'm excited to see a new R package for working with Airtable and I'm definitely interested in contributing! I added some Airtable data access functions to my own getdata package but struggled to get the offset working.

I was wondering if you may be open for a pull request to migrate the package from using {httr} to use {httr2}. I've migrated a few different API packages from {httr} or {jsonlite} to {httr2} over the past year or two and I've found that using {httr2} makes it much easier to add new features and handle rate limiting and authentication. If you're open to it, I'd be happy to do a first pass on migrating the rairtable package over to {httr2}.

@matthewjrogers
Copy link
Owner

Hi @elipousson
That does seem like a worthwhile change. I would be glad to have you open a pull request.

I'd like to keep main current with CRAN, so I'd appreciate if you worked off the code in the dev branch. You'll see that there's some new code in there -- for the next release, I'm planning to explicitly support PATs and have at least one function to leverage the metadata API (currently airtable_base()). I haven't gotten to documenting those functions in detail yet, but I think they're clear enough.

Thanks for the offer and for bringing httr2 to my attention! I'll use that a lot now that I know it exists

@elipousson
Copy link
Author

Awesome. Thanks for being open to the idea. I'll dig into the dev branch and, if it seems like it may make sense to break this issue up into multiple pull requests, I'll be sure to check with you in this discussion before going too far in one direction or another.

@matthewjrogers matthewjrogers added the enhancement New feature or request label Apr 11, 2023
@elipousson
Copy link
Author

I got a bit sucked into this project last night and for a little too much time today as well. A couple of questions/suggestions that I'd like to implement:

  • Can I deprecate the airtable_id_col parameter and replace it with a shorter id_col parameter?
  • Can I deprecate batch_size? Since that limit is built into the API and httr2 handles the rate limiting for requests, I don't think there is a clear reason a user would need to adjust it.

I have everything working with httr2 except insert_records and update_records but I think I'll get those done by Friday. The changes may be a little more extensive than expected but, other than those two suggested changes, all the changes are wholly compatible with the way everything works right now. I also added some tests with testthat before starting so I could make sure everything stayed consistent.

@matthewjrogers
Copy link
Owner

I would prefer keeping airtable_id_col as the parameter name for clarity. In my experience, unambiguous names are worth the extra characters. In this case particularly, I expect that most users are using the default value anyway, and thus are not spelling it out with any great frequency. I think that makes the 'cost' of the extra characters very low.

I'm open to the idea of deprecating batch size, pending a couple of clarifying questions:

  • The batch_size argument is not handling rate limits per se but rather the number of records that are being sent at any one time (in other words, not the speed/cadence of the requests, but the size of the payload). Is httr2 intelligent enough to automatically batch large requests?
  • If so, can you confirm that it would automatically adjust to a higher batch size if such became available? I parameterized the batch size so that in the event Airtable changed their API limits to allow for more simultaneous CRUD operations the user could take advantage of this without the need to update the package.

Thanks for your work on this! When you get to the point of a pull request, I'd also appreciate if you could follow up here with a quick description of your testing process. I haven't yet come up with a clever way to unit/system test this code (if you've got any ideas, I'm all ears)

@elipousson
Copy link
Author

Got it. I'll keep the airtable_id_col in place.

For batch_size and a few other parameters (e.g. api_version), I have been incorporating the use of package options to allow default overrides but mostly tucking those away within functions and setting function parameters to NULL. I'll add documentation specifically for these option setting so you can review (cli and a few other packages use a similar approach to default settings) those all together.

Your existing code for handling batches with vectorized versions of the standard functions made a lot of sense but I'm making some changes to batches size 10 or smaller just go through the regular version. I'm still working my way through the update records and insert records functions but the delete_records code is a good example of where I'm trying to get to for the others. The airtable_request.R file now has all of the main utility functions, esp. req_airtable_query() if you wanted to take a look at how that piece is coming together. The read_airtable() function is also updated to use the httr2 helper functions.

For testing, I have saved environmental variables with the base ID, table ID, and view ID for a test Airtable base and a helper function to skip tests if no PAT or API key is available. httr2 actually has functions for managing secrets specifically to support testing but I may also add tests built with httptest2.

FWIW, I think this may look like an opinionated pull request so I really hope it doesn't feel like I'm stepping on your toes when you see it. Your code makes a ton of sense but, knowing that I have some ideas for using rairtable in some of my own packages and workflows, I'm trying to build in some extra flexibility into the helper functions in anticipation of those needs. There shouldn't be any breaking changes so hopefully it is all just a helpful evolution of the package you already built.

@elipousson
Copy link
Author

I now have insert_records working (httr2 allowed some simplification of the pre-processing for the data to create new records)! airtable_base is also updated. The last function to go is update_records.

You can see the new code for insert_records and in the create_records.R file. I also will need to re-implement the progress bar and parallel parameter option but hope to open a draft pull request before that point so you can review all the major changes.

matthewjrogers added a commit that referenced this issue Sep 16, 2023
Refactor to migrate package to `{httr2}` (#13) and expand implementation of metadata API (#12)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants