Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support batch operations #66

Open
EvanHahn opened this issue Sep 17, 2019 · 12 comments
Open

Support batch operations #66

EvanHahn opened this issue Sep 17, 2019 · 12 comments

Comments

@EvanHahn
Copy link

The Airtable API now supports creating, updating, and deleting up to 10 records per request. Is this something Airrecord should support?

The documentation at airtable.com/api shows examples. I was the main developer working on this feature at Airtable, so let me know if you have any questions about this!

(I don't personally need this feature for myself, so feel free to close if you wish!)

@sirupsen
Copy link
Owner

That's cool. Someone who stumbled upon this ticket should consider implementing that. But I'd suggest going a bit back and forth here on a good API first!

@Meekohi
Copy link
Collaborator

Meekohi commented Jan 20, 2020

Yeah agreed, good idea and could be a non-trivial performance improvement. The project I was working on that uses airrecord is on pause, but if it comes back to life I might take a stab at this.

@sirupsen
Copy link
Owner

Also just saw that Airtable has a metadata API now. That could be really cool to work on Airrecord integration for. Opens up a lot of possibilities where it's harder to do the wrong thing.

@aguynamedben
Copy link

aguynamedben commented Apr 17, 2020

I want this for batched updates and might be able to implement it this weekend. How about this API:

tea1 = Tea.find("someid")
tea1["Name"] = "Feng Gang Organic"
tea1["Village"] = "Feng Gang"

tea2 = Tea.find("someotherid")
tea2["Name"] = "Sweet Tea"
tea2["Village"] = "Georgia, USA"

Tea.update_batch([tea1, tea2])

An error would be throw if the array provided has more than 10 records.

It looks like batches of 10 are supported for create, update, and delete. I can't tell if their update API does an "upsert" if a new record is given, or throws an errror. I propose I will first implement Airrecord::Table#update_batch first, then myself of someone else can add Airrecord::Table#create_batch and Airrecord::Table#delete_batch.

Does that work? The API is somewhat similar to ActiveRecord batch queries, but isn't chainable.

@sirupsen
Copy link
Owner

That seems sensible to me. Not crazy about the name, but I think this is the simplest.

@aguynamedben
Copy link

Any ideas for a better name? I'm not partial the the name either.

@sirupsen
Copy link
Owner

sirupsen commented Apr 18, 2020

Nah. Let's not sweat it. The functionality would be the same. Introducing relations is not worth it. This maps to the Airtable naming, so let's do what you suggested :) I suspect it won't be very difficult.

@sirupsen
Copy link
Owner

sirupsen commented Apr 18, 2020

Maybe one thing I'd suggest is to just call it update, and then also have a update_or_create, as well as a create which can also take an enum.

@aguynamedben
Copy link

After working on this some, I realized that my project would really need Table.batch_upsert to create or update in batches accordingly. For my immediate needs, I've decided to do an after_save callback that makes API calls on a per-record basis.

However, I did get Table.batch_update working, and submitted a proof-of-concept pull request here with some more ideas on how to run with it if anybody wants to take it from there.

@cloudsbird
Copy link
Contributor

Hi, any updates regarding this?

@JacksonRiso
Copy link

JacksonRiso commented Jun 18, 2023

Here is my janky, but workable code to add a batch upsert functionality to the Airrecord gem. Add this to your config/airrecord.rb file:

Airrecord::Table.class_eval do
  # Upsert up to 10 records
  def self.bulk_record_upsert(field_to_merge_on = nil, record = nil, last = false)
    response = nil
    @queue ||= {}
    @queue["#{base_key}-#{client.escape(table_name)}"] ||= []
    this_queue = @queue["#{base_key}-#{client.escape(table_name)}"]
    this_queue.push({ 'fields' => record }) if record
    # Each loop - every 10th iteration or last!
    if this_queue.length > 0 && (this_queue.length % 10 == 0 || last)
      this_queue.each_slice(10).each do |array|
        # Send API call to Airtable
        path = "https://api.airtable.com/v0/#{base_key}/#{client.escape(table_name)}"
        body = {
          "performUpsert": {
            "fieldsToMergeOn": [field_to_merge_on]
          },
          "records": array.uniq
        }
        headers = {
          "Content-Type": 'application/json',
          "Authorization": "Bearer #{api_key}"
        }
        response = HTTParty.patch(path, headers: headers, body: body.to_json)
        Sentry.capture_message(response.body, level: :error) if response.code != 200
      end
      # Empty the queue
      this_queue = []
      @queue["#{base_key}-#{client.escape(table_name)}"] = []
    end
    response
  end
end

If working in a loop, it will wait until it gets called 10 times, before sending the API call. It has drastically improved my performance. To ensure no stragglers get left behind, I implement it something like this;

(1..13).each do |i|
   Table.bulk_record_upsert("COLUMN NAME", {column: i}) 
end
Table.bulk_record_upsert("COLUMN NAME", nil, true) 

This will call the update API 2x --> once on the 10th record being added, and once after the loop is over for the final 3 stragglers.

Let me know if this makes sense and if anyone has any questions! Of course I would love to see this tested & moved into a pull request.

Cheers :-D

@JacksonRiso
Copy link

JacksonRiso commented Jun 18, 2023 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants