Add API to update multiple counters at once #35

Open
saintthor opened this issue Aug 15, 2013 · 19 comments

Comments

@saintthor

For large amounts of data, I use batches to write rows, which is efficient. But updating the counters takes too much time. Is there any way to set counters in a batch?

@wbolster
Member

Unfortunately the Thrift API does not support it. :-(

— Wouter

@nkeyes

nkeyes commented Aug 27, 2013

The latest Hbase.thrift definition file has a new method incrementRows that might be what @saintthor is looking for. It allows you to send a list of TIncrements in one request.

http://svn.apache.org/viewvc/hbase/tags/0.95.2/hbase-server/src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift?view=markup#69 (line 663)

@saintthor
Author

Thank you.
@wbolster, is it possible to add this feature to HappyBase for HBase version 0.90?

wbolster reopened this Aug 27, 2013
@wbolster
Member

@saintthor If it's added in recent versions, there is no way this can be supported in the much older HBase 0.90 release.

@wbolster
Member

Thanks @nkeyes for the pointer.

It looks like the increment() function in the Thrift API doesn't add anything over the atomicIncrement() function. The latter does return the counter value, while the new API does not. Strange.

The other function is incrementRows(), which actually adds previously unavailable functionality. The API doesn't seem to support using this with a write batch (Table.batch()) though. This Thrift API is currently (HappyBase 0.6) not exposed. Any thoughts on what would be the most sensible API to have in HappyBase's Table class?

(The increment value for the TIncrement struct is misspelled as 'ammount', btw.)

@nkeyes

nkeyes commented Aug 28, 2013

Here's how I did it in ok_hbase:
nkeyes/ok_hbase@5ab526b

I added methods on the Connection, Table, and Row classes:

  • Row#increment sets the row_key on the TIncrement to the current Row instance then proxies to Table#increment
  • Table#increment and Table#increment_rows set the table_name on the TIncrements to the current Table instance and then proxy to Connection#increment and Connection#increment_rows
  • Connection#increment and Connection#increment_rows accept hash representations of TIncrements, allowing for either amount, or the typo 'ammount'

You're right, there doesn't appear to be a way to use increment or incrementRows within a batch request, but incrementRows is sort of a batch request in itself.

I think of incrementRows as a special case, and if the developer decides they need it, they understand that it is not batchable with other mutations.
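
A rough Python sketch of the layering described above, transposed from Ruby purely for illustration. `FakeClient`, the `TIncrement` namedtuple, `to_tincrement`, and all class and method names below are hypothetical stand-ins, not ok_hbase or HappyBase API; a real implementation would use the Thrift-generated `TIncrement` struct and client instead.

```python
from collections import namedtuple

# Stand-in for the Thrift-generated struct; the field name keeps the
# upstream 'ammount' spelling from Hbase.thrift.
TIncrement = namedtuple("TIncrement", ["table", "row", "column", "ammount"])


class FakeClient:
    """Hypothetical stub that records calls instead of talking to a Thrift server."""

    def __init__(self):
        self.calls = []

    def increment(self, increment):
        self.calls.append(("increment", increment))

    def incrementRows(self, increments):
        self.calls.append(("incrementRows", list(increments)))


def to_tincrement(spec):
    """Build a TIncrement from a dict, accepting 'amount' or 'ammount'."""
    # Defaulting to 1 when neither key is present is an assumption of this sketch.
    amount = spec.get("amount", spec.get("ammount", 1))
    return TIncrement(spec["table"], spec["row"], spec["column"], amount)


class Connection:
    def __init__(self, client):
        self.client = client

    def increment(self, spec):
        self.client.increment(to_tincrement(spec))

    def increment_rows(self, specs):
        self.client.incrementRows([to_tincrement(s) for s in specs])


class Table:
    def __init__(self, name, connection):
        self.name = name
        self.connection = connection

    def increment(self, spec):
        # Fill in the table name, then delegate to the connection.
        self.connection.increment(dict(spec, table=self.name))

    def increment_rows(self, specs):
        self.connection.increment_rows([dict(s, table=self.name) for s in specs])

    def row(self, key):
        return Row(key, self)


class Row:
    def __init__(self, key, table):
        self.key = key
        self.table = table

    def increment(self, spec):
        # Fill in the row key, then delegate to the table.
        self.table.increment(dict(spec, row=self.key))


client = FakeClient()
table = Table("stats", Connection(client))
table.row("page1").increment({"column": "cf:hits", "amount": 2})
table.increment_rows([
    {"row": "page1", "column": "cf:hits", "ammount": 1},
    {"row": "page2", "column": "cf:hits", "amount": 5},
])
```

Each layer only fills in the field it knows about (row key, then table name), so the conversion to `TIncrement` happens in one place on the connection.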

@wbolster
Member

Relevant Thrift API:

/**
 * For increments that are not incrementColumnValue
 * equivalents.
 */
struct TIncrement {
  1:Text table,
  2:Text row,
  3:Text column,
  4:i64  ammount
}

  /**
   * Increment a cell by the ammount.
   * Increments can be applied async if hbase.regionserver.thrift.coalesceIncrement is set to true.
   * False is the default.  Turn to true if you need the extra performance and can accept some
   * data loss if a thrift server dies with increments still in the queue.
   */
  void increment(
    /** The single increment to apply */
    1:TIncrement increment
  ) throws (1:IOError io)


  void incrementRows(
    /** The list of increments */
    1:list<TIncrement> increments
  ) throws (1:IOError io)

@wbolster
Member

Thanks @nkeyes for the explanation.

@wbolster
Member

Do you happen to know which version introduced this API? Currently HappyBase has '0.90' and '0.92' compatibility modes; this probably needs to be extended when adding API only available in more recent versions.

@nkeyes

nkeyes commented Aug 28, 2013

I think it came with 0.94; I don't see it in the 0.92 docs.

0.92 (candidate?):
http://people.apache.org/~stack/hbase-0.92.2-candidate-0/hbase-0.92.2/docs/apidocs/index.html
0.94:
https://hbase.apache.org/0.94/apidocs/index.html

@saintthor
Author

What a pity. Thanks.

@saintthor
Author

@wbolster, have you added the feature for writing counters in a batch? Is it for HBase 0.92 or 0.94? Are there docs or examples?

@wbolster
Member

No, there is no code for this currently.

It would require a new compat mode, a new BatchIncrement class, like the current Batch class, tests, and updated docs, but I haven't found time/motivation yet to work on this... so much else to do. :)
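
As a rough illustration of what such a BatchIncrement class might look like (a sketch only; no such class exists in HappyBase at this point): it buffers increments and sends them with a single incrementRows() call when the `with` block exits, loosely mirroring how Batch is used. The class body, the `RecordingClient` stub, and the namedtuple stand-in for `TIncrement` are all assumptions made for this example; the `counter_inc` name is borrowed from `Table.counter_inc()`.

```python
from collections import namedtuple

# Stand-in for the Thrift-generated TIncrement struct (upstream spelling kept).
TIncrement = namedtuple("TIncrement", ["table", "row", "column", "ammount"])


class RecordingClient:
    """Hypothetical stub for the Thrift client; records incrementRows() calls."""

    def __init__(self):
        self.sent = []

    def incrementRows(self, increments):
        self.sent.append(list(increments))


class BatchIncrement:
    """Buffer counter increments and send them in one request."""

    def __init__(self, client, table_name):
        self._client = client
        self._table_name = table_name
        self._increments = []

    def counter_inc(self, row, column, value=1):
        self._increments.append(TIncrement(self._table_name, row, column, value))

    def send(self):
        # Skip the round trip entirely when nothing was buffered.
        if self._increments:
            self._client.incrementRows(self._increments)
            self._increments = []

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Assumption: only send when the block finished without an exception.
        if exc_type is None:
            self.send()


client = RecordingClient()
with BatchIncrement(client, "stats") as batch:
    batch.counter_inc("page1", "cf:hits")
    batch.counter_inc("page2", "cf:hits", 3)
```

Because incrementRows() returns nothing, a class like this could not report the new counter values the way atomicIncrement() does.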

@saintthor
Author

Adding this feature only requires Python code, right? I may try it.

@wbolster
Member

Yes it is, but it may require regenerating the Thrift code using a newer .thrift definition file.

— Wouter

@saintthor
Author

Do you mean regenerating the hbase module with Thrift?

@wbolster
Member

Yes, because this requires API that was not available when the current version was generated.

— Wouter

@wbolster
Member

wbolster commented Nov 3, 2013

Fwiw, I've just updated the Thrift API bundled with HappyBase in dd7878c, so the new Thrift API is now available inside HappyBase (but not exposed in the public API).

@saintthor
Author

I added this method to the HappyBase connection module last week.

class Connection(object):
    def batch_inc(self, IncItems):
        "inc counters via batch"
        # item = [tableName, rowKey, counterName, counterValue]
        Increments = [TIncrement(*item) for item in IncItems]
        self.client.incrementRows(Increments)

It works well. If you think it is OK, please add it to HappyBase; this would make our deployment simpler.
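
For reference, a call to the method above might look like the following. This is a self-contained sketch: `StubClient` and the namedtuple stand-in for `TIncrement` are hypothetical replacements for the Thrift client and generated struct, and the table, row, and column names are made up.

```python
from collections import namedtuple

# Stand-in for the Thrift-generated TIncrement struct (upstream spelling kept).
TIncrement = namedtuple("TIncrement", ["table", "row", "column", "ammount"])


class StubClient:
    """Hypothetical stub; a real Connection holds a Thrift client here."""

    def __init__(self):
        self.sent = []

    def incrementRows(self, increments):
        self.sent.append(list(increments))


class Connection(object):
    def __init__(self, client):
        self.client = client

    def batch_inc(self, IncItems):
        "inc counters via batch"
        # item = [tableName, rowKey, counterName, counterValue]
        Increments = [TIncrement(*item) for item in IncItems]
        self.client.incrementRows(Increments)


conn = Connection(StubClient())
conn.batch_inc([
    ["stats", "page1", "cf:hits", 1],
    ["stats", "page2", "cf:views", 10],
])
```

Since every item carries its own table name, one call can update counters in several tables.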
