RFC: Some refactoring ideas for Storage client library #1943

reuvenlax · 2023-01-19T18:28:46Z

Remove streamWriterToConnection and connectionToWriteStream maps, and instead store this data in the StreamWriter and ConnectionWorker objects themselves. This means that we no longer have to do a map lookup on every call to append().
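A minimal sketch of the back-reference idea, assuming simplified stand-in classes (`SketchWriter` and `SketchConnection` are hypothetical names, not the library's `StreamWriter` and `ConnectionWorker`). Each object holds its side of the relationship, so the append path reads a field instead of doing a map lookup:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for ConnectionWorker: it owns the set of writers using it.
final class SketchConnection {
  private final Set<SketchWriter> writers = new HashSet<>();

  Set<SketchWriter> getCurrentStreamWriters() {
    return writers;
  }
}

// Hypothetical stand-in for StreamWriter: it remembers its current connection.
final class SketchWriter {
  // Replaces the lookup in the streamWriterToConnection map.
  private SketchConnection current;

  SketchConnection getCurrentConnection() {
    return current;
  }

  // Moves this writer to another connection and keeps both sides in sync.
  // Assumption: the caller already holds whatever lock guards the pool.
  void attachTo(SketchConnection next) {
    if (current != null) {
      current.getCurrentStreamWriters().remove(this);
    }
    current = next;
    if (next != null) {
      next.getCurrentStreamWriters().add(this);
    }
  }
}
```

The trade-off is that the two sides must always be updated together, which is why the sketch funnels every move through a single method.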

Instead of using timestamps to determine which StreamWriter objects to send updated schema to, use a registration method. This way only StreamWriters that were created prior to a schema-update callback will get the updated schema (Note: this uses a static map, but could instead be done by updating the StreamWriter directly). I think this preserves the intended semantics from before, but needs a good look. Note: the timestamp approach isn't guaranteed to work, since it's possible for the time to stay the same between StreamWriter creation and the callback (Java does not guarantee that System.nanoTime() actually updates every nanosecond - it provides no guarantee on update frequency).
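To illustrate the registration approach, here is a rough sketch under simplifying assumptions: `SchemaRegistrySketch` and its nested `Writer` are hypothetical names, and a `String` stands in for `TableSchema`. A writer only sees an update if it was registered before the callback fired, with no timestamps involved:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class SchemaRegistrySketch {
  // Hypothetical stand-in for StreamWriter.
  static final class Writer {
    // Stays null until an update arrives; String stands in for TableSchema.
    volatile String updatedSchema;
  }

  // Concurrent set so registration and callbacks can run on different threads.
  private final Set<Writer> registered = ConcurrentHashMap.newKeySet();

  void register(Writer writer) {
    registered.add(writer);
  }

  void unregister(Writer writer) {
    registered.remove(writer);
  }

  // Called from the schema-update callback: only writers that are already
  // registered at this point receive the new schema.
  void onSchemaUpdate(String schema) {
    for (Writer writer : registered) {
      writer.updatedSchema = schema;
    }
  }
}
```

A writer created after the callback never appears in the set at callback time, so it does not observe the update regardless of clock resolution.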

Some other things worth experimenting with in the future:

- See if we can remove the global lock in ConnectionWorkerPool and instead use local locks in StreamWriter and ConnectionWorker. This might reduce lock contention, but it would mean always taking two locks instead of one, which might use more CPU. It is unclear which approach is better.
- Right now ConnectionWorkerPool checks isOverwhelmed on every call to append and looks for a new stream in that case. If all streams are at their maximum, this might cause a lot of extra CPU usage at the time we can least afford it (when the worker is already overwhelmed!). We should consider throttling this, e.g. only move a StreamWriter to a new stream if it has been more than 1 second since the last time we checked.
- As mentioned above, there are other ways of dealing with updatedSchema.
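The throttling idea could be sketched roughly as follows. `RebalanceThrottleSketch` is a hypothetical helper, not part of the library, and the injected clock is an assumption made so the behaviour can be tested; production code would likely pass `System::nanoTime`, which is suitable for measuring elapsed time:

```java
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

final class RebalanceThrottleSketch {
  private final long minIntervalNanos;
  private final LongSupplier nanoClock;
  private long lastCheckNanos;
  private boolean checkedOnce;

  RebalanceThrottleSketch(long interval, TimeUnit unit, LongSupplier nanoClock) {
    this.minIntervalNanos = unit.toNanos(interval);
    this.nanoClock = nanoClock;
  }

  // Returns true at most once per interval; the caller would only run the
  // isOverwhelmed check and search for a new connection when this returns true.
  synchronized boolean shouldCheck() {
    long now = nanoClock.getAsLong();
    // Subtraction (not direct comparison) keeps this safe across nanoTime overflow.
    if (checkedOnce && now - lastCheckNanos < minIntervalNanos) {
      return false;
    }
    checkedOnce = true;
    lastCheckNanos = now;
    return true;
  }
}
```

Whether the throttle should live per StreamWriter or per pool is an open design choice; this sketch would be instantiated once per writer.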

conventional-commit-lint-gcf · 2023-01-19T18:28:51Z

🤖 I detect that the PR title and the commit message differ and there's only one commit. To use the PR title for the commit history, you can use Github's automerge feature with squashing, or use automerge label. Good luck human!

-- conventional-commit-lint bot
https://conventionalcommits.org/

GaoleMeng · 2023-01-23T18:54:03Z

...e-cloud-bigquerystorage/src/main/java/com/google/cloud/bigquery/storage/v1/StreamWriter.java

@@ -170,6 +184,11 @@ String getWriterId(String streamWriterId) {
      return connectionWorker().getWriterId();
    }

+    public void register(StreamWriter streamWriter) {


Can this be removed?

reuvenlax · 2023-01-30

I don't understand the comment.

yirutang · 2023-01-23T18:18:01Z

...bigquerystorage/src/main/java/com/google/cloud/bigquery/storage/v1/ConnectionWorkerPool.java

+        // TODO: What if we simply kept an atomic refcount in ConnectionWorker? We could also
+        // manage the refcount in the callback below to precisely track which connections are being
+        // used.
+        currentConnection.getCurrentStreamWriters().add(streamWriter);


Is this lock good enough to protect currentConnection?

reuvenlax · 2023-01-30

We're using it to protect currentConnection.getCurrentStreamWriters().

In theory we could put a lock inside currentConnection, which would give more granular locking. However, this would also cause a lot more lock/unlock activity (e.g. every call to append would have to take at least two locks), so this change would need measurement to see whether it is actually better.

yirutang · 2023-01-23T18:18:27Z

...bigquerystorage/src/main/java/com/google/cloud/bigquery/storage/v1/ConnectionWorkerPool.java

-                lock.unlock();
-              }
-            });
+    ConnectionWorker currentConnection;


In general, I like this idea. We can reuse the connection more for the same StreamWriter.

yirutang · 2023-01-23T18:57:39Z

...e-cloud-bigquerystorage/src/main/java/com/google/cloud/bigquery/storage/v1/StreamWriter.java

+    TableSchema getUpdatedSchema(StreamWriter streamWriter) {
+      if (getKind() == Kind.CONNECTION_WORKER) {
+        return connectionWorker().getUpdatedSchema();
+      } else {
        return connectionWorkerPool().getUpdatedSchema(streamWriter);


Does this break the promise that a StreamWriter only sees an updated schema when there is a schema update? I think we should be fine using nanoTime, since it is monotonic on this machine? https://screenshot.googleplex.com/3Qeo9ouZEnehgMR

reuvenlax · 2023-01-30

I don't think nanoTime completely works. It is monotonic but not strictly increasing, i.e. the current code is broken if the first update has the same nanoTime as the creation time, which is entirely possible.
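To make the tie case concrete, here is a small sketch. It assumes the timestamp rule is a strict "update is newer than creation" comparison, which is an assumption about the existing code rather than something shown in this PR. The version counter is one hypothetical alternative (not what this PR implements): it cannot tie, because every update increments it.

```java
import java.util.concurrent.atomic.AtomicLong;

final class UpdateOrderingSketch {
  // Timestamp rule (assumed): the writer sees the update only if the update's
  // reading is strictly greater, so two equal readings hide the update.
  static boolean seesUpdateByTimestamp(long writerCreatedNanos, long updateNanos) {
    return updateNanos > writerCreatedNanos;
  }

  // Hypothetical alternative: a counter that changes on every schema update.
  private final AtomicLong version = new AtomicLong();

  // A new writer records the version that was current when it was created.
  long onWriterCreated() {
    return version.get();
  }

  // Each schema update gets a version strictly greater than all earlier ones.
  long onSchemaUpdate() {
    return version.incrementAndGet();
  }

  static boolean seesUpdateByVersion(long writerVersion, long updateVersion) {
    return updateVersion > writerVersion;
  }
}
```

With identical clock readings, `seesUpdateByTimestamp(t, t)` is false, whereas an update that follows writer creation always has a larger version.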

yirutang · 2023-01-23T19:00:50Z

...bigquerystorage/src/main/java/com/google/cloud/bigquery/storage/v1/ConnectionWorkerPool.java

+    // TODO: Do we need a global lock here? Or is it enough to just lock the StreamWriter?
+    lock.lock();
+    try {
+      currentConnection = streamWriter.getCurrentConnectionPoolConnection();


Should we keep multiple (at least 2) connections per StreamWriter so that it can scale up without looking into the global pool?

yirutang · 2023-01-23T19:02:05Z

...bigquerystorage/src/main/java/com/google/cloud/bigquery/storage/v1/ConnectionWorkerPool.java

+        ConnectionWorker createdOrExistingConnection = null;
+        try {
+          createdOrExistingConnection =
+              createOrReuseConnectionWorker(streamWriter, currentConnection);


I think we still need the global lock here.

GaoleMeng · 2023-01-23T19:33:00Z

...bigquerystorage/src/main/java/com/google/cloud/bigquery/storage/v1/ConnectionWorkerPool.java

+        currentConnection = createdOrExistingConnection;
+        streamWriter.setCurrentConnectionPoolConnection(currentConnection);
+        // Update connection to write stream relationship.
+        // TODO: What if we simply kept an atomic refcount in ConnectionWorker? We could also


A refcount would be error-prone: a StreamWriter could switch back and forth between connection workers, so a single worker could count the same StreamWriter multiple times.

reuvenlax · 2023-01-30

I think that would be fine. The refcount decrement would happen in the done callback (below, in ApiFutures.transform), so we would know exactly which connection worker to decrement even if the stream writer has moved to a different stream.
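A rough sketch of the refcount-in-callback idea. It uses the JDK's `CompletableFuture` in place of `ApiFutures.transform` so that it is self-contained, and `RefCountSketch` and its nested `Connection` are hypothetical names. The callback captures the connection that was used for that particular append, so the decrement always hits the right worker:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

final class RefCountSketch {
  // Hypothetical stand-in for ConnectionWorker with an atomic in-flight count.
  static final class Connection {
    final AtomicInteger inFlight = new AtomicInteger();
  }

  // Increments the connection chosen for this append, and decrements that same
  // connection when the append finishes (successfully or not), regardless of
  // where the writer has moved in the meantime.
  static <T> CompletableFuture<T> track(Connection used, CompletableFuture<T> append) {
    used.inFlight.incrementAndGet();
    return append.whenComplete((result, error) -> used.inFlight.decrementAndGet());
  }
}
```

Note that this counts in-flight appends per connection rather than distinct writers, which sidesteps the double-counting concern at the cost of tracking a different quantity.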

Refactor

5bd40c8

reuvenlax requested a review from a team January 19, 2023 18:28

reuvenlax requested a review from a team as a code owner January 19, 2023 18:28

reuvenlax requested a review from chalmerlowe January 19, 2023 18:28

product-auto-label bot added the labels "size: m" (pull request size is medium) and "api: bigquerystorage" (issues related to the googleapis/java-bigquerystorage API) Jan 19, 2023

reuvenlax changed the title ~~Some refactoring ideas for Storage client library~~ RFC: Some refactoring ideas for Storage client library Jan 19, 2023

reuvenlax marked this pull request as draft January 19, 2023 18:29

GaoleMeng reviewed Jan 23, 2023

yirutang reviewed Jan 23, 2023

GaoleMeng reviewed Jan 23, 2023


