refactor client tracking, fix atomicity, squashing and multi/exec #2970

Open · wants to merge 8 commits into main

Conversation

@kostasrim kostasrim commented Apr 29, 2024

  • add partial support for CLIENT CACHING TRUE (only to be used with TRACKING OPTIN)
  • add OPTIN to CLIENT TRACKING command
  • refactor client tracking to respect transactional atomicity
  • fix multi/exec and disable squashing with client tracking
  • add tests

Resolves #2969, #2971, #2997, #2998

P.S. All tests in rueidis TestSingleClientIntegration pass except pub/sub, because we don't support it yet; see #3001.
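
For context, the OPTIN semantics being implemented follow Redis' client-side caching behavior, roughly like this (illustrative session, not taken from this PR's tests):

CLIENT TRACKING ON OPTIN
GET foo              # not tracked: OPTIN requires an explicit opt-in first
CLIENT CACHING YES
GET foo              # tracked: an invalidation message is pushed when foo is later modified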

@kostasrim kostasrim changed the title feat: client tracking optin feat: client tracking optin argument Apr 29, 2024
@kostasrim kostasrim self-assigned this Apr 29, 2024
@kostasrim kostasrim requested a review from dranikpg April 29, 2024 16:05
src/server/main_service.cc (outdated)
bool optin = false;
// true if the previous command was CLIENT CACHING TRUE
bool prev_command = false;
Collaborator

Can you explain the context behind this state machine with the prev_command and last_command variables?

Contributor Author

Yes! I will add a comment in the code and one here. We need to remember the last command executed, because the catch with OPTIN is that if we call CLIENT CACHING TRUE and the command that follows is a read command, we track it.

Basically OPTIN is conditional caching on demand via CLIENT CACHING TRUE, which acts as a one-shot flag to potentially (since it might not be read-only) track the next command.

I used prev_command and last_command because I had to keep track of the last executed command while keeping the change set minimal (within DispatchCommand).

prev_command is the command executed in the previous call of DispatchCommand.
last_command is the command executed in the current call of DispatchCommand.

If last_command is read-only && prev_command is CLIENT CACHING TRUE, then we need to track the command (these are the semantics I described above). At the end of DispatchCommand, prev_command is updated to last_command, and last_command is set to false (it is only set to true when CLIENT CACHING TRUE is called).

P.s. The bonus is that this only requires setting last_command to true at the call site of CLIENT CACHING TRUE, which avoids having to explicitly set it to false for all other commands. We could do something similar by checking the cid and args in DispatchCommand, but that introduces another layer of command parsing (name + args) in the dispatch layer, which I am not happy about, so I did not pursue it.
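
Roughly, the state machine described here looks like this (a sketch only; the surrounding helper calls are illustrative and not part of the actual diff):

// Sketch of the OPTIN state machine described above.
bool prev_command = false;  // previous dispatched command was CLIENT CACHING TRUE
bool last_command = false;  // current command is CLIENT CACHING TRUE

void DispatchCommand(const CommandId* cid) {
  // ... invoke the command; the CLIENT CACHING TRUE handler is the only
  // place that sets last_command = true ...
  if ((cid->opt_mask() & CO::READONLY) && prev_command) {
    // OPTIN: the read command right after CLIENT CACHING TRUE gets tracked.
  }
  // Advance the state machine at the end of DispatchCommand.
  prev_command = last_command;
  last_command = false;
}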

Contributor

Looking at this struct, maybe client tracking shouldn't be part of the connection code at all, like pubsub or monitoring? We can store information about it in connection state - like we do for other stuff. To track optin, we can introduce a seqnum for every command, and ClientCaching() will just do conn_state.tracking.optin_seqnum = cntx->current_seqnum
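
A sketch of that idea (the field and variable names here are hypothetical, following the comment above):

struct TrackingInfo {
  uint64_t optin_seqnum = 0;  // seqnum of the last CLIENT CACHING TRUE
};

// ClientCaching():
//   conn_state.tracking.optin_seqnum = cntx->current_seqnum;
//
// When executing the command with sequence number n, track its keys iff
//   conn_state.tracking.optin_seqnum + 1 == n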

Contributor Author

Maybe; in general I would like to refactor some parts of it. Shall we do this in a separate PR, to at least bake the basic functionality in?

Contributor

As you wish, but I think it can be done here 🤷🏻‍♂️

Collaborator

@kostasrim I suspect that the state machine can be simplified, i.e. instead of last and prev fields, have only next_command_tracking. ClientCaching will set next_command_tracking, and DispatchCommand will unconditionally exchange this var with false before executing the command. And I agree with @dranikpg that this should be part of the context state, like db_index or the multi/exec state.
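
A sketch of that variant (names assumed from the suggestion above):

bool next_command_tracking = false;

// ClientCaching():
//   next_command_tracking = true;

// DispatchCommand(), before executing the command:
bool track_this_cmd = std::exchange(next_command_tracking, false);
// ... execute; if track_this_cmd is set and the command is read-only, track its keys.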

Contributor Author

@dranikpg @romange I refactored the code. However, I still think we need the two variables, since there are two commands we need to track (the one currently executing via InvokeCmd and the one executed before it, so that we can check whether the last command was CACHING YES). I added a bunch of comments in the state machine; if you still think it's doable with a single variable I am happy to reiterate :) It's a small change anyway :)

@kostasrim kostasrim changed the title feat: client tracking optin argument refactor client tracking, fix atomicity, squashing and multi/exec May 2, 2024
ec.await(
[this] { return subscriber_bytes.load(memory_order_relaxed) <= subscriber_thread_limit; });
ec.await([this] {
return done || subscriber_bytes.load(memory_order_relaxed) <= subscriber_thread_limit;
Contributor Author
@kostasrim kostasrim May 2, 2024

And we had a deadlock if this was waiting while the connection fiber was joining during shutdown

Collaborator

The only call I find is here: https://github.com/dragonflydb/dragonfly/blob/main/src/facade/dragonfly_connection.cc#L1720

where connection A waits for connection B to be below the limit.
Can you describe the deadlock? Which connection was joining during shutdown?

This fix is most probably incorrect because done is not an atomic var and this is a multi-threaded function.

Contributor

We drain the queue when we close a connection, so the limit should drop. Yes, it's called from multiple threads and for multiple connections

A deadlock might indeed be possible if two connections try to send messages to each other and then close 🤔 Rare case, but worth investigating. Either way a done flag won't do it

@@ -1114,7 +1117,7 @@ void Connection::HandleMigrateRequest() {
this->Migrate(dest);
}

DCHECK(dispatch_q_.empty());
// DCHECK(dispatch_q_.empty());
Contributor Author

I have triggered this and I suspect it's not needed, since by the time Migrate returns dispatch_q_ might be active

Collaborator

@dranikpg to validate

Contributor

Yes, with the new update where we don't lose track of connections, we might have gotten a message. But I'm not sure this feature was implemented correctly (need to check), because we might end up with the dispatch fiber on the wrong thread.

Instead, we should be open to messages, but should start the dispatch fiber strictly on the correct thread

Contributor

Maybe uncomment this as it's not related

@@ -838,13 +838,23 @@ OpStatus Transaction::ScheduleSingleHop(RunnableType cb) {

// Runs in coordinator thread.
void Transaction::Execute(RunnableType cb, bool conclude) {
auto tracking_wrap = [cb, this](Transaction* t, EngineShard* shard) -> RunnableResult {
Contributor Author

@dranikpg this, with the changes in InvokeCmd, seemed to be the most non-intrusive way (to comply with the requirements of the state machine)

Collaborator

  1. I would like to understand why tracking requires transaction semantics (an example would be fine).
  2. Why is cid_ not enough, and why do we need invoke_cid_?

Contributor

I would suggest doing it like we handle blocking commands: once it finishes, and manually from RunSquashedCb:

if (auto* bcontroller = shard->blocking_controller(); bcontroller) {
if (awaked_prerun || was_suspended) {
bcontroller->FinalizeWatched(GetShardArgs(idx), this);

So it becomes if (concluding || (multi && multi_->concluding)) Track(this)

Now you don't need invoke_cid etc. there either.

Contributor Author

I would like to understand why tracking requires transaction semantics (an example will be fine)

Because invalidation messages must be sent before the transaction concludes. Otherwise, we might accidentally skip them. An example would be:

>> CLIENT TRACKING ON
>> GET FOO
>> SET FOO BAR 
>> GET FOO
>> SET FOO BAR
>> GET FOO ---------> might miss Invalidation message

A valid execution would be: once we execute the first SET, we send an invalidation message as a separate transaction. Before that transaction even starts/concludes, the GET that follows can execute first, and the second SET will itself issue a separate transaction to send an invalidation message. The problem here is that once we send an invalidation message, we remove the key from the tracking map (since we only send an invalidation message once, until the key is re-read). So the second invalidation transaction won't work because the key no longer exists in the map, and we will never get that second invalidation message.

@@ -119,6 +119,13 @@ void ConnectionContext::ChangeMonitor(bool start) {
EnableMonitoring(start);
}

ConnectionState::ClientTracking& ConnectionContext::ClientTrackingInfo() {
if (parent_cntx_) {
return parent_cntx_->conn_state.tracking_info_;
Contributor Author

That's for squashing :)

Contributor

If you access conn_state, don't make it a function on conn_cntx.

Then you can just use conn_state; you can make it mutable or add a new member like conn:

if (cntx->conn_state.squashing_info)
cntx = cntx->conn_state.squashing_info->owner;

@@ -885,6 +886,8 @@ void Connection::ConnectionFlow(FiberSocketBase* peer) {
// After the client disconnected.
cc_->conn_closing = true; // Signal dispatch to close.
evc_.notify();
queue_backpressure_->done = true;
Collaborator

So I suggest DCHECK-ing on the value of subscriber_bytes first. It should be 0 when the connection is shutting down, and if it's not, we have other bugs.

Contributor

Something's wrong here, queue_backpressure_ is a per-thread instance

// Sets to true when CLIENT TRACKING is ON
void SetClientTracking(bool is_on);
// Enable tracking on the client
void TrackClientCaching();
Collaborator

much much clearer now!!!

void ConnectionState::ClientTracking::Track(ConnectionContext* cntx, const CommandId* cid) {
auto& info = cntx->ClientTrackingInfo();
auto shards = cntx->transaction->GetActiveShards();
if ((cid->opt_mask() & CO::READONLY) && cid->IsTransactional() && info.ShouldTrackKeys()) {
Collaborator

@dranikpg pls review

}
auto& client_set = it->second;
Collaborator

Consider the client_tracking_map_.extract(key) function, which can combine find and delete in one call.
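
For example (a sketch; assumes client_tracking_map_ is a C++17-style map that supports node extraction, as std:: and absl maps do):

// extract() removes the node and hands us ownership in a single lookup.
if (auto node = client_tracking_map_.extract(key); !node.empty()) {
  auto client_set = std::move(node.mapped());
  // ... build the broadcast callback from key and client_set ...
}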

}
auto& client_set = it->second;
// notify all the clients.
auto cb = [key = std::string(key), client_set = std::move(client_set)](unsigned idx,
Collaborator

add a TODO comment about client_set, key being duplicated because we broadcast cb

@@ -1206,6 +1186,7 @@ void Service::DispatchCommand(CmdArgList args, facade::ConnectionContext* cntx)
if (stored_cmd.Cid()->IsWriteOnly()) {
dfly_cntx->conn_state.exec_info.is_write = true;
}
dfly_cntx->conn_state.tracking_info_.UpdatePrevAndLastCommand();
Collaborator

why do you need to call it here?

}

void ConnectionState::ClientTracking::UpdatePrevAndLastCommand() {
if (prev_command_ && multi_) {
Collaborator

Seems that what you really want is to know whether you are in the middle of EXEC execution, not MULTI.

Contributor

We store so much fragile info that needs to be updated everywhere... seqnums would solve all this

// Enable tracking on the client
void TrackClientCaching();

void UpdatePrevAndLastCommand();
Collaborator

nit: the UpdatePrevAndLastCommand name describes the implementation of this function. What it does is advance the state, so I think it's better to call it Tick, Advance, or Update.

// true if the previous command invoked is CLIENT CACHING TRUE
bool prev_command_ = false;
// true if the currently executing command is CLIENT CACHING TRUE
bool executing_command_ = false;
Collaborator

rename: executing_command_ to track_next_cmd_

Contributor Author
@kostasrim kostasrim May 2, 2024

But track_next_cmd_ seems misleading, since it implies it's the next command. executing_command_ is the command we are currently executing in the InvokeCmd flow, and prev_command_ is the command before it. So:

>> GET FOO ----> prev_command
>> GET BAR ----> current_command

bool optin_ = false;
// true if the previous command invoked was CLIENT CACHING TRUE
bool prev_command_ = false;
Collaborator

rename prev_command_ to track_current_cmd_

Contributor
@dranikpg dranikpg left a comment

It looks somewhat over-engineered to me 😅 We have similar semantics in blocking commands, only that we subscribe with specific commands and not with any.

  1. Let's use squashing_info instead of adding a parent field to ConnectionContext, or let's use that field for everything. There should be one way of doing things, with proper comments, so nobody adds yet a third.
  2. I'd still suggest adding numbers to commands, because UpdatePrevAndLastCommand() appears in many places and we update three whole fields: prev, executing, multi. The track command can just store its number and we don't have to update much more.
  3. Track() should be called when we conclude or finish the current multi command; currently we call it for every hop. Not that there are multi-hop read commands, but I think it belongs with the other management code. invoke_cid should also not be needed with that.

Comment on lines 258 to 260
ConnectionContext* parent_cntx_ = nullptr;

ConnectionState::ClientTracking& ClientTrackingInfo();
Contributor

See previous comment on whether we can keep this in conn_state

@dranikpg
Contributor

dranikpg commented May 2, 2024

there is a deeper problem with CLIENT TRACKING OPTIN

suppose you have

MULTI
CLIENT TRACKING OPTIN
GET A
GET B 
GET C
EXEC

Now, if you squash the last 3, you lose the order, because GET C can run first of those three.

@romange
Collaborator

romange commented May 2, 2024

Let's reject CLIENT TRACKING OPTIN in MULTI.

@romange
Collaborator

romange commented May 2, 2024

in fact, should we even allow CLIENT commands inside MULTI?

@kostasrim
Contributor Author

in fact, should we even allow CLIENT commands inside MULTI?

Only CLIENT CACHING YES as it has specific semantics

@kostasrim kostasrim requested a review from dranikpg May 8, 2024 16:42
dranikpg
dranikpg previously approved these changes May 9, 2024
Contributor
@dranikpg dranikpg left a comment

Some nits remaining, LGTM

Comment on lines +268 to +274
bool ConnectionState::ClientTracking::ShouldTrackKeys() const {
if (!IsTrackingOn()) {
return false;
}

return !optin_ || (seq_num_ == (1 + caching_seq_num_));
}
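
In other words (illustrative trace, assuming seq_num_ advances by one per dispatched command):

CLIENT CACHING YES   -> caching_seq_num_ = seq_num_   (say, 7)
GET foo              -> seq_num_ = 8; 8 == 7 + 1, so the keys are tracked
GET bar              -> seq_num_ = 9; 9 != 7 + 1, not tracked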
Contributor

Ah, that's what I call elegancy 🎩

Comment on lines -2183 to +2164
if (absl::GetFlag(FLAGS_multi_exec_squash) && state == ExecEvalState::NONE) {
if (absl::GetFlag(FLAGS_multi_exec_squash) && state == ExecEvalState::NONE &&
!cntx->conn_state.tracking_info_.IsTrackingOn()) {
Contributor

What if it gets enabled while executing?

if ((cid->opt_mask() & CO::READONLY) && cid->IsTransactional() && info.ShouldTrackKeys()) {
auto conn = cntx->parent_cntx_ ? cntx->parent_cntx_->conn()->Borrow() : cntx->conn()->Borrow();
auto cb = [&, conn](unsigned i, auto* pb) {
if (shards.find(i) != shards.end()) {
Contributor

nit: There is IsActive() so you don't need GetActiveShards()

Comment on lines 366 to 368
void SetConnectionContextAndInvokeCid(ConnectionContext* cntx) {
cntx_ = cntx;
}
Contributor

nit: please rename it then

@@ -183,6 +268,8 @@ class ConnectionContext : public facade::ConnectionContext {
// TODO: to introduce proper accessors.
Transaction* transaction = nullptr;
const CommandId* cid = nullptr;
ConnectionContext* parent_cntx_ = nullptr;
Contributor

we don't need this now if we don't support squashing

auto& info = cntx->conn_state.tracking_info_;
auto shards = cntx->transaction->GetActiveShards();
if ((cid->opt_mask() & CO::READONLY) && cid->IsTransactional() && info.ShouldTrackKeys()) {
auto conn = cntx->parent_cntx_ ? cntx->parent_cntx_->conn()->Borrow() : cntx->conn()->Borrow();
Contributor

Same here, if we don't support squashing, let's dcheck conn_state.squashing_info is false

Comment on lines +650 to +651
ConnectionContext* cntx_{nullptr};

Contributor

I'll invent something in the future to get rid of this 😆


<< " with thread ID: " << conn_ref.Thread();

auto& db_slice = slice_args.shard->db_slice();
// TODO: There is a bug here that we track all arguments instead of tracking only keys.
Contributor
@dranikpg dranikpg May 9, 2024

ah, found it, yes that's also left 🙂

LogAutoJournalOnShard(shard, result);
if (cntx_) {
cntx_->conn_state.tracking_info_.Track(cntx_, cid_);
Collaborator

Again, @dranikpg @kostasrim, would you mind providing an example or an explanation of why triggering from the transaction is needed? I think this feature is eventually consistent by definition, so I do not understand the reason for this change.

Contributor
@dranikpg dranikpg May 10, 2024

Not sure what you mean... It has to be tracked with keys locked so we don't miss the next immediate update

It is eventually consistent, but does this justify missed updates? If you miss the only update that occurred immediately after, your state will diverge until the second update, which might not be soon.

Technically it doesn't have to be tracked from transaction. We have the journal for that. But it has to be tracked during the transaction phase

Collaborator

I'm not arguing, just trying to understand. What do you mean by "missing the next update"?
Can you please give me an example? Suppose we don't do it within the transaction phase; please provide the flow of steps that leads to an incorrect result as seen by a client.

Contributor

Client A: ->  READ A ->                  -> TRACK A
Client B: ->             -> WRITE A ->                  -> DO NOTHING

Client A thinks A is still up-to-date whereas it's not

Collaborator
@romange romange May 13, 2024

OK, so if A is being read and the rule is to track keys that are being read, you should start tracking A immediately.

Got ya. But the whole code around this back and forth looks wrong:
Inside a shard callback we access connection state that is not designed to be accessed from the shard threads.
Specifically, the Track function checks if ((cid->opt_mask() & CO::READONLY) && cid->IsTransactional() && info.ShouldTrackKeys()) from the shard thread and then calls AwaitFiberOnAll(std::move(cb)); on all shards.

Does it mean that if we call MGET on 20 shards, all 20 shards will call Track, which checks whether we should track and then calls AwaitFiberOnAll back on all shards, 20x20 calls overall? Do I understand this interaction correctly?

And if we never enable tracking, we still call Track for every transaction?

Contributor

Ahh yes, after I asked Kostas to move the code, I didn't check that it's correct 😅 We are already on the thread, we shouldn't be calling AwaitFiberOnAll.

Does it mean that if we call MGET on 20 shards, now all 20 sh

It should be the same as journaling. It is in fact journaling (or replication), just with a filter that sends an invalidation instead of the new data.


Successfully merging this pull request may close these issues.

Add support for CLIENT OPTIN