Endpoints may miss events during initial config sync/replay logs #10030

Open
yhabteab opened this issue Mar 22, 2024 · 0 comments
Labels
area/distributed Distributed monitoring (master, satellites, clients) bug Something isn't working

Describe the bug

While the master replays a large replay log to a remote peer, which can take two to three minutes or more, events generated live on the local endpoint during that window are never delivered to that peer, so the peer may miss important events.

ApiListener::RelayMessageOne() reports, as stated in its doc block, whether a message has been successfully relayed. The caller uses the return value to decide whether the message must be persisted to disk in the replay log.

* @return true if the message has been relayed to all relevant endpoints,
* false if it hasn't and must be persisted in the replay log
*/
bool ApiListener::RelayMessageOne(const Zone::Ptr& targetZone, const MessageOrigin::Ptr& origin, const Dictionary::Ptr& message, const Endpoint::Ptr& currentZoneMaster)
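
The contract described in that doc block can be sketched as follows. This is a minimal model, not Icinga 2 code: `ReplayLog` and `PersistIfNotRelayed` are hypothetical names, and the `relayed` parameter stands in for the return value of ApiListener::RelayMessageOne().

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the on-disk replay log.
struct ReplayLog {
    std::vector<std::string> entries;

    void Persist(const std::string& message) { entries.push_back(message); }
};

// Hypothetical caller. `relayed` models the return value of
// RelayMessageOne(). Returns true if the message was written to the log.
bool PersistIfNotRelayed(bool relayed, const std::string& message, ReplayLog& log)
{
    if (relayed) {
        // Reported as relayed to all relevant endpoints: nothing is stored,
        // so the message cannot be replayed later.
        return false;
    }

    log.Persist(message);
    assert(!log.entries.empty()); // postcondition of this sketch
    return true;
}
```

The point of the sketch is that a `true` return value is final: once a message is reported as relayed, nothing keeps a copy of it for later delivery.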

Once execution reaches the following point, the message is already marked as relayed, before ApiListener::SyncSendMessage() is even called. Note that nothing has checked so far whether the target endpoint is in the syncing state.

relayed = true;
SyncSendMessage(targetEndpoint, message);

ApiListener::SyncSendMessage() is what ultimately decides whether the message is forwarded: if the endpoint is in the syncing state, the message is silently ignored and never retried. Unless there are unconnected endpoints, the message is not cached in the replay log either, so the syncing endpoint never gets to see this event.

void ApiListener::SyncSendMessage(const Endpoint::Ptr& endpoint, const Dictionary::Ptr& message)
{
ObjectLock olock(endpoint);
if (!endpoint->GetSyncing()) {
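
The interaction of the two functions can be reproduced with a small stand-alone model. This is a simplified sketch under stated assumptions, not the Icinga 2 implementation: every `Toy*` name, the `Outcome` struct, and `RelayToSyncingPeer()` are hypothetical, messages are plain strings, and the only condition that makes the relay function return false here is an unconnected endpoint.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for an endpoint.
struct ToyEndpoint {
    bool connected = true;
    bool syncing = false;              // true while the replay log is being replayed to it
    std::vector<std::string> received; // messages that were actually delivered
};

// Models the check quoted above: a syncing endpoint silently gets nothing.
void ToySyncSendMessage(ToyEndpoint& endpoint, const std::string& message)
{
    if (!endpoint.syncing)
        endpoint.received.push_back(message);
}

// Models the order of operations described above: a connected endpoint
// counts as "relayed" before its syncing state is ever consulted.
bool ToyRelayMessageOne(std::vector<ToyEndpoint>& endpoints, const std::string& message)
{
    bool skipped = false;

    for (ToyEndpoint& endpoint : endpoints) {
        if (!endpoint.connected) {
            skipped = true; // assumption: only this case forces persistence
            continue;
        }

        // Considered relayed at this point, whatever the call below does.
        ToySyncSendMessage(endpoint, message);
    }

    return !skipped;
}

struct Outcome {
    bool delivered; // did the peer receive the message?
    bool persisted; // was it written to the (modelled) replay log?
};

// One connected peer that is still syncing, one live event.
Outcome RelayToSyncingPeer()
{
    std::vector<ToyEndpoint> endpoints(1);
    endpoints[0].syncing = true;
    assert(endpoints[0].connected); // the scenario requires a connected peer

    std::vector<std::string> replayLog;
    const std::string message = "live event"; // illustrative payload only

    if (!ToyRelayMessageOne(endpoints, message))
        replayLog.push_back(message);

    return Outcome{!endpoints[0].received.empty(), !replayLog.empty()};
}
```

In this model, `RelayToSyncingPeer()` yields an outcome in which the message is neither delivered nor persisted, which is the loss described in this report.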

@yhabteab yhabteab added bug Something isn't working area/distributed Distributed monitoring (master, satellites, clients) labels Mar 22, 2024