New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Concurrency guarantees of connection events #1013
Comments
It's not really that simple though. Most plugins do not even pretend to handle these cases. As for If you want to be absolutely certain you are dealing with a specific This is a consequence of prioritizing resource clean up, Velocity reflecting the state of the world as it is at that moment, an acknowledgment that user plugins are often poorly written, a consequence of operating in a multi-threaded environment. We don't offer guarantees, because it is difficult for us to provide one without compromising somewhere. If you want a guarantee, you have to do it within your plugin. |
I understand that its not that simple and to be honest I'm expecting little to be done to this due to it requiring several changes how things work internally. I'm fine with this but unfortunately this isn't even as simple to handle in the plugin which ended up me raising this issue. It can done, yes, and I'm going to over few solutions and quirks they have. For me I would prefer the correctness as its unlikely to hinder the typical use case while protecting against possible bugs beyond the ones I can control while massively simplifying the event handling in the plugin side. This is going to get long, but lets look at examples. Its way easier to demonstrate the trickery you have to do to actually get this right. Lets start with the case that plugins typical tend to use. private ConcurrentMap<UUID, User> users;
public void onLoginEvent(LoginEvent event) {
this.users.put(event.getPlayer().getUniqueId(), new User());
}
public void onDisconnectEvent(DisconnectEvent event) {
this.users.remove(event.getPlayer().getUniqueId());
} The code suffers from a race where the disconnection can clear the user from the map after the login even has completed. To fix this, we can use the private Cache<Player, User> users; //Weak keys
public void onLoginEvent(LoginEvent event) {
final User user = new User();
this.users.put(event.getPlayer(), user);
if (!event.getPlayer().isActive()) {
this.users.invalidate(event.getPlayer());
}
}
public void onDisconnectEvent(DisconnectEvent event) {
this.users.invalidate(event.getPlayer());
} This works fine but we are limited to accessing the map by private Cache<Player, User> users; //Weak keys, eviction listener to clear usersByUuid on eviction just in case
private ConcurrentMap<UUID, User> usersByUuid;
public void onLoginEvent(LoginEvent event) {
final User user = new User();
this.users.put(event.getPlayer(), user);
this.usersByUuid.put(event.getPlayer().getUniqueId(), user);
if (!event.getPlayer().isActive()) {
this.users.invalidate(event.getPlayer());
this.usersByUuid.remove(event.getPlayer().getUniqueId(), user);
}
}
public void onDisconnectEvent(DisconnectEvent event) {
final User user = this.users.asMap().remove(event.getPlayer());
this.usersByUuid.remove(user.getUniqueId(), user);
} Now this looks good, or does it? The answer is a no! Consider the following scenario: LoginEvent A fired -> LoginEvent B fired -> LoginEvent B completed -> LoginEvent A completes. Now we have the reference of the old private Cache<Player, User> users; //Weak keys, eviction listener to clear usersByUuid on eviction in case
private ConcurrentMap<UUID, User> usersByUuid;
public void onLoginEvent(LoginEvent event) {
final User user = new User();
if (this.usersByUuid.putIfAbsent(event.getPlayer().getUniqueId(), user) != null) {
event.setResult(ResultedEvent.ComponentResult.denied(Component.text("You are still logged in.")));
return;
}
this.users.put(event.getPlayer(), user);
if (!event.getPlayer().isActive()) {
this.users.invalidate(event.getPlayer());
this.usersByUuid.remove(event.getPlayer().getUniqueId(), user);
}
}
public void onDisconnectEvent(DisconnectEvent event) {
final User user = this.users.asMap().remove(event.getPlayer());
this.usersByUuid.remove(user.getUniqueId(), user);
} Unfortunately after going all that trouble there is still one problem to tackle. What if you are providing an API to retrieve the user by using something like Let's go with the the event based system. If we just fire them like Velocity does its connection events the 3th party plugin also needs to handle these racy conditions for the What if we just bite the bullet and introduce a This does get a fairy bit more complicated when you start to introduce the I/O calls so you most likely want to change your Hopefully this shines some light on what I'm mostly concerned about and my thought process on how to solve that on the plugin side itself. It is not perfect and getting some help and heavy lifting from the proxy side would be nice to have but its understandable if the current design is going to stay, I can live with it. SIDE NOTE: This problem theoretically also exists for other async events but I'm not concerned about those. You can quite easily work this around by using semaphore and comparing the player references if that is absolutely necessary. |
So, for your last code sample, you really should've solved it like this: private Cache<Player, User> users; //Weak keys, eviction listener to clear usersByUuid on eviction in case
private ConcurrentMap<UUID, User> usersByUuid;
public void onLoginEvent(LoginEvent event) {
final User user = new User();
if (this.usersByUuid.putIfAbsent(event.getPlayer().getUniqueId(), user) != null) {
event.setResult(ResultedEvent.ComponentResult.denied(Component.text("You are still logged in.")));
return;
}
this.users.put(event.getPlayer(), user);
if (!event.getPlayer().isActive()) {
this.users.invalidate(event.getPlayer());
this.usersByUuid.remove(event.getPlayer().getUniqueId(), user);
}
}
public void onDisconnectEvent(DisconnectEvent event) {
final User user = this.users.asMap().remove(event.getPlayer());
// only remove from usersByUuid if the user instance equals the one for the specific player
this.usersByUuid.remove(user.getUniqueId(), user);
} Your current code has a bug in the case where But more to the point, I don't have a good answer other than to isolate "online player state" from "offline player state" (which would be available irregardless of whether the player is actually online), and to not solely rely on a player being online to populate its current state, to minimize the blast radius here. That was the approach I wound up taking with the last server I worked with (Mineteria). |
Of course, you could split the difference: private Cache<Player, User> users; //Weak keys, eviction listener
private ProxyServer server;
public void onLoginEvent(LoginEvent event) {
final User user = new User();
this.users.put(event.getPlayer(), user);
}
public void onDisconnectEvent(DisconnectEvent event) {
this.users.invalidate(event.getPlayer());
}
public Optional<User> getOnlinePlayer(final String username) {
return server.getPlayer(username).map(this.users::getIfPresent);
}
public Optional<User> getOnlinePlayer(final UUID uuid) {
return server.getPlayer(uuid).map(this.users::getIfPresent);
} But you're still stuck with separating that "offline player state" either way. |
Ah, thanks. Seems like I forgot it from the disconnection events.
In my case the problem is more that there is no offline data to populate and more that its cleared and saved on disconnection. So when we accept new connection we should have empty state but without truncating the already existing session in case it has not been saved yet due to the delays in the event handling. |
So use the |
I know, hence the issue. There is no way to make this work with 3th party plugins unless you provide your own events and handle the synchronization yourself by blocking the logins and delaying disconnect events as I mentioned previously. If there are no plans to alter the way these events are fired and the information I provided does not change that then we can go ahead and close this issue. I can deal with this myself in the plugin in the best possible way that I can to try to make this work. I'm also happy to contribute any changes to these myself. |
@aromaa any news about this? I think it is quite embarrassing to have to do all that for a simple data loading/unloading of the player in a safe way (which still is not despite the attempts). |
What I understood from the above conversation was that there was not much of interest to make changes to the current behavior. This requires some discussion about how they would like these changes to be made as it does require more complexity to be added to the connection logic. There is also some open questions like would this be a opt-in config option or whatever a plugin can implicitly request this per connection. If this can be opted in implicitly do we want new events or how would plugin request this. |
I looked more closely at the code, and I'd have something to say about this:
As you can see here the If we assume what I'm saying is correct, then moving I also wanted to point out that |
The issue is that there is this small gab where the check can be passed while there is active connection. Think about a scenario where the player connects and the LoginEvent is still being processed, disconnects and then connects again. If the LoginEvent is being still processed due to timing issues then now you have two LoginEvents in flight for the same player. |
Oh, you're right. Then that has to be adjusted too. Maybe the idea of having a duplicate register connection check was to avoid unnecessary calls later. I think these two inconsistencies are a concept that needs to be fixed within Velocity itself rather than handled by a plugin; in my opinion, it is not a concurrency issue that the plugin has to handle. Being in a multi-threaded environment is no excuse for not having an eye for events call logic. I am also in the same situation as you where I need to load and unload player data. Have you considered using the |
That still doesn't fully solve the possible race with disconnect events clearing the data afterwards for a new connections. I stuck with my solution that I described on #1013 (comment) (the last code block) which just prevents new logins in case there is a race as that should be a rare case anyways. One could add a lock that's released on disconnect so you can just wait for the previous connection to be released and then process if you would like to avoid kicking the user. I also have internal events to sync other plugins in between the login and disconnect events. Yes, its a mess and yes I would prefer not to do this but at least it solves all the problems. |
I don't see any issues in applying the Disconnect fix patch you mentioned, by moving the
In this way:
Let me know if I'm wrong 😄 |
The side effect would be that players could receive a message that states that "They are already logged in" which was one of asteis concerns. That could be worked around by checking if the connection is being teared down and just wait for that to finish but this might soft-lock if some plugins misbehaves and never lets the event to complete so that's one downside but I don't think that's a big concern. Also there would be a behavior change in #getPlayer, #getAllPlayers and #getPlayerCount which can be worked around if that's necessary to keep as is. But yes, the behavior would be correct with the disconnect fix patch for this scenario. Its not really perfect but its better than nothing. |
I'm raising this issue to get clarification on what the concurrency guarantees are supposed to be and showing the possible problems you could face with the current behavior due to the current implementations possible racy behavior. I'm focusing here for the cases where the
kick-existing-players
has been turned off as that is the default behavior. However turning that on has multiple additional quirks that could be discussed.Currently we are unregistering the connection on teardown even before the
DisconnectEvent
has been fired as can be seen here. Technically it is now possible for the player to login and trigger events related to post Mojang authorization likePermissionsSetupEvent
orLoginEvent
. This could lead to situations where we initialize some data on login event but the disconnect event ends up incorrectly clearing them. The DisconnectEvent event seems to have note about that "Velocity typically fires this event asynchronously and does not wait for a response." and the current behavior seems somewhat intended but in my opinion this doesn't make sense for post Mojang authorization.The solution is quite simple, move the connection unregistration after the event has fired. But this also has some implications like what if some bad behaving plugin doesn't complete the event? This would block all further logins from the player until the proxy would be restarted but this argument seems moot to me as correctness should be preferred. Another argument could be made that this is very unlikely due to the high delay between user actions. Dealing with this in a plugin would require comparing the actual
Player
references which is very inconvenient.Fixing just the disconnection isn't enough as the connection registration is done after
LoginEvent
as can be seen here. This could result in somewhat similar race where the player logins, cancels it and then reconnects immediately. This is also somewhat unlikely due to the default login rate limiting of 3s.The solution here is to move it to when the
AuthSessionHandler
gets activated. This also has some implications like thegetPlayer
API would return the Player much earlier and plugins might assume all the state has been already initialized at this point which is no longer true. This could be fixed by adding additional conditions where we would not return the player even if its present in the collection. Additionally previously cancelling theLoginEvent
whenkick-existing-players
is set to true would not kick the currently active session but now it would.Well that was a lot! I might have missed something but hopefully that sums up the current state for at least when the
kick-existing-players
is turned off..The text was updated successfully, but these errors were encountered: