Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race condition in handling plugins packets #1665

Open
Spasitjel opened this issue Feb 4, 2024 · 3 comments · May be fixed by #1673
Open

Race condition in handling plugins packets #1665

Spasitjel opened this issue Feb 4, 2024 · 3 comments · May be fixed by #1673

Comments

@Spasitjel
Copy link

I first noticed this with ClassicUO 1.0.0.0, I was developing an autolooter with Razor Enhanced 0.8.2, and for debugging reasons I was printing item properties both in System messages and Overhead messages (with basically no pause).

With that script it seems I can cause several different issues in ClassicUO:

  1. 1-2 Messages are printed, then ClassicUO crashes due to an error in creating a Texture2D, basically the same error as: Texture2D creation failed! Error Code: Par�metro incorreto. (0x80070057) #1616

I looked a bit into this, I think the logic in ClassicUO is missing some safety checks to avoid the crash, but in theory the safety check would have to result in a disconnect anyway (because I think the packet is being read from corrupted buffer).

  1. Few messages are printed, some are corrupted (random Japanese characters) and then a crash. This one was similar to: System.ArgumentException: Destination is too short #1618, but it was actually inside SetCapacity. It was trying to copy the old buffer over the new, but with a Span of size bigger than the old buffer.

  2. Corrupted characters, the script end up "working" (it moves some selected items from a container to another), but then looking at the console, it's spamming "need more data" for some random but fixed packet ID.

After the whole day behind this I've been able to catch ClassicUO in this state:
image

While it cannot be seen for the CUO_MAIN_THREAD which instance of a circular buffer is being Dequeued from, I can see from the debugger that it's using _pluginsBuffer, which is also being used by that other thread, which is Enqueueing data.

While there's this lock in OnPluginRecv:

lock (PacketHandlers.Handler)
{
PacketHandlers.Handler.Append(data.AsSpan(0, length), true);
}

There is no lock in Update:

var packetsCount = PacketHandlers.Handler.ParsePackets(UO.World, data);

Resolution

I'm still not sure yet, to be fair I just recently digged in to the code; slapping a lock in GameController::Update might work, but I see that there are other accesses to PacketHandler.Handler that are not protected, and I wonder if it's a bigger problem of having to restructure the flow of how handlers are accessed, because it's not always enough to lock the resource accessed concurrently; there might be other surrounding code that has to behave atomically.

Furthermore ParsePackets(CircularBuffer stream, bool allowPlugins) locks the CircularBuffer itself, but I'm not sure where else is a concurrent access that it's using or might use that lock to properly synchronize access.

@Spasitjel
Copy link
Author

By the way just for test, I just tried to put the lock in the game controller:

            int packetsCount;
            lock(PacketHandlers.Handler) {
                packetsCount = PacketHandlers.Handler.ParsePackets(data);
            }
            NetClient.Socket.Statistics.TotalPacketsReceived += (uint)packetsCount;
            NetClient.Socket.Flush();

And now it doesn't misbehave anymore.

@Spasitjel
Copy link
Author

Spasitjel commented Feb 19, 2024

Unfortunately while the change above definitely improves the stability of using the client with an assistant, and especially using scripts, I just stumbled upon a new issue, a deadlock, that happens when this lock is added.

I'm not sure right now if it's something that the plugin does wrong or the client, but basically when in Razor Enhanced you configure 2 profiles and link them each to their character, if you start the client, and before login, in Razor Enhanced the profile selected is the wrong one, when you login and select in the client the other character, then you get a deadlock.

These are the call stacks (a Worker Thread):

[Waiting on lock owned by Thread 26036, double-click or press enter to switch to thread]
ClassicUO.exe!ClassicUO.Network.Plugin.OnPluginRecv(ref byte[] data, ref int length) Line 755
	at D:\Development\ClassicUO\src\ClassicUO.Client\Network\Plugin.cs(755)
RazorEnhanced.exe!Assistant.ClassicUOClient.SendToClient(Assistant.Packet p) Line 524
	at D:\a\RazorEnhanced\RazorEnhanced\Razor\Client\ClassicUO.cs(524)
RazorEnhanced.exe!Assistant.Filters.LightFilter.OnDisable() Line 63
	at D:\a\RazorEnhanced\RazorEnhanced\Razor\Filters\Light.cs(63)
RazorEnhanced.exe!Assistant.Filters.Filter.DisableAll() Line 54
	at D:\a\RazorEnhanced\RazorEnhanced\Razor\Filters\Filter.cs(54)
RazorEnhanced.exe!RazorEnhanced.Profiles.ProfileChange(string name) Line 421
	at D:\a\RazorEnhanced\RazorEnhanced\Razor\RazorEnhanced\Profiles.cs(421)
RazorEnhanced.exe!Assistant.PacketHandlers.LoginConfirm.AnonymousMethod__1(Assistant.MainForm s) Line 1182
	at D:\a\RazorEnhanced\RazorEnhanced\Razor\Network\Handlers.cs(1182)
[External Code]
RazorEnhanced.exe!Assistant.MainForm.WndProc(ref System.Windows.Forms.Message msg) Line 9440
	at D:\a\RazorEnhanced\RazorEnhanced\Razor\UI\Razor.cs(9440)
[External Code]
RazorEnhanced.exe!Assistant.ClassicUOClient.RunTheUI() Line 276
	at D:\a\RazorEnhanced\RazorEnhanced\Razor\Client\ClassicUO.cs(276)
RazorEnhanced.exe!Assistant.ClassicUOClient.RunUI.AnonymousMethod__63_0() Line 280
	at D:\a\RazorEnhanced\RazorEnhanced\Razor\Client\ClassicUO.cs(280)
[External Code]

Main Thread:

[External Code]
RazorEnhanced.exe!Assistant.UI.Ext.SafeAction<Assistant.MainForm>(Assistant.MainForm control, System.Action<Assistant.MainForm> action) Line 16
	at D:\a\RazorEnhanced\RazorEnhanced\Razor\UI\Ext.cs(16)
RazorEnhanced.exe!Assistant.PacketHandlers.LoginConfirm(Assistant.PacketReader p, Assistant.PacketHandlerEventArgs args) Line 1182
	at D:\a\RazorEnhanced\RazorEnhanced\Razor\Network\Handlers.cs(1182)
RazorEnhanced.exe!Assistant.PacketHandler.ProcessViewers(System.Collections.Generic.List<Assistant.PacketViewerCallback> list, Assistant.PacketReader p) Line 213
	at D:\a\RazorEnhanced\RazorEnhanced\Razor\Network\PacketHandler.cs(213)
RazorEnhanced.exe!Assistant.PacketHandler.OnServerPacket(int id, Assistant.PacketReader pr, Assistant.Packet p) Line 133
	at D:\a\RazorEnhanced\RazorEnhanced\Razor\Network\PacketHandler.cs(133)
RazorEnhanced.exe!Assistant.ClassicUOClient.OnRecv(ref byte[] data, ref int length) Line 308
	at D:\a\RazorEnhanced\RazorEnhanced\Razor\Client\ClassicUO.cs(308)
ClassicUO.exe!ClassicUO.Network.Plugin.ProcessRecvPacket(byte[] data, ref int length) Line 555
	at D:\Development\ClassicUO\src\ClassicUO.Client\Network\Plugin.cs(555)
ClassicUO.exe!ClassicUO.Network.PacketHandlers.ParsePackets(ClassicUO.Network.CircularBuffer stream, bool allowPlugins) Line 143
	at D:\Development\ClassicUO\src\ClassicUO.Client\Network\PacketHandlers.cs(143)
ClassicUO.exe!ClassicUO.Network.PacketHandlers.ParsePackets(System.Span<byte> data) Line 91
	at D:\Development\ClassicUO\src\ClassicUO.Client\Network\PacketHandlers.cs(91)
ClassicUO.exe!ClassicUO.GameController.Update(Microsoft.Xna.Framework.GameTime gameTime) Line 442
	at D:\Development\ClassicUO\src\ClassicUO.Client\GameController.cs(442)
FNA.dll!Microsoft.Xna.Framework.Game.Tick() Line 531
	at D:\Development\ClassicUO\external\FNA\src\Game.cs(531)
FNA.dll!Microsoft.Xna.Framework.Game.RunLoop() Line 864
	at D:\Development\ClassicUO\external\FNA\src\Game.cs(864)
FNA.dll!Microsoft.Xna.Framework.Game.Run() Line 414
	at D:\Development\ClassicUO\external\FNA\src\Game.cs(414)

@Spasitjel
Copy link
Author

Spasitjel commented Feb 19, 2024

I should've seen this, but locking Handler is too wide of a lock, but especially it's the same lock which will be used when parsing both the Server->Client buffer and the Plugin->Client buffer.
What's happening is that while parsing a packet in the Server->Client, we also lock the Plugin->Client buffer basically. Furthermore this packet is being handled by the plugin and causes the plugin to attempt to enqueue a packet to the Plugin->Client buffer, through the same lock (on Handler), which is already held by the client.

This means that in general locking Handler it's incorrect when calling ParsePackets or Append. The correct fix should be this instead:

diff --git a/src/ClassicUO.Client/Network/PacketHandlers.cs b/src/ClassicUO.Client/Network/PacketHandlers.cs
index 59daf7b34..373718b32 100644
--- a/src/ClassicUO.Client/Network/PacketHandlers.cs
+++ b/src/ClassicUO.Client/Network/PacketHandlers.cs
@@ -157,7 +157,20 @@ namespace ClassicUO.Network
             if (data.IsEmpty)
                 return;

-            (fromPlugins ? _pluginsBuffer : _buffer).Enqueue(data);
+            if (fromPlugins)
+            {
+                lock (_pluginsBuffer)
+                {
+                    _pluginsBuffer.Enqueue(data);
+                }
+            }
+            else
+            {
+                lock (_buffer)
+                {
+                    _buffer.Enqueue(data);
+                }
+            }
         }

         private void AnalyzePacket(ReadOnlySpan<byte> data, int offset)
@@ -3168,7 +3181,8 @@ namespace ClassicUO.Network
                     facet = p.ReadUInt16BE();
                 }

-                multiMapInfo = Client.Game.MultiMaps.GetMap(facet, width, height, startX, startY, endX, endY);            }
+                multiMapInfo = Client.Game.MultiMaps.GetMap(facet, width, height, startX, startY, endX, endY);
+            }
             else
             {
                 multiMapInfo = Client.Game.MultiMaps.GetMap(null, width, height, startX, startY, endX, endY);
diff --git a/src/ClassicUO.Client/Network/Plugin.cs b/src/ClassicUO.Client/Network/Plugin.cs
index c5a715d2a..c98166be0 100644
--- a/src/ClassicUO.Client/Network/Plugin.cs
+++ b/src/ClassicUO.Client/Network/Plugin.cs
@@ -752,10 +752,7 @@ namespace ClassicUO.Network

         private static bool OnPluginRecv(ref byte[] data, ref int length)
         {
-            lock (PacketHandlers.Handler)
-            {
-                PacketHandlers.Handler.Append(data.AsSpan(0, length), true);
-            }
+            PacketHandlers.Handler.Append(data.AsSpan(0, length), true);

             return true;
         }

I'll test this a bit more again, sorry for the noise.

@Spasitjel Spasitjel linked a pull request Mar 19, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
1 participant