Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ColyseusRoom instance occassionally becoming null #164

Open
tonygiang opened this issue Jun 30, 2021 · 3 comments
Open

ColyseusRoom instance occassionally becoming null #164

tonygiang opened this issue Jun 30, 2021 · 3 comments

Comments

@tonygiang
Copy link

tonygiang commented Jun 30, 2021

I have just witnessed with my eyes a situation where the ColyseusRoom instance returned by ColyseusClient.JoinById task becomes null at some point after the task finishes running. Prior to this null check, the ColyseusRoom instance was able to run a series of ColyseusRoom.OnMessage calls without exception. This doesn't always happen and cannot be consistently reproduced, thus I suspect a race condition or a similar circumstance.

To be more specific: We saved the ColyseusRoom instance in a ScriptableObject asset that is referenced across multiple scenes. It was created when Scene A is running, then it receives 2 messages from the server, then Scene B is loaded, then a script in Scene B runs that null-checks this ColyseusRoom instance and that's when we found that it was null.

Additional information: It happened again, but this time, after the null-check we tried to await UniTask.WaitUntil(() => roomInstance != null) and it never runs past that. UniTask is a class from this package.

@endel
Copy link
Member

endel commented Jun 30, 2021

Hi @tonygiang, thanks for letting us know! - Have you experienced this in Unity editor, native build, or WebGL?

UniTask can be a clue here. I personally never used it along with Colyseus, but they should be compatible.

@tonygiang
Copy link
Author

Hi @tonygiang, thanks for letting us know! - Have you experienced this in Unity editor, native build, or WebGL?

UniTask can be a clue here. I personally never used it along with Colyseus, but they should be compatible.

This issue has happened with both Unity Editor and native build. We do not target WebGL. This issue happens regardless of whether there is UniTask or not. We detected a null room instance before we even put in a UniTask to await until it stops being null.

@tonygiang
Copy link
Author

Upon diving into the Unity client source code, there's one part that concerns me. This is the full source of ColyseusClient.ConsumeSeatReservation:

        public async Task<ColyseusRoom<T>> ConsumeSeatReservation<T>(ColyseusMatchMakeResponse response,
            Dictionary<string, string> headers = null)
        {
            ColyseusRoom<T> room = new ColyseusRoom<T>(response.room.name)
            {
                Id = response.room.roomId,
                SessionId = response.sessionId
            };

            Dictionary<string, object> queryString = new Dictionary<string, object>();
            queryString.Add("sessionId", room.SessionId);

            room.SetConnection(CreateConnection(response.room.processId + "/" + room.Id, queryString, headers));

            TaskCompletionSource<ColyseusRoom<T>> tcs = new TaskCompletionSource<ColyseusRoom<T>>();

            void OnError(int code, string message)
            {
                room.OnError -= OnError;
                tcs.SetException(new CSAMatchMakeException(code, message));
            }

            void OnJoin()
            {
                room.OnError -= OnError;
                tcs.TrySetResult(room);
            }

            room.OnError += OnError;
            room.OnJoin += OnJoin;

            onAddRoom?.Invoke(room);

#pragma warning disable 4014
            room.Connect();
#pragma warning restore 4014

            return await tcs.Task;
        }

There is something that bothers me (and I'm not very sure about this since this is a bit more low-level than I'm comfortable with, but I still have to throw this theory out here).

So the TaskCompletionSource here supposely will finish its associated Task the first time it has its result set. Makes sense. This happens with the OnJoin callback, which in turn is invoked when the room first receives a message containing ColyseusProtocol.JOIN_ROOM. Now, the part that concerns me is the internal method passed into OnJoin is not self-removing (it only removes OnError) and in the case where a ColyseusProtocol.JOIN_ROOM message is sent twice for the same room (for whatever reason - could be a series of bytes that just happens to match ColyseusProtocol.JOIN_ROOM), what would happen when it runs tcs.TrySetResult(room) again?

Will the internal OnJoin method even exist after ConsumeSeatReservation has ran its course the first time? (can't suppress my C++ instinct on this one)
Could the room reference be disposed and become null the second time OnJoin is called?
Could the null value propagate to the Result property of the tcs.Task when this happen?

I'm just spitballing every possible scenario here. It's so hard to think of another reason why the result of ColyseusClient.JoinById could turn null on its own.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants