Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed freezing issue in XAsyncSockets lib and updated thread locking #26

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

ElHyperion
Copy link

@ElHyperion ElHyperion commented Feb 25, 2020

I started this pull request to fix an issue I got:
I have a server running with embedded config on an ESP32 which often froze on socket.SendTextMessage(message) which I'm calling once every second with only an 81 byte-long message. I traced down the issue all the way to _socketListRemove where it sometimes hangs forever on socketsList.remove(socket). Since it never returns from this command and stays locked when _socketListAdd wants to use it already, the sockets stay frozen forever.

Returning False if the _opLock is locked fixed the problem for me but it's rather a quick workaround with which I just wanted to point out the issue and discuss the solutions. Before returning False, I tried printing len(socketsList) which always returned 0 on me, so I guess the .remove got processed anyways.

I also remade thread locking using the 'with' statement. Tested and it works the same, but is more elegant (and reliable?)

@jczic
Copy link
Owner

jczic commented Feb 26, 2020

Hello @ElHyperion,
I'm ok to using with instead of acquire and release.
But I don't understand why _socketListRemove can stay locked. It's very strange?
Could you the exact instruction that causes this?
socket.fileno() or the _opLock for another reason?
Thank you!

@ElHyperion
Copy link
Author

ElHyperion commented Feb 26, 2020

After more thorough testing, I think the problem appears with the socket lock I use right before sending text messages. Any recommendations on how to change it? I had it without _socket_lock at first, with the same freezing issue, and thought that adding the lock to it would fix the problem.

UPDATE: Is this two-way communication too much for a single socket, and should I use two sockets instead? Or is the issue somewhere else?

This is XAsyncSockets with added prints:

def _socketListAdd(self, socket, socketsList) :
     if self._opLock.locked() :
         print('Add: cannot lock!', socket.fileno(), len(socketsList))
     with self._opLock :
         print('Add: locked', socket.fileno(), len(socketsList))
         ok = (socket.fileno() in self._asyncSockets and socket not in socketsList)
         if ok :
             print('Add: if ok', socket.fileno(), len(socketsList))
             socketsList.append(socket)
             print('Add: appended', socket.fileno(), len(socketsList))
     print('Add: unlocked')
     return ok

def _socketListRemove(self, socket, socketsList) :
     with self._opLock :
         print('Remove: locked', socket.fileno(), len(socketsList))
         ok = (socket.fileno() in self._asyncSockets and socket in socketsList)
         if ok :
             print('Remove: if ok', socket.fileno(), len(socketsList))
             socketsList.remove(socket)
             print('Remove: removed', socket.fileno(), len(socketsList))
     print('Remove: unlocked')
     return ok

This is the code I use for sending text messages over sockets:

def send_telemetry(data):
    message = '{0:d}{1}'.format(_data_complete, data)
    print('Broadcasting %d B' % len(data))
    with _socket_lock:
        print('Socket lock')
        for s in _sockets:  # Only one socket for now
            s.SendTextMessage(message)
        print('Sent messages')

Remove: locked 60 2
Remove: if ok 60 2
Remove: removed 60 1
Remove: unlocked
Add: locked 60 1
Add: if ok 60 1
Add: appended 60 2
Add: unlocked
Remove: locked 60 6
Broadcasting 81 B
Socket lock ------- Problem here?
Add: cannot lock! 60 0

@jczic
Copy link
Owner

jczic commented Feb 28, 2020

I'm checking all but I don't understand how this _opLock can stay locked 😞
_socketListAdd and _socketListRemove are very called often for I/Os manipulation and this _opLock is really mandatory here.
Doesn't the freezer come with an inter-thread lock?
Do you call mws2.StartManaged() with parllProcCount greater than 1?
Otherwise, could you try with parllProcCount=5 for example?

@ElHyperion
Copy link
Author

ElHyperion commented Feb 29, 2020

I start it with no arguments. Adding parllProcCount=3 and greater results in memory allocation error (I can get only up to 64kB free before starting the server). I tried parllProcCount=2 but the issue continued appearing, only made the website download twice as slow (about 40s instead of 20s). parllProcCount=1 worked fine but did not fix the locking issue either.

Setting parllProcCount=0 seems to fix it but I cannot do that anymore, it just hangs on the StartManaged command forever in this case. I could start it this way before, while testing the proc count, under some unknown circumstances, but I cannot recreate them anymore. I'm not sure if it's a problem of memory though because starting it through REPL and hitting Ctrl+C after the command freezes I get around 44.8kB on gc.mem_free(). The server then appears to run but does not respond to any requests (visiting the website is stuck on loading).

I'll be testing it on a WROVER kit with 8MB SRAM soon, if that helps.

@jczic
Copy link
Owner

jczic commented Mar 4, 2020

Ok.
parllProcCount=1 means that the I/Os events loop of the server only waits once in a dedicated thread and parllProcCount=0, waits in main thread (and main thread is locked until the server is stopped).
So, in both cases, your web site must be works.
But for the moment, I still don't understand your locking problem in XAsyncSockets lib and I would like see your code if it's possible.
Do you can send me it by mail to jczic.bos@gmail.com please?
I'd be really curious to more investigate and understand that.

@kenjican
Copy link

I encountered same situation -- thread is locked. My solution is as below:
1: comment out all the self._opLock.acquire() and self._opLock.release() in XAsyncScokets.py
2: wms.StartManaged(parllProcCount=0)
That is only one main thread and without thead lock

it works.

@jczic
Copy link
Owner

jczic commented Feb 18, 2024

The asynchronous concurrent thread system has just been redesigned. It now uses a battery of workers, so there should be no more problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants