Logging lock not released when held by OS threads #2013

Open
jaklaassen-affirm opened this issue Jan 22, 2024 · 3 comments

jaklaassen-affirm commented Jan 22, 2024

  • gevent version: 23.9.1, installed from PyPI via Poetry
  • Python version: CPython 3.9.18, installed via pyenv
  • Operating System: macOS Sonoma 14.2

Description:

In a monkey-patched environment, I'm seeing that when the main thread attempts to acquire a logging lock held by an OS thread, it can keep waiting long after the OS thread has released the lock. I have observed the main thread waiting up to 5 hours before successfully acquiring the lock.

I believe this is a complete repro case:

import logging
import time
import gevent
import gevent.monkey

gevent.monkey.patch_all()

pool = gevent.get_hub().threadpool

class PausingHandler(logging.StreamHandler):
    def emit(self, record):
        time.sleep(1)
        return super(PausingHandler, self).emit(record)
    
logging.basicConfig(level=logging.INFO, handlers=[PausingHandler()], format="%(message)s")

logger = logging.getLogger("gevent_test")
logger.info("STARTING")

def log_async(msg):
    while True:
        logger.info(msg)
        time.sleep(1)

def main():
    thread = pool.apply_async(log_async, ["IN_THREAD"])
    greenlet = gevent.spawn(log_async, "IN_GREENLET")
    
    thread.join()
    
if __name__ == "__main__":
    main()

This produces the following output; after the first IN_GREENLET, only the OS thread's messages keep appearing:

IN_GREENLET
IN_THREAD
IN_THREAD
IN_THREAD
...

Following the suggestion in https://www.gevent.org/changelog.html#id30, I added a call to gevent.get_hub() at the top of log_async, which seems to have fixed the problem.
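
For reference, this is what that workaround looks like applied to the repro above; only log_async changes, the rest of the script stays the same:

def log_async(msg):
    # Per the changelog note: give this OS thread its own hub before it ever
    # touches the (monkey-patched) logging lock.
    gevent.get_hub()
    while True:
        logger.info(msg)
        time.sleep(1)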

Thanks for all your hard work on gevent! It's an amazing achievement.

@jamadden
Member

Sorry, I don't have any better solution than that (gevent can't do that itself because of the way locks are used in the interpreter; and besides, multiple hubs are generally not a good thing). I do see a possible improvement in changing from a spinlock to native timeouts (deep in the internals) now that all supported Python versions can do that, but I doubt that would make much of a difference.

Unless otherwise documented, most lock implementations aren't guaranteed to be fair. That holds true for gevent locks, and they're even less fair across threads.

Mixing threads and greenlets and locks is a very very delicate, tricky thing. If at all possible, only do computation in a thread; don't try to interact with shared gevent objects (such as locks) from two different threads. Honestly, IIRC, the fact that I was able to get any of that working at all was something of a minor miracle.
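
Roughly the shape I mean, as a sketch only (nothing here is a blessed gevent pattern, and the function names are invented): hand the OS thread nothing but pure computation, and do all of the logging from the greenlet that called apply():

import gevent.monkey
gevent.monkey.patch_all()

import logging
import gevent

logging.basicConfig(level=logging.INFO, format="%(message)s")
logger = logging.getLogger("gevent_test")
pool = gevent.get_hub().threadpool

def pure_computation(n):
    # Runs in the OS thread: no logging, no locks, no shared gevent objects.
    return sum(i * i for i in range(n))

def worker(n):
    # Runs as a greenlet on the main thread's hub. pool.apply() blocks only
    # this greenlet until the OS thread finishes; the logging lock is then
    # taken here, so it is only ever touched from greenlets on one hub.
    result = pool.apply(pure_computation, (n,))
    logger.info("result for %s = %s", n, result)

gevent.joinall([gevent.spawn(worker, n) for n in (10, 100, 1000)])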

jamadden added a commit that referenced this issue Feb 13, 2024
@jamadden
Member

Hmm. Interestingly, disabling most of gevent's C extensions suddenly makes the threads ping-pong back and forth. Off the top of my head, I'm not sure why that would be.

$ PURE_PYTHON=1 python /tmp/example.py
STARTING
IN_GREENLET
IN_THREAD
IN_GREENLET
IN_THREAD
IN_GREENLET
IN_THREAD
IN_GREENLET
IN_THREAD
IN_GREENLET
IN_THREAD


jaklaassen-affirm commented Mar 2, 2024

Thanks for the response, and sorry to go silent; I've been kinda slammed. Our specific use case is Confluent's Kafka library, which starts threads from C that call back into Python callbacks for logging. I should really look into whether those callbacks can just be invoked from the main thread; that would help gevent users, of which there must be many.
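
For illustration, here is roughly the kind of hand-off I have in mind, as a sketch only: the callback name and signature are made up (not Confluent's actual API), and run_callback_threadsafe is the gevent hook mentioned further down. The foreign thread never touches the logging lock and just schedules the real call onto the main hub's loop:

import gevent.monkey
gevent.monkey.patch_all()

import logging
import gevent

logging.basicConfig(level=logging.INFO, format="%(message)s")
logger = logging.getLogger("kafka_bridge")

# Capture the main thread's hub while we are still on the main thread.
main_hub = gevent.get_hub()

def log_cb(message):
    # Called by the client library from an OS thread it created in C.
    # Don't take the (monkey-patched) logging lock here; just schedule the
    # real logging call onto the main hub's loop and return immediately.
    main_hub.loop.run_callback_threadsafe(logger.info, "%s", message)

# Simulate the foreign thread with the hub's threadpool:
main_hub.threadpool.apply(log_cb, ("hello from an OS thread",))
gevent.sleep(0.1)  # let the main hub run the scheduled callback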

This gave me a chance to dig into the Gevent internals, which was at least fun for me. I see the pain and the problem.

I made a simpler repro case that also reproduces with PURE_PYTHON=1:

import threading
import gevent
import gevent.monkey
gevent.monkey.patch_all()

hub = gevent.get_hub()
pool = hub.threadpool

def thread_fn(evt, lock):
    print("acquire in thread")
    lock.acquire()
    print("evt.wait() in thread")
    while not evt.is_set():
        gevent.sleep(1)
    evt.wait()
    print("lock.release() in thread")
    lock.release()

def greenlet_fn(evt, lock):
    res = lock.acquire()
    print(f"lock.acquire() in greenlet = {res}")

def main():
    evt = threading.Event()
    lock = threading.Lock()
    
    thread = pool.apply_async(thread_fn, [evt, lock])
    gevent.sleep(1)
    print("Starting greenlet")
    greenlet = gevent.spawn(greenlet_fn, evt, lock)
    gevent.sleep(1)
    print("Setting event")
    evt.set()
    thread.join()
    gevent.wait([greenlet], timeout=5)
    print(f"greenlet is ready={greenlet.ready()}")
    
if __name__ == "__main__":
    main()

Adding

lock.acquire()
lock.release()

before gevent.wait([greenlet], timeout=5) makes the greenlet finish, but reading the code it's obvious why: releasing the lock on the main thread notifies the links on the main thread, whereas we could not switch to those links when the notification came from the OS thread.

I do wonder if Semaphore._handle_unswitched_notifications could be written to do what AbstractLinkable._handle_unswitched_notifications does, but rather than calling hub.loop.run_callback_threadsafe(link, self), call hub.loop.run_callback_threadsafe(AbstractLinkable._check_and_notify, self). I can see the potential for an infinite loop there, which might be bad, and of course the possibility of all sorts of problems that I can't see.

“Honestly, IIRC, the fact that I was able to get any of that working at all was something of a minor miracle.”

Now that I've really read the code, I completely and totally agree 😄
