Enqueue slower than without #1118
Thanks for this insightful benchmark, @nicholas-a-guerra. First, I've made some minor adjustments to your script to improve its accuracy:
```python
from time import perf_counter
from loguru import logger
import sys


def main():
    base_loguru_times = base_loguru()
    enqueue_loguru_times = enqueue_loguru()
    print_analysis("Base Loguru", base_loguru_times)
    print_analysis("Enqueue Loguru", enqueue_loguru_times)


def base_loguru():
    logger.remove()
    logger.add(sys.stderr, enqueue=False)
    logger.warning("Init")
    times = []
    for _ in range(1000):
        start = perf_counter()
        for _ in range(100):
            logger.warning("Test")
        times.append(perf_counter() - start)
    return times


def enqueue_loguru():
    logger.remove()
    logger.add(sys.stderr, enqueue=True)
    logger.warning("Init")
    times = []
    for _ in range(1000):
        start = perf_counter()
        for _ in range(100):
            logger.warning("Test")
        times.append(perf_counter() - start)
    logger.complete()
    return times


def print_analysis(test_name, times):
    min_time = min(times)
    max_time = max(times)
    avg_time = sum(times) / len(times)
    print('-' * 50)
    print(test_name)
    print(f"Min: {min_time:.4f} | Max: {max_time:.4f} | Avg: {avg_time:.4f}")


if __name__ == "__main__":
    main()
```

With this script, I get these results on my machine (running `$ python bench.py 2>/dev/null`):
```
--------------------------------------------------
Base Loguru
Min: 0.0036 | Max: 0.0054 | Avg: 0.0038
--------------------------------------------------
Enqueue Loguru
Min: 0.0083 | Max: 0.0167 | Avg: 0.0098
```

That's pretty bad: with `enqueue=True`, logging is more than twice as slow. Why is that? The main reason is that every message must first transit through an internal queue before being processed in the background, which adds overhead to each logging call. That being said, I recently refactored this part of the code. Running the same benchmark against the development version gives:

```
$ python bench.py 2>/dev/null
--------------------------------------------------
Base Loguru
Min: 0.0037 | Max: 0.0059 | Avg: 0.0038
--------------------------------------------------
Enqueue Loguru
Min: 0.0037 | Max: 0.0719 | Avg: 0.0045
```

Much better, isn't it? But still slower than the base configuration with `enqueue=False`. This time, a second player comes into play: the GIL. You see, Loguru implements `enqueue=True` with a background worker thread that consumes messages from the queue, and because of the GIL, only one thread can execute Python bytecode at a time. The consequence of the above is that two tasks running in two threads will always be slower than if they were executed in a single thread. The overhead we're observing here with `enqueue=True` is largely the cost of this contention between the main thread and the worker thread. As of today, it's not possible to entirely remove the GIL, although there have been very recent changes in that direction. That's why we shouldn't assume that `enqueue=True` will make logging calls faster: its purpose is to make them non-blocking and safe to use with multiple processes, not to speed them up.
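To make the mechanism concrete, the enqueue pattern can be sketched with a plain `queue.Queue` and a worker thread. This is only an illustration of the general pattern; the names are mine, and this is not Loguru's actual implementation:

```python
import queue
import sys
import threading

# Illustrative sketch of the "enqueue" pattern, not Loguru's actual code:
# the caller only pays for queue.put(); a background worker thread drains
# the queue and performs the (potentially slow) write. Because of the GIL,
# the worker and the main thread still share a single interpreter, so the
# handoff itself has a cost.

log_queue: queue.Queue = queue.Queue()

def _worker() -> None:
    while True:
        message = log_queue.get()
        try:
            sys.stderr.write(message + "\n")
        finally:
            log_queue.task_done()

worker = threading.Thread(target=_worker, daemon=True)
worker.start()

def log(message: str) -> None:
    # Fast path: enqueue only; the actual I/O happens in the worker thread.
    log_queue.put(message)

def complete() -> None:
    # Rough analogue of logger.complete(): block until every message
    # queued so far has been processed by the worker.
    log_queue.join()

for i in range(3):
    log(f"Test {i}")
complete()
```

With `enqueue=False`, the write would instead happen inline in `log()`; the benchmark above is essentially measuring the difference between these two paths.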
Thanks for the immediate and very thorough feedback! All of what you shared clears up several aspects. Especially with the adjustment made to the queue usage, those new benchmark numbers look much more as expected. I've unfortunately had to go down the rabbit hole of logging performance due to some time-sensitive code. But as everyone's already aware, Python itself is probably the first problem when trying to speed things up, not third-party libraries. I thought it was an interesting benchmark nonetheless. It seems in my case there may never be a perfect solution. Perhaps I could even just write to stdout, pipe to a file, and then parse the file in another process. Oh, the joy of no true parallelism in Python... Regardless, that clears up the details on my end. I appreciate the help!
The best you can do is wait for the GIL to be deactivated, or for the critical parts of Loguru to be re-implemented in C. Not sure which will come first... 😬
I understand this library is not focused on performance, but I was just curious whether I'm misunderstanding how enqueue works. Based on my understanding, it essentially just puts the log into a queue and handles it in another thread or async task (maybe?). This should minimize the blocking time of the `logger.info` call or similar. However, interestingly, I'm seeing some odd results when benchmarking with the script above.

When running it, I consistently see results like the first set of numbers shown earlier.
You can see that, oddly enough, enqueue is taking longer than without enqueue. I'm trying to understand what I might be missing, or whether this is a bug. My first thought was that maybe there's some up-front overhead for the enqueue, but that if the handler is very I/O-intensive there would still be a net gain. But then I thought: why would there be overhead for a queue? Surely, no matter what, adding to a queue should be faster than even printing to stdout, right?
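That intuition is easy to check in isolation. A bare `queue.put` is indeed cheap, but a micro-benchmark like the following (names hypothetical, writing to a `StringIO` to keep real I/O out of the picture) only measures the put itself; with `enqueue=True`, each call also pays for thread wake-ups and GIL handoffs, which is where the difference in the full benchmark comes from:

```python
import io
import queue
from timeit import timeit

q: queue.Queue = queue.Queue()
buf = io.StringIO()  # stand-in for a real stream, so no actual I/O cost

# Time 100k bare queue puts vs. 100k bare stream writes.
put_time = timeit(lambda: q.put("Test"), number=100_000)
write_time = timeit(lambda: buf.write("Test\n"), number=100_000)

print(f"queue.put   : {put_time:.4f}s")
print(f"stream.write: {write_time:.4f}s")
```

Whichever of the two wins on a given machine, the gap between them is much smaller than the enqueue overhead measured above, which is consistent with the cost living in the thread coordination rather than in the queue itself.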