
Much slower response time with python 3.7.17 compared to 2.7.18 #884

Open
eduardo-boutiquenumerique opened this issue Apr 24, 2024 · 11 comments


@eduardo-boutiquenumerique

Hi,

I have a couple of sites that currently run under Django 1.11.29 in python 2.7.18 using apache and mod_wsgi. This works great !

I have one specific page with many ajax calls (based on user interaction) and have less than 1 second response time under this configuration.

Someone in my team wanted to switch to nginx+uwsgi, so I tested nginx+uwsgi with python 2.7.18 and the response times for those calls were over 2 seconds. I don't know why moving from apache+mod_wsgi to nginx+uwsgi made things slower, but I obviously kept apache+mod_wsgi.

Now I have ported the code to python 3.7.17, still using django 1.11.29, and my first, obvious choice was apache+mod_wsgi. It turns out that with python 3.7.17 my response times are over 3 seconds!

Testing nginx+uwsgi with python 3, I get the same response times that I had with python 2.7 (2 seconds), which is much worse than apache+mod_wsgi with python 2, but much better than apache+mod_wsgi with python 3.7.

I'm reaching out to you since I'm at a loss and I don't know where else I can look for help. I'm using apache in daemon mode with virtualenv (you have already helped me once) and have compiled mod_wsgi according to this doc.

I would prefer to keep apache+mod_wsgi, but I can't afford to multiply my response times by 3 (and sometimes 4).

Do you know, or have an idea, what could be causing this slowdown in performance?

Thanks in advance

@GrahamDumpleton
Owner

There can be so many reasons why you might not see things as being the same, especially if testing was being done on a different machine and not the same original machine.

It would help me comment if I could see the mod_wsgi configuration originally used with Python 2.7, and then the one now being used with Python 3.7.

Also, which Apache MPM is being used and how is the MPM configured in Apache? Plus, how are keep-alive connections configured in Apache?

@eduardo-boutiquenumerique
Author

Hi Graham,

Thanks a lot for your response. Both servers are Linux (Ubuntu), but not the same versions. Both use Apache 2.4.x and I haven't touched the default values. When I run apache2 -V, the "Server MPM:" line is empty on both, but when I do a "ls /etc/apache2/mods-enabled/ | grep mpm", mpm_event.conf and mpm_event.load are returned, so I'm posting the config from those files.

I'm pasting here all the info you asked for:

Python 2:

Ubuntu 20.04.6 LTS

Server version: Apache/2.4.41 (Ubuntu)
Server built:   2024-04-10T17:46:26
Server's Module Magic Number: 20120211:88
Server loaded:  APR 1.6.5, APR-UTIL 1.6.1
Compiled using: APR 1.6.5, APR-UTIL 1.6.1
Architecture:   64-bit
Server MPM:     
Server compiled with....
 -D APR_HAS_SENDFILE
 -D APR_HAS_MMAP
 -D APR_HAVE_IPV6 (IPv4-mapped addresses enabled)
 -D APR_USE_SYSVSEM_SERIALIZE
 -D APR_USE_PTHREAD_SERIALIZE
 -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT
 -D APR_HAS_OTHER_CHILD
 -D AP_HAVE_RELIABLE_PIPED_LOGS
 -D DYNAMIC_MODULE_LIMIT=256
 -D HTTPD_ROOT="/etc/apache2"
 -D SUEXEC_BIN="/usr/lib/apache2/suexec"
 -D DEFAULT_PIDLOG="/var/run/apache2.pid"
 -D DEFAULT_SCOREBOARD="logs/apache_runtime_status"
 -D DEFAULT_ERRORLOG="logs/error_log"
 -D AP_TYPES_CONFIG_FILE="mime.types"
 -D SERVER_CONFIG_FILE="apache2.conf"

KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 5

<IfModule mpm_event_module>
        StartServers                     2
        MinSpareThreads          25
        MaxSpareThreads          75
        ThreadLimit                      64
        ThreadsPerChild          25
        MaxRequestWorkers         150
        MaxConnectionsPerChild   0
</IfModule>

Python 3

Ubuntu 22.04.4 LTS

Server version: Apache/2.4.52 (Ubuntu)
Server built:   2024-04-10T17:45:18
Server's Module Magic Number: 20120211:126
Server loaded:  APR 1.7.0, APR-UTIL 1.6.1
Compiled using: APR 1.7.0, APR-UTIL 1.6.1
Architecture:   64-bit
Server MPM:     
Server compiled with....
 -D APR_HAS_SENDFILE
 -D APR_HAS_MMAP
 -D APR_HAVE_IPV6 (IPv4-mapped addresses enabled)
 -D APR_USE_PROC_PTHREAD_SERIALIZE
 -D APR_USE_PTHREAD_SERIALIZE
 -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT
 -D APR_HAS_OTHER_CHILD
 -D AP_HAVE_RELIABLE_PIPED_LOGS
 -D DYNAMIC_MODULE_LIMIT=256
 -D HTTPD_ROOT="/etc/apache2"
 -D SUEXEC_BIN="/usr/lib/apache2/suexec"
 -D DEFAULT_PIDLOG="/var/run/apache2.pid"
 -D DEFAULT_SCOREBOARD="logs/apache_runtime_status"
 -D DEFAULT_ERRORLOG="logs/error_log"
 -D AP_TYPES_CONFIG_FILE="mime.types"
 -D SERVER_CONFIG_FILE="apache2.conf"

KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 5

<IfModule mpm_event_module>
        StartServers                     2
        MinSpareThreads          25
        MaxSpareThreads          75
        ThreadLimit                      64
        ThreadsPerChild          25
        MaxRequestWorkers         150
        MaxConnectionsPerChild   0
</IfModule>

Everything seems the same to me.

When you ask about the mod_wsgi configuration, I think you mean the config in the VirtualHost sections for both.
So here you go:

Python 2 :

WSGIDaemonProcess project_name python-path=/home/ubuntu/workspace/project_name/:/home/ubuntu/.pyenv/versions/project_name/lib/python2.7/site-packages user=ubuntu processes=3 threads=1
WSGIProcessGroup project_name
WSGIScriptAlias / /home/ubuntu/workspace/project_name/project_name/project/wsgi.py
WSGIApplicationGroup %{GLOBAL}

Python 3 :

WSGIDaemonProcess project_name python-path=/home/ubuntu/workspace/project_name/:/home/ubuntu/.pyenv/versions/project_name-3-7-17/lib/python3.7/site-packages user=ubuntu processes=3 threads=1
WSGIProcessGroup project_name
WSGIScriptAlias / /home/ubuntu/workspace/project_name/project_name/project/wsgi.py
WSGIApplicationGroup %{GLOBAL}

Still everything seems the same to me.

One difference is that on the python 2 server I installed mod_wsgi with the sudo apt install libapache2-mod-wsgi command, while on the python 3 server I used the procedure described here with the mod_wsgi 5.0.0 release.

One thing I noted in my procedure is that I had to manually create the file /etc/apache2/mods-available/wsgi.load with the line LoadModule wsgi_module /usr/lib/apache2/modules/mod_wsgi.so inside it. I then added the line WSGIRestrictEmbedded On after your suggestion, because my errors were not showing up correctly as you explained here. This was a game changer!

Also, I noticed that on the python 2 server I have a file /etc/apache2/mods-available/wsgi.conf which contains the lines

<IfModule mod_wsgi.c>
    # all commented configs here
</IfModule>

But this file was missing on the python 3 server. I've just added it, re-enabled the module, restarted apache and re-tested, but the results are the same (as expected).

Also, I tried commenting out the line WSGIRestrictEmbedded On on the python 3 server and restarting apache, since python 2 doesn't have it, but it didn't help either.

What else could it be? What other info do you need?

Thanks a lot for taking the time to troubleshoot this issue.

EDIT: I found this issue on StackExchange where apparently there is a bug in ubuntu 22.04, not necessarily related to apache/mod_wsgi but to NFS. Is it possible that it's related to my issue?

Best regards

@eduardo-boutiquenumerique
Author

Hi Graham,

I have done more tests and it's really strange.

I have tested apache+mod_wsgi with python 3.7.17 on ubuntu 20.04 (the older version, the same that currently runs python 2.7.18) and on ubuntu 23.10 (the newest version, but not LTS).

On the newest version, 23.10, the ajax calls' response times are about 2.5 s (better than on the latest LTS version, 22.04).
On the older version, 20.04, the ajax calls' response times are about 2 s (better than on the newer versions of ubuntu).

I'm running on aws, same EC2 instance types, same regions and availability zones.

So far, regarding apache+mod_wsgi with python 3.7.17, the best result I got was with the older ubuntu version.

Now I have also tested nginx+uwsgi in both versions.

On 23.10, it was like the others, around 2 seconds.
On 20.04, it was much better, around 1 s. That's a little slower than python 2.7.18 with apache+mod_wsgi, but it's the fastest I got with python 3.7.17.

Let me know your thoughts.

Best

@GrahamDumpleton
Owner

How are you measuring the response times, and where? Do you have a monitoring system in place to continually capture metrics over time?

The mod_wsgi module has inbuilt capabilities to capture run times for individual requests, separating time spent by Apache routing the request through to the mod_wsgi daemon process from time spent in the actual Python WSGI application, as well as other information. This can be used to determine whether it is really a Python WSGI application issue, an Apache configuration issue, or a strange interaction with the host OS.

Worst case, you could just have this metric information dumped to the Apache error logs to try to validate where time is spent; best case, if you are already using a metrics database like InfluxDB, you could record the metrics in it and then analyse or chart them with Grafana.

BTW, does your Python WSGI application, in handling a request, talk to any backend services such as a database, where time could be getting consumed?
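As a simpler first step, Apache's own access log can record per-request service time via the %D format specifier of mod_log_config (time taken to serve the request, in microseconds). A minimal sketch, assuming the usual Ubuntu layout; the format name combined_with_time is an arbitrary choice:

```apache
# Combined log format plus %D: microseconds taken to serve the request.
LogFormat "%h %l %u %t \"%r\" %>s %b %D" combined_with_time
CustomLog ${APACHE_LOG_DIR}/access.log combined_with_time
```

Note this measures the whole Apache-side handling, so it does not by itself split out where the time goes; the mod_wsgi events described below do.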

@eduardo-boutiquenumerique
Author

I'm measuring it in a very basic manner, on the front-end side: I'm looking at Chrome's dev console to see response times.

So far I'm using the standard apache log, which does not record time spent. How can I include this timing info?

When using nginx+uwsgi, uwsgi logs its processing time by default, and the numbers are very similar to (and obviously a bit smaller than) the ones I'm getting on the front end. So I know nginx is not adding a lot to it.

I do access an external db, but its access time should not depend on the webserver and WSGI connector used, correct? Time is consumed there equally regardless of the webserver+WSGI connector.

@GrahamDumpleton
Owner

It has been a while since I have used this, so I don't remember the specific details of what it outputs, and I can't test the code right now as I don't have access to my normal computer while travelling, but can you add this code to the WSGI script file containing the entry point for your WSGI application:

import mod_wsgi

def event_handler(name, **kwargs):
    if name == 'request_finished':
        print("REQUEST-FINISHED", kwargs)

mod_wsgi.subscribe_events(event_handler)

Post the output from the logs for a couple of requests and I will explain what the information is telling you.

@eduardo-boutiquenumerique
Author

Here you go:

[Sun Apr 28 23:08:52.979445 2024] [wsgi:error] [pid 28443:tid 126295335753280] [remote 172.31.40.121:39532] REQUEST-FINISHED {'request_id': 'X+4Ii5YWfG4', 'input_reads': 0, 'input_length': 0, 'input_time': 0.0, 'output_length': 124431, 'output_writes': 1, 'cpu_user_time': 2.439997, 'cpu_system_time': 0.09912400000000002, 'output_time': 0.004037, 'application_finish': 1714338532.979374, 'application_time': 2.808935, 'request_data': {}}
[Sun Apr 28 23:09:17.095141 2024] [wsgi:error] [pid 28441:tid 126295335753280] [remote 172.31.40.121:8406] REQUEST-FINISHED {'request_id': 'A9F2jBYSfG4', 'input_reads': 1, 'input_length': 239, 'input_time': 5.3e-05, 'output_length': 71463, 'output_writes': 1, 'cpu_user_time': 2.5864070000000003, 'cpu_system_time': 0.066882, 'output_time': 0.003764, 'application_finish': 1714338557.095029, 'application_time': 2.942438, 'request_data': {}}
[Sun Apr 28 23:09:30.766794 2024] [wsgi:error] [pid 28442:tid 126295335753280] [remote 172.31.40.121:51646] REQUEST-FINISHED {'request_id': 'AtdGjTYQfG4', 'input_reads': 1, 'input_length': 297, 'input_time': 5.1e-05, 'output_length': 71545, 'output_writes': 1, 'cpu_user_time': 2.5978950000000003, 'cpu_system_time': 0.08508900000000001, 'output_time': 0.000315, 'application_finish': 1714338570.766719, 'application_time': 2.976204, 'request_data': {}}
[Sun Apr 28 23:11:32.526071 2024] [wsgi:error] [pid 28443:tid 126295335753280] [remote 172.31.40.121:25962] REQUEST-FINISHED {'request_id': 'v3ullKYVfG4', 'input_reads': 1, 'input_length': 298, 'input_time': 1.9e-05, 'output_length': 87483, 'output_writes': 1, 'cpu_user_time': 2.1234280000000005, 'cpu_system_time': 0.02854499999999996, 'output_time': 0.000481, 'application_finish': 1714338692.525987, 'application_time': 2.454169, 'request_data': {}}
[Sun Apr 28 23:11:41.607037 2024] [wsgi:error] [pid 28441:tid 126295335753280] [remote 172.31.40.121:26852] REQUEST-FINISHED {'request_id': '/YIwldYSfG4', 'input_reads': 1, 'input_length': 299, 'input_time': 1.8e-05, 'output_length': 87492, 'output_writes': 1, 'cpu_user_time': 2.105351, 'cpu_system_time': 0.01347799999999999, 'output_time': 0.003815, 'application_finish': 1714338701.606966, 'application_time': 2.423794, 'request_data': {}}
[Sun Apr 28 23:11:46.811229 2024] [wsgi:error] [pid 28442:tid 126295335753280] [remote 172.31.40.121:26852] REQUEST-FINISHED {'request_id': 'KLJ/lZYVfG4', 'input_reads': 1, 'input_length': 300, 'input_time': 1.5e-05, 'output_length': 89669, 'output_writes': 1, 'cpu_user_time': 2.1134270000000006, 'cpu_system_time': 0.012809999999999988, 'output_time': 0.004072, 'application_finish': 1714338706.811162, 'application_time': 2.43856, 'request_data': {}}
[Sun Apr 28 23:11:54.261513 2024] [wsgi:error] [pid 28443:tid 126295335753280] [remote 172.31.40.121:26852] REQUEST-FINISHED {'request_id': 'mDHvlcYVfG4', 'input_reads': 1, 'input_length': 371, 'input_time': 1.9e-05, 'output_length': 88319, 'output_writes': 1, 'cpu_user_time': 2.1299459999999995, 'cpu_system_time': 0.03730899999999998, 'output_time': 6.4e-05, 'application_finish': 1714338714.26146, 'application_time': 2.581733, 'request_data': {}}

The application_time seems to be the time mod_wsgi spends on the request; it's pretty consistent with what I get on the frontend.

If this is the case, as far as I understand, the issue is not with apache itself but somewhere within mod_wsgi. Can we get more details of what's happening inside mod_wsgi?
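For reference, the pasted records can be summarized with a short script. This is only a sketch: parse_request_finished is a hypothetical helper that assumes the log lines keep the exact `REQUEST-FINISHED {...}` shape shown above.

```python
import ast

def parse_request_finished(line):
    """Extract the kwargs dict from a REQUEST-FINISHED log line."""
    marker = "REQUEST-FINISHED "
    idx = line.find(marker)
    if idx == -1:
        return None
    # The kwargs are printed as a Python dict literal, so literal_eval works.
    return ast.literal_eval(line[idx + len(marker):])

# One of the pasted records, trimmed to the fields used below.
sample = ("[Sun Apr 28 23:08:52.979445 2024] [wsgi:error] REQUEST-FINISHED "
          "{'application_time': 2.808935, 'cpu_user_time': 2.439997}")
data = parse_request_finished(sample)

# cpu_user_time accounts for most of application_time, i.e. the request
# is CPU bound rather than waiting on I/O.
cpu_share = data['cpu_user_time'] / data['application_time']
```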

@GrahamDumpleton
Owner

That event doesn't capture some of the details I was hoping for.

Can you change it to:

import mod_wsgi

def event_handler(name, **kwargs):
    if name == 'request_started':
        print("REQUEST-STARTED", kwargs)
    elif name == 'request_finished':
        print("REQUEST-FINISHED", kwargs)

mod_wsgi.subscribe_events(event_handler)

and post some sample output again.

That said, since cpu_user_time is a large chunk of application_time, it really is taking that long to do the work, and it is a pretty CPU-intensive task, not just waiting on I/O.

So far the only comments I can make are:

  • Initially Python 3 was a bit slower than Python 2. I am not aware of whether performance has improved over time such that Python 3 is now faster than Python 2, so I'm not sure if you still have to be careful comparing Python 2 to 3. It may also depend on the performance of third-party Python packages when run on Python 3 compared to Python 2.

  • uWSGI is known to actually cheat in some ways. In its default configuration it used to disable multithreading support in Python, so that if you did basic tests with a single request at a time it would look very fast, especially in comparison to mod_wsgi, which always has multithreading enabled in the interpreter. If, however, you then configured uWSGI so that it properly handled concurrent requests and multithreading, performance would drop and be comparable to mod_wsgi.

I don't know if uWSGI ever changed its defaults, and since I don't know how you are configuring uWSGI, I can't say whether your better performance is due to that trick it uses to make itself look better, or to other things it does around CPU affinity to ensure requests land on different CPUs where possible.

Anyway, post the REQUEST-STARTED data as well and we can at least confirm that Apache queueing/processing time is not an issue.

BTW, what sort of CPU intensive task are you doing?

@eduardo-boutiquenumerique
Author

Thanks for your reply.

I'm attaching the log file with the output you requested here: mod_wsgi.log.

Regarding your other questions, this is my uWSGI config:

[uwsgi]
strict = true

# process-related settings
master = true
processes = 5
enable-threads = true
# we need lazy-apps or py-call-uwsgi-fork-hooks otherwise sentry raises an error.
# py-call-uwsgi-fork-hooks raises other errors, so we went with lazy-apps
lazy-apps = true
threads = 1
max-requests = 5000
vacuum = true
die-on-term = true
single-interpreter = true
need-app = true

chdir = /home/ubuntu/workspace/project_name
home = /home/ubuntu/.pyenv/versions/project_name-3-7-17
logto2 = /home/ubuntu/workspace/project_name/%N/log/uwsgi.log
module = %N.project.wsgi:application

uid = ubuntu
gid = ubuntu
env = DJANGO_SETTINGS_MODULE=%N.project.settings
env = DJANGO_CONFIGURATION=Production

safe-pidfile = /run/uwsgi/%N.pid
socket = /run/uwsgi/%N.sock
chmod-socket = 664

stats = /tmp/uwsgi_stats_%N.sock

Those ajax requests calculate prices for a product based on many different criteria. Actually, there is not a lot of I/O, as you mentioned, because everything is cached using lru_cache: the first requests hit the db to get the coefficients, and after that they are served from the cache. The response times I've been mentioning since the beginning are, in all cases, not for the first requests, but measured after everything is cached (or at least it should be).
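For reference, a minimal sketch of the caching pattern described above; the names and coefficient values are hypothetical, not the actual project code:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def get_coefficients(product_id):
    # Hypothetical stand-in for the DB lookup: the real app fetches
    # pricing coefficients once per key, then serves them from the cache.
    return {"base": 100.0, "markup": 1.25}

def compute_price(product_id, quantity):
    coeffs = get_coefficients(product_id)
    return coeffs["base"] * coeffs["markup"] * quantity
```

One caveat worth keeping in mind: lru_cache is per-process, so with processes=3 in mod_wsgi (or processes=5 in uWSGI) each worker warms its own copy of the cache, and the first request that lands on each process still pays the DB cost.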

BTW, ubuntu 24.04 was just released and I'll definitely give it a try when I upgrade my python version to 3.11 or 3.12 in the coming months.

Let me know your thoughts.

Thanks a lot for your help and time. I really appreciate it.

Best regards
Eduardo

@GrahamDumpleton
Owner

In REQUEST-STARTED you will see:

REQUEST-STARTED {
  'request_id': 'EcXJbArZQG4',
  'thread_id': 1,
  'daemon_connects': 1,
  'daemon_restarts': 0,
  'request_start': 1714423920.706833,
  'queue_start': 1714423920.706965,
  'daemon_start': 1714423920.707038,
  'application_start': 1714423920.70718
}

The meaning of time values are:

  • request_start - When Apache accepts the request on the socket connection in the Apache child worker process.
  • queue_start - When the mod_wsgi code first attempts to connect to the mod_wsgi daemon process socket from the Apache child worker process.
  • daemon_start - When proxied connection from the Apache child worker process is accepted in the mod_wsgi daemon process.
  • application_start - When the WSGI application is called.

Thus:

  • (1) Apache child worker process request handler setup time = queue_start - request_start
  • (2) How long the request was stuck waiting for a daemon process to be ready to handle it = daemon_start - queue_start
  • (3) Daemon process request handler setup time = application_start - daemon_start

So in this case:

  • (1) 0.000132 s (132 µs)
  • (2) 0.000073 s (73 µs)
  • (3) 0.000142 s (142 µs)

If (2) were significant, it would indicate a backlog where there is not enough capacity in the daemon processes to handle the request load.

If (3) were significant, it would suggest that WSGI application code/data is being loaded for the first time because the process/thread had not previously handled a request.

Anyway, that is the meaning of those values, if you want to capture them and report them into some metrics tracking system.
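Computing the three intervals could look like this; derived_timings is a hypothetical helper you would call with kwargs from inside the request_started branch of the event handler shown earlier:

```python
def derived_timings(event):
    """Split a request_started event dict into the three intervals above."""
    return {
        'worker_setup': event['queue_start'] - event['request_start'],       # (1)
        'daemon_queue': event['daemon_start'] - event['queue_start'],        # (2)
        'daemon_setup': event['application_start'] - event['daemon_start'],  # (3)
    }

# Timestamps from the sample REQUEST-STARTED event shown above.
sample = {
    'request_start': 1714423920.706833,
    'queue_start': 1714423920.706965,
    'daemon_start': 1714423920.707038,
    'application_start': 1714423920.70718,
}
timings = derived_timings(sample)
```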

One thing I would double-check is what the overall response time is with Sentry disabled and the Sentry wrapper not applied. If it is only doing error capture and no metrics it should not really affect anything, but to be safe I would make sure.

@eduardo-boutiquenumerique
Author

I've tested without Sentry and the results are the same.
I'll let you know if the results with ubuntu 24.04 and a newer python version are better (in a few weeks/months)
Thanks a lot
