Possibly performance regression in the latest versions of locust #2690

Open

morrisonli76 opened this issue Apr 26, 2024 · 9 comments

@morrisonli76

Description

I used to use Amazon Linux 2 as the base OS for my load tests. Because the Python available on that OS is 3.7, the latest Locust I could get was 2.17.0. With 5 c5n.xlarge EC2 instances (4 vCPUs each) as workers, I could spawn 1200 users. The wait_time for the test was set to constant_throughput(1), so a total load of 1200 rps could be achieved.

Recently, I updated the base OS to Amazon Linux 2023, where the Python version is 3.11, so I could use the latest version of Locust, 2.26.0. However, the same setup (5 c5n.xlarge EC2 instances) could no longer provide the desired load: it spawned only about 830 users in total, and the total rps was only around 330 even though the wait_time was still constant_throughput(1). I noticed that the CPU usage of each worker process was already close to 100%.

The server being tested did not change, and the same locustfile was used for both tests. Yet the performance of the two Locust setups is night and day. This looks like a regression.

Here are the packages in the Python 3.11 environment:
Package Version

blinker 1.7.0
Brotli 1.1.0
certifi 2024.2.2
charset-normalizer 3.3.2
click 8.1.7
ConfigArgParse 1.7
Flask 3.0.3
Flask-Cors 4.0.0
Flask-Login 0.6.3
gevent 24.2.1
geventhttpclient 2.2.1
greenlet 3.0.3
idna 3.7
itsdangerous 2.2.0
Jinja2 3.1.3
locust 2.26.0
MarkupSafe 2.1.5
msgpack 1.0.8
pip 22.3.1
psutil 5.9.8
pyzmq 26.0.2
requests 2.31.0
roundrobin 0.0.4
setuptools 65.5.1
urllib3 2.2.1
Werkzeug 3.0.2
zope.event 5.0
zope.interface 6.3

Command line

master side: locust -f /opt/locustfile.py --master
worker side: locust -f - --worker --master-host <master_ip> --processes -1

Locustfile contents

import random

from locust import HttpUser, task, constant_throughput

# generate_event_id() and the custom command line options (pixel_ids,
# verify_cert, event_name, path) are assumed to be defined elsewhere in the
# full locustfile; they are not shown in this excerpt.

class QuickstartUser(HttpUser):
    def on_start(self):
        self.pixel_ids = self.environment.parsed_options.pixel_ids.split(",")
        self.client.verify = True if self.environment.parsed_options.verify_cert.lower() == "true" else False

    @task
    def cloudbridge(self):
        pixel_id = random.choice(self.pixel_ids)
        event_body = {
            "fb.pixel_id": pixel_id,
            "event_id": generate_event_id(),
            "event_name": self.environment.parsed_options.event_name,
            "conversion_value": {
                "value": "9",
                "currency": "USD",
            },
        }
        self.client.post(self.environment.parsed_options.path, json=event_body, name="event")
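        # Closing the session on the next line forces a new TCP + TLS
        # handshake on every task iteration, which becomes relevant in the
        # OpenSSL discussion further down in this thread.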
        self.client.close()

    wait_time = constant_throughput(2)

Python version

3.11

Locust version

2.26.0

Operating system

Amazon Linux 2023

@cyberw
Collaborator

cyberw commented Apr 26, 2024

Hmm... there IS a known performance regression in OpenSSL 3.x (which normally comes with Python 3.12 builds, but maybe your Python build is different somehow?), see #2555

The issue hits tests that close/reopen connections especially hard, as the overhead arises during SSL negotiation.

Can you check to see which ssl version you are running?
python -c "import ssl; print(ssl.OPENSSL_VERSION)"

As a workaround, see if you can run another Python version, or keep connections alive (I know, not as realistic, but better than nothing)
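
For example, a keep-alive variant of the task in this issue would simply drop the explicit close. This is only a sketch reusing the names from the locustfile above (including its generate_event_id() helper), and it assumes the target server allows persistent connections:

    @task
    def cloudbridge(self):
        pixel_id = random.choice(self.pixel_ids)
        event_body = {
            "fb.pixel_id": pixel_id,
            "event_id": generate_event_id(),
            "event_name": self.environment.parsed_options.event_name,
            "conversion_value": {"value": "9", "currency": "USD"},
        }
        self.client.post(self.environment.parsed_options.path, json=event_body, name="event")
        # No self.client.close() here: the underlying requests.Session keeps
        # the connection (and its TLS session) alive across iterations, so
        # the handshake cost is paid only once per user.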

@morrisonli76
Author

Hi, I used Ubuntu 20.04 on Amazon EC2 and managed to install Python 3.10 and the latest Locust.

The CPU usage became low. However, the throughput did not follow the constant_throughput(1) spec: 1500 users gave me less than 800 rps.

Here is my python env:

(locust_env) ubuntu@ip-172-31-10-204:~$ locust -V
locust 2.26.0 from /opt/locust_env/lib/python3.10/site-packages/locust (python 3.10.14)
(locust_env) ubuntu@ip-172-31-10-204:~$ python3.10 -m pip list
Package Version
blinker 1.8.1
Brotli 1.1.0
certifi 2024.2.2
charset-normalizer 3.3.2
click 8.1.7
ConfigArgParse 1.7
Flask 3.0.3
Flask-Cors 4.0.0
Flask-Login 0.6.3
gevent 24.2.1
geventhttpclient 2.2.1
greenlet 3.0.3
idna 3.7
itsdangerous 2.2.0
Jinja2 3.1.3
locust 2.26.0
MarkupSafe 2.1.5
msgpack 1.0.8
pip 24.0
psutil 5.9.8
pyzmq 26.0.2
requests 2.31.0
roundrobin 0.0.4
setuptools 69.5.1
tomli 2.0.1
urllib3 2.2.1
Werkzeug 3.0.2
wheel 0.43.0
zope.event 5.0
zope.interface 6.3

@cyberw
Collaborator

cyberw commented May 10, 2024

Hi! Did you check your ssl version?

python -c "import ssl; print(ssl.OPENSSL_VERSION)"

@morrisonli76
Author

Yes, I did that. In fact, I used Ubuntu 20.04, which uses OpenSSL 1.1.1f, and I updated Python to 3.10. With this setup the CPU usage was lower; however, even with wait_time = constant_throughput(1) for the test user, 1500 users gave me less than 800 rps (as I mentioned in my previous reply). I did not see this issue when I used Locust 2.17.0.

@cyberw
Collaborator

cyberw commented May 11, 2024

What are your response times like? Wait times can only limit throughput, not increase it, so if a task takes more than 1 s to complete, you won't get 1 request/user/s.
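
In other words, constant_throughput(x) only sleeps for whatever remains of the 1/x-second budget after the task finishes, so each user's effective rate is capped by its response time. A minimal sketch of the arithmetic (illustrative numbers, not measurements from this thread):

# constant_throughput(x) waits max(0, 1/x - task_duration) between task
# runs, so a user's effective rate is min(x, 1 / task_duration).
def per_user_rps(target, task_duration):
    return min(target, 1.0 / task_duration)

print(per_user_rps(1.0, 0.5))  # -> 1.0   task faster than budget: target met
print(per_user_rps(1.0, 1.5))  # -> ~0.67 task slower than budget: rate drops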

@morrisonli76
Author

The average response time is less than 700 ms. Also, when I used an older version of Locust (e.g. 2.17.0), I did not have this issue.

@cyberw
Collaborator

cyberw commented May 13, 2024

Hmm... the only thing I can think of is that Amazon is throttling somehow. What if you skip closing the session/connection? Can you see how many DNS lookups are made (using tcpdump or something else)? If you close the session, then maybe there is a new DNS lookup on each task iteration.
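
As a lighter-weight alternative to tcpdump, the lookup cost can be timed directly from a worker. A sketch, where target.example.com is a placeholder for the actual server under test:

import socket
import time

host = "target.example.com"  # placeholder for the server under test

start = time.perf_counter()
socket.getaddrinfo(host, 443)  # the same resolution a new connection would do
print(f"DNS lookup for {host} took {(time.perf_counter() - start) * 1000:.1f} ms")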

@morrisonli76
Author

I can take a look at whether there are new DNS lookups. But with the same target server and the same tests, why did Locust 2.17.0 not have this issue? Was there any major change to the connection logic?

@cyberw
Collaborator

cyberw commented May 13, 2024

Not that I can think of :-/ But does 2.17.0 not exhibit this problem on python 3.11/Amazon Linux 2023?
