
Determine maximum throughput of service request creation API #101

Open
sivachandran opened this issue Mar 19, 2021 · 1 comment

Comments

@sivachandran
Contributor

As part of determining Clamp's capabilities, the maximum throughput of service request creation API needs to be determined

@sivachandran
Contributor Author

Summary

Clamp is able to handle 16K service request creation API calls per second. Higher request rates increase the response time and cause the Gatling script to run out of network ports (ephemeral port exhaustion on the load-generator side).

(Screenshots: Gatling results captured 2021-03-19 at 2:30 PM and 2:32 PM)

Gatling Report

Setup

Two EC2 m4.2xlarge (8 vCPUs, 32GB RAM) instances were used for the testing.

One EC2 instance (named Clamp-Services) ran Clamp and its dependencies. Clamp core was built from the service-req-creation-perf branch. The Clamp runtime environment was set up with Docker Compose using this docker-compose specification.

The other EC2 instance ran this Gatling script. Note that the Clamp IP address in the script needs to be updated to the Clamp EC2 instance's IP.

The security group associated with the Clamp EC2 instance was configured to allow TCP connections to port 8080, i.e., Clamp's listening port.

clamp-core changes

The Clamp branch service-req-creation-perf has the following changes to achieve maximum throughput in the performance test.

  • Disabled GIN's request-duration logging. Any logging in the hot code path reduces throughput.
  • Commented out the lines that validate that the service request's workflow exists. Though this is a required validation, the present implementation does it inefficiently. Workflows never get deleted (no DELETE endpoint is implemented as of this writing). Taking advantage of that, Clamp could maintain an application-level LRU cache of workflows to avoid repeatedly querying the DB for workflow information. Since such a cache (i.e., avoiding DB calls) increases throughput significantly, it is assumed caching will be implemented, and the DB-query-based workflow validation is commented out.
  • Commented out the line that queues the service request to the service request worker. This is a required step in service request execution, but it is implemented in a way that limits the creation API's throughput by the worker queue/channel size. Moreover, the current implementation doesn't remember and resume pending service requests if the Clamp instance is restarted for some reason (e.g., a crash). A more efficient and reliable implementation would identify pending service requests from the DB and schedule them onto idle service request workers. This would avoid queuing from the service request API handler and thus eliminate any throughput limitation caused by queuing. So, the current queuing implementation is commented out in the service request creation API handler to achieve maximum throughput.
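The workflow cache suggested above could be sketched roughly as follows. This is a minimal, thread-safe LRU cache built on Go's standard `container/list`; the `Workflow` struct, names, and capacity are illustrative placeholders, not clamp-core's actual types or API:

```go
package main

import (
	"container/list"
	"fmt"
	"sync"
)

// Workflow is a stand-in for clamp-core's workflow model (illustrative only).
type Workflow struct {
	Name string
}

// lruCache is a minimal thread-safe LRU cache keyed by workflow name.
type lruCache struct {
	mu       sync.Mutex
	capacity int
	order    *list.List               // front = most recently used
	entries  map[string]*list.Element // workflow name -> element holding *Workflow
}

func newLRUCache(capacity int) *lruCache {
	return &lruCache{
		capacity: capacity,
		order:    list.New(),
		entries:  make(map[string]*list.Element),
	}
}

// Get returns the cached workflow and marks it most recently used.
// On a miss the caller would fall back to the DB and Put the result.
func (c *lruCache) Get(name string) (*Workflow, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if el, ok := c.entries[name]; ok {
		c.order.MoveToFront(el)
		return el.Value.(*Workflow), true
	}
	return nil, false
}

// Put inserts a workflow, evicting the least recently used entry when full.
func (c *lruCache) Put(w *Workflow) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if el, ok := c.entries[w.Name]; ok {
		el.Value = w
		c.order.MoveToFront(el)
		return
	}
	if c.order.Len() >= c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.entries, oldest.Value.(*Workflow).Name)
	}
	c.entries[w.Name] = c.order.PushFront(w)
}

func main() {
	cache := newLRUCache(2)
	cache.Put(&Workflow{Name: "wf-a"})
	cache.Put(&Workflow{Name: "wf-b"})
	if _, ok := cache.Get("wf-a"); ok { // wf-a becomes most recently used
		fmt.Println("hit wf-a")
	}
	cache.Put(&Workflow{Name: "wf-c"}) // evicts wf-b, the least recently used
	_, ok := cache.Get("wf-b")
	fmt.Println("wf-b cached:", ok)
}
```

Since workflows are never deleted, stale entries are not a correctness concern here; if a DELETE endpoint is ever added, the cache would also need invalidation.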

core-benchmarking changes

The docker-compose.yml has to be modified to raise Clamp's ulimit (maximum open file descriptors, which bounds concurrent connections). The changes are in this PR.
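A docker-compose ulimit override of this shape is what the change amounts to; the service name, image tag, and limit values below are illustrative, not copied from the actual docker-compose.yml:

```yaml
services:
  clamp:
    # Service name and image are illustrative; see the real docker-compose.yml
    # in the core-benchmarking repo for the actual values.
    image: clamp-core:service-req-creation-perf
    ports:
      - "8080:8080"
    ulimits:
      nofile:        # max open file descriptors, incl. sockets
        soft: 65536
        hard: 65536
```

Without raising nofile, the container's default descriptor limit can cap concurrent TCP connections well below the load the benchmark generates.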
