Only use flask for debug #1607

david4096 · 2017-03-09T01:58:50Z

~~This PR provides an experimental "run under gunicorn" option to ga4gh_server.~~

This PR runs the server behind the gunicorn WSGI server when running via ga4gh_server. This makes the software much easier to use and more practical for people who don't know what "Apache" or "nginx" are.

This spawns multiple worker processes that can each handle their own requests. This standalone WSGI HTTP Server is well tested with Flask, however, our implementation includes some challenges. The first is that TLS mode won't work directly, and logging needs to be handled differently.

I'd like to get this experimental feature out there for folks to test, I think that by default when running ga4gh_server we would like to run behind gunicorn as opposed to the SimpleHTTPServer. It would allow us to simplify the Apache instructions, and the dockerfile. For apache, one simply sets up a proxy and doesn't need to use mod_wsgi.

(env) ➜  ga4gh-server git:(1548_gunicorn_optional) ✗ python server_dev.py
--------------------------------------------------------------------------------
DEBUG in __init__ [ga4gh/server/network/__init__.py:54]:
Peer already in registry http://1kgenomes.ga4gh.org UNIQUE constraint failed: peer.url
--------------------------------------------------------------------------------
[2017-03-08 17:56:01 +0000] [16971] [INFO] Starting gunicorn 19.7.0
[2017-03-08 17:56:01 +0000] [16971] [INFO] Listening at: http://127.0.0.1:8000 (16971)
[2017-03-08 17:56:01 +0000] [16971] [INFO] Using worker: sync
[2017-03-08 17:56:01 +0000] [16976] [INFO] Booting worker with pid: 16976
[2017-03-08 17:56:01 +0000] [16977] [INFO] Booting worker with pid: 16977
[2017-03-08 17:56:01 +0000] [16978] [INFO] Booting worker with pid: 16978
[2017-03-08 17:56:01 +0000] [16979] [INFO] Booting worker with pid: 16979
[2017-03-08 17:56:01 +0000] [16980] [INFO] Booting worker with pid: 16980
[2017-03-08 17:56:01 +0000] [16981] [INFO] Booting worker with pid: 16981
[2017-03-08 17:56:01 +0000] [16982] [INFO] Booting worker with pid: 16982
[2017-03-08 17:56:01 +0000] [16983] [INFO] Booting worker with pid: 16983
[2017-03-08 17:56:01 +0000] [16984] [INFO] Booting worker with pid: 16984

Close #1548

TODO

~~Verify if auth works in this configuration~~
~~Set up logging to stdout (where are worker logs redirected to?)~~
~~Make gunicorn default production run setting~~

codecov-io · 2017-03-09T02:13:58Z

Codecov Report

Merging #1607 into master will decrease coverage by 0.07%.
The diff coverage is 58.33%.

@@            Coverage Diff             @@
##           master    #1607      +/-   ##
==========================================
- Coverage   86.65%   86.57%   -0.08%     
==========================================
  Files          33       33              
  Lines        7471     7494      +23     
  Branches      926      928       +2     
==========================================
+ Hits         6474     6488      +14     
- Misses        805      813       +8     
- Partials      192      193       +1

Impacted Files	Coverage Δ
ga4gh/server/cli/server.py	`66.03% <58.33%> (-3.97%)`	❌

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ec57524...6f59917. Read the comment docs.

david4096 · 2017-03-10T06:26:27Z

To better demonstrate what this gives us, I ran the locust load testing suite with this locustfile against this PR in two modes. This test is a little better isolated than the previous which used curl. Locust uses gevent to spawn workers at a lower cost on the system.

First with the -d option which uses the flask debug we already ship with. 68 requests/second


"Method"	"Name"	"# requests"	"# failures"	"Median response time"	"Average response time"	"Min response time"	"Max response time"	"Average Content Size"	"Requests/s"
"GET"	"/"	42	1	5000	14854	1254	67302	16007	0.32
"POST"	"/datasets/search"	565	2	2200	5941	3	69022	158	4.36
"POST"	"/features/search"	631	3	2100	6016	3	67749	2	4.87
"POST"	"/featuresets/search"	631	0	2000	5738	5	67694	364	4.87
"POST"	"/readgroupsets/search"	669	1	2000	6138	13	69053	23775	5.16
"POST"	"/reads/search"	6355	19	2000	5938	3	69506	10048	49.02
"None"	"Total"	8893	26	2000	5987	3	69506	9081	68.60

And then with the gunicorn server (running without -d), 130 requests/sec


"Method"	"Name"	"# requests"	"# failures"	"Median response time"	"Average response time"	"Min response time"	"Max response time"	"Average Content Size"	"Requests/s"
"GET"	"/"	18	0	800	796	669	921	16007	0.22
"POST"	"/datasets/search"	747	0	370	514	3	2683	158	9.04
"POST"	"/features/search"	748	0	390	527	3	2630	2	9.05
"POST"	"/featuresets/search"	800	0	390	521	3	2634	364	9.68
"POST"	"/readgroupsets/search"	802	0	440	555	8	2492	23775	9.70
"POST"	"/reads/search"	7645	0	430	557	3	2760	9718	92.51
"None"	"Total"	10760	0	420	549	3	2760	8741	130.20

Note the failures with the builtin flask server, which doesn't handle simultaneous requests well.

david4096 · 2017-03-10T19:37:05Z

If we merge this we should follow it with a PR that improves test coverage under gunicorn mode in test_gestalt.

Tests that involve ensuring an announce request is sent and received will be easier to write, I believe.

david4096 · 2017-03-10T21:39:56Z

I don't want to rush this into the release, but I would love to get feedback!

dcolligan

Generally looks fine as long as we never run gunicorn unless explicitly invoked

dcolligan · 2017-03-13T14:16:49Z

ga4gh/server/cli/server.py

    parser = common_cli.createArgumentParser("GA4GH reference server")
    addServerOptions(parser)
    return parser


+def number_of_workers():
+    return (multiprocessing.cpu_count() * 2) + 1


should probably have a comment here justifying this formula

Lifted from here http://docs.gunicorn.org/en/latest/custom.html?highlight=multiprocessing I'll add a link

dcolligan · 2017-03-13T14:17:30Z

ga4gh/server/cli/server.py

-        help="Don't use the flask reloader")
+        help="Don't use the Flask reloader (for Flask debug)")
+    parser.add_argument(
+        "--debug", "-d", action='store_true', default=False,


the docs say the -g flag is to run gunicorn

Thanks, I started by making running under gunicorn a flag, but I believe the default functionality when running ga4gh_server should be to run in a multiworker environment. The -d will run using the Flask debug server (werkzeug).

dcolligan · 2017-03-13T14:18:57Z

tests/end_to_end/server.py

@@ -194,6 +194,7 @@ def getCmdLine(self):
 python server_dev.py
 --dont-use-reloader
 --disable-urllib-warnings
+-d


best to use the non-shorthand version of the flag here, for clarity in reading this file

dcolligan · 2017-03-13T14:21:26Z

ga4gh/server/cli/server.py

-        host=parsedArgs.host, port=parsedArgs.port,
-        use_reloader=not parsedArgs.dont_use_reloader,
-        ssl_context=sslContext)
+    if parsedArgs.debug:


I don't get why this is called debug ... seems like this should be run_guincorn or something...

david4096 force-pushed the 1548_gunicorn_optional branch 2 times, most recently from ea9874e to 5fac2e3 Compare March 9, 2017 02:05

david4096 added 7 commits March 9, 2017 20:45

gunicorn

268379e

Multiprocessing sets the workers

860f8a1

Add gunicorn requirement

562252b

Refactoring server cli

ba6a8bb

"-g" gunicorn option to server CLI

70f4ceb

Add docs for running the server

55e4b52

Remove -g option, make flask only run from werkzeug in debug mode

52ebcb4

david4096 changed the title ~~Add gunicorn option~~ Only use flask for debug Mar 10, 2017

Add tests and flake fixes

9164eef

david4096 force-pushed the 1548_gunicorn_optional branch from 5fac2e3 to 9164eef Compare March 10, 2017 05:30

Force a gunicorn version

2a31f78

Merge branch 'master' into 1548_gunicorn_optional

6badbb5

david4096 requested review from ejacox, dcolligan and kozbo March 10, 2017 21:39

Merge branch 'master' into 1548_gunicorn_optional

6f59917

dcolligan reviewed Mar 13, 2017

View reviewed changes

david4096 mentioned this pull request Mar 15, 2017

Fail to test Apache deployment with demo dataset #1563

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Only use flask for debug #1607

Only use flask for debug #1607

david4096 commented Mar 9, 2017 •

edited

codecov-io commented Mar 9, 2017 •

edited

david4096 commented Mar 10, 2017 •

edited

david4096 commented Mar 10, 2017 •

edited

david4096 commented Mar 10, 2017

dcolligan left a comment

dcolligan Mar 13, 2017

david4096 Mar 13, 2017

dcolligan Mar 13, 2017

david4096 Mar 13, 2017 •

edited

dcolligan Mar 13, 2017

dcolligan Mar 13, 2017

Only use flask for debug #1607

Are you sure you want to change the base?

Only use flask for debug #1607

Conversation

david4096 commented Mar 9, 2017 • edited

codecov-io commented Mar 9, 2017 • edited

Codecov Report

david4096 commented Mar 10, 2017 • edited

david4096 commented Mar 10, 2017 • edited

david4096 commented Mar 10, 2017

dcolligan left a comment

Choose a reason for hiding this comment

dcolligan Mar 13, 2017

Choose a reason for hiding this comment

david4096 Mar 13, 2017

Choose a reason for hiding this comment

dcolligan Mar 13, 2017

Choose a reason for hiding this comment

david4096 Mar 13, 2017 • edited

Choose a reason for hiding this comment

dcolligan Mar 13, 2017

Choose a reason for hiding this comment

dcolligan Mar 13, 2017

Choose a reason for hiding this comment

david4096 commented Mar 9, 2017 •

edited

codecov-io commented Mar 9, 2017 •

edited

david4096 commented Mar 10, 2017 •

edited

david4096 commented Mar 10, 2017 •

edited

david4096 Mar 13, 2017 •

edited