
hug.test.call performance issues #878

Open
ianthetechie opened this issue Dec 8, 2020 · 4 comments

@ianthetechie (Contributor)

Hey Tim,

I stumbled across an issue when digging into why our unit test suite was performing poorly. I'm not sure whether this is an issue with hug or falcon, but here's the quick overview. We have a suite of around 1000 tests that we run in pytest. We noticed running times were creeping up to around 30 mins. I initially blamed CI runners and questionable database operations but dug a little deeper with a profiler and discovered that around 75% of running time was spent constructing falcon APIs, of all things...

[profiler screenshot (photo_2020-12-09 01 25 19): call graph showing ~75% of test runtime spent constructing falcon APIs]
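In case it's useful to anyone digging into something similar, here's a minimal, hedged sketch of one way to collect an equivalent profile with the stdlib profiler (not my exact setup; the "tests/" path is just a placeholder):

import cProfile
import pstats

import pytest

# Profile the whole pytest run and dump the stats to a file.
cProfile.run("pytest.main(['tests/'])", "pytest.prof")
# Sort by cumulative time to surface hotspots like falcon API construction.
pstats.Stats("pytest.prof").sort_stats("cumulative").print_stats(25)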

This was, to say the least, a bit surprising. So I dug in a bit more and realized that hug.test.call is creating a new server on every single invocation, which is to say, a few thousand times per test run. TL;DR: I was able to cut runtimes by about 80% by not doing this. As for what to do about it, I'd welcome your input. This may be an issue with falcon, but my gut says this really shouldn't be something hug is doing.
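For context, the hot path boils down to hug.test.call rebuilding the falcon API object for each simulated request, roughly like this (paraphrasing hug's source, so the exact details may differ between versions):

# What hug.test.call effectively does today, once per invocation:
api = API(api_or_module).http.server()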

I have hazarded a guess at the general direction of a fix, but I admit elements of this are quite hacky and I'm almost positive this isn't the right approach. The fixes may be more appropriately applied to the server() method itself, but I tried to keep my changes as surgical as possible and confined to the test framework. We are currently using this code internally. I'd welcome your feedback on this, and I can open a PR with a cleaned-up solution if this is on the right track.

Cheers!

# (The imports and the server_cache definition below were implicit in the
# original snippet; spelled out here so the patch is self-contained.)
import json
from collections.abc import Hashable, Iterable
from urllib.parse import urlencode

import hug.test
from falcon.testing import DEFAULT_HOST, StartResponseMock, create_environ
from hug import output_format
from hug.api import API

server_cache = {}  # cache of constructed falcon servers, keyed by frozen API state


def __freeze(d):
    """Converts a nested structure of objects that are not hashable as-is into something that's probably hashable.

    Not formally tested, but empirically works for this limited use case here.
    """
    if isinstance(d, (str, bytes)):  # strings are iterable; keep them whole
        return d
    elif isinstance(d, dict):
        return frozenset((key, __freeze(value)) for key, value in d.items())
    elif isinstance(d, Iterable):
        return tuple(__freeze(value) for value in d)
    elif isinstance(d, Hashable):  # unlike a hasattr check, this is False when __hash__ is None
        return d
    else:
        raise TypeError(f"Unhashable object: {d}")
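# Quick illustration (added for clarity; not part of the original snippet):
# nested dicts become frozensets of (key, frozen_value) pairs and lists become
# tuples, so the result is hashable and usable as a dict key, e.g.
#
#   __freeze({"/hi": {"GET": ("handler",)}})
#   == frozenset({("/hi", frozenset({("GET", ("handler",))}))})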


def call(
    method,
    api_or_module,
    url,
    body="",
    headers=None,
    params=None,
    query_string="",
    scheme="http",
    host=DEFAULT_HOST,
    **kwargs,
):
    """Simulates a round-trip call against the given API / URL."""
    http = API(api_or_module).http

    # Generate a hashable key that represents what I understand to be the important state
    # of the API that is subject to change.
    key = (
        http.api.future,
        __freeze(http.not_found_handlers),
        __freeze(http.sinks),
        __freeze(http.routes),
    )

    # Cache the resulting falcon server, because these are apparently SUPER
    # expensive to reconstruct every call (to the tune of ~80% of our test running time).
    api = server_cache.get(key)
    if api is None:
        api = server_cache[key] = http.server()

    response = StartResponseMock()
    headers = {} if headers is None else headers
    if not isinstance(body, str) and "json" in headers.get(
        "content-type", "application/json"
    ):
        body = output_format.json(body)
        headers.setdefault("content-type", "application/json")

    params = params if params else {}
    params.update(kwargs)
    if params:
        query_string = "{}{}{}".format(
            query_string, "&" if query_string else "", urlencode(params, True)
        )
    result = api(
        create_environ(
            path=url,
            method=method,
            headers=headers,
            query_string=query_string,
            body=body,
            scheme=scheme,
            host=host,
        ),
        response,
    )
    if result:
        response.data = hug.test._internal_result(result)
        response.content_type = response.headers_dict["content-type"]
        if "application/json" in response.content_type:
            response.data = json.loads(response.data)
    return response


# Swizzle the hug test call method for better performance.
# Turns out ~80% of our test time is just constructing routers and
# ASTs and crap in falcon...
hug.test.call = call
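
One caveat worth flagging (this is my reading of hug/test.py, so treat it as an assumption): the per-method helpers such as hug.test.get and hug.test.post are functools.partial objects bound to the original call at import time, so swapping hug.test.call alone may not speed them up. Rebinding them too would look something like:

from functools import partial

from falcon import HTTP_METHODS

import hug.test

# Point hug.test.get / hug.test.post / ... at the cached-server call as well.
for method in HTTP_METHODS:
    setattr(hug.test, method.lower(), partial(call, method))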
@vytas7 (Contributor) commented Dec 15, 2020

Hey there 👋

I haven't used Hug much, but could this be the same issue as falconry/falcon#1550?

If that is the root cause, it has been fixed; however, the fix is not part of any stable release yet.
@ianthetechie I don't know if Hug fully supports Falcon 3.0 out of the box, so you may run into other issues instead.
But it would be awesome if you could give falcon==3.0.0a3 a spin and check whether it improves your test suite performance.

@mlavin commented Dec 15, 2020

I've noticed similar issues with test suite run times when using hug.test.call and the similar method-based calls. Using falcon==3.0.0a3 dramatically improves the run time (12+ minutes down to ~100 seconds). There does seem to be at least some incompatibility, either in hug or in my usage, because I get a couple of failures as well as deprecation warnings. Thanks for the heads-up about this Falcon issue!

@CaselIT commented Dec 29, 2020

Sorry for going completely off-topic, @ianthetechie, but I'm interested in knowing what tool(s) you used to generate the graph.

@ianthetechie (Contributor, Author)

Sorry, I haven't had a chance to test with the Falcon 3.0 alpha yet, but I haven't forgotten ;)

@CaselIT the graph was generated by the PyCharm profiler.
