Performance impact #38

corbett5 · 2023-04-29T03:55:59Z

I just came across this repository after looking for a solution to this exact problem, very excited! However when I was trying it out every line now goes through get_ipython().run_cell_magic('space', ...). This adds a few layers to every exception backtrace. Not a huge deal. However, this also comes with a pretty severe performance hit. See the following example

%%time
# %%space <first>

result = 0
for i in range(10000000):
    string = f'{10 * i / (3 * i + 7)}{87 * i * i / 258}'
    result += ord(string[i % len(string)])

print(result)

Without the %%space <first> the results are

CPU times: user 19.9 s, sys: 296 ms, total: 20.2 s
Wall time: 22.5 s

With %%space <first> the results are

CPU times: user 51.9 s, sys: 546 ms, total: 52.4 s
Wall time: 55.4 s

The text was updated successfully, but these errors were encountered:

davidesarra · 2023-05-02T21:42:49Z

Hi Ben! Thanks for opening this issue 👍

The performance hit comes from the capability of the code in the space to access not only their own private namespace but also the user namespace, i.e. the notebook namespace. Usually the performance hit isn't noticeable. However, in snippets like the example above it becomes very noticeable as it performs tens of millions of lookups.

At the moment I see three options:

Speed up the current approach

At the cost of being less Pythonic, we could speed up the execution by about 20-25% in the case of the example above.
Isolated Space

We could consider introducing in addition to the existing nested space type an isolated space type that doesn't have access to the user namespace. While it may be a bit fiddly to extend the current API, isolated spaces would have nearly the same performance as code outside spaces.
Leave it as it is

Depending on the workaround available and the rarity of the affected use cases, it may be appropriate to leave it as it. I appreciate that the example provided above may contrived as it is generally avoided having Python loops with very many iterations (in favour of using libraries that move these loops outside Python).

I'd love to hear your opinion especially on 2) and 3). Namely:

When you have a need like in the example above do you need access to the user namespace?
What are the typical use cases where the performance penalty of using a space becomes noticeable?
How often do these use cases come up?
What's your current workaround? Not isolating the code in a space?

Thank you in advance for your feedback! 😄

corbett5 · 2023-05-19T17:38:31Z

Sorry for the long delay.

I think that leaving be is fine. For example if you put the code into a function and then define the function in a cell without a space magic then you get no performance hit. For my use cases, this is pretty much how I would use this magic anyways, define a bunch of global functions, then call them from some namespaces.

That said, I think the documentation should make note of this somewhere, both in terms of the performance hit and the added frames to backtraces.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Performance impact #38

Performance impact #38

corbett5 commented Apr 29, 2023

davidesarra commented May 2, 2023

corbett5 commented May 19, 2023

Performance impact #38

Performance impact #38

Comments

corbett5 commented Apr 29, 2023

davidesarra commented May 2, 2023

corbett5 commented May 19, 2023