This repository has been archived by the owner on Feb 3, 2021. It is now read-only.

Performance Tracking #130

Open

Snuggle opened this issue Dec 30, 2018 · 5 comments
Labels
Status: Available · Type: Maintenance (non-functional changes: testing, project structure, CI, etc.)

Comments

@Snuggle
Collaborator

Snuggle commented Dec 30, 2018

Feature Request

I think it would be a good idea to add some kind of stress benchmark to Travis and track how different pull requests, commits, etc. change Spacefish's performance as a prompt. We should make sure that Spacefish is as light and fast as possible by default and that it performs well during common tasks like spawning the prompt, switching to a git directory, and switching to a locked directory.

Heck, the perfect thing would be a way to upload each speed benchmark to something like Grafana/InfluxDB and constantly track Spacefish's performance as commits are made.

@matchai
Owner

matchai commented Dec 30, 2018

Sounds like a great idea! I made a local proof of concept for this a little while ago using hyperfine.

The simplest implementation that wouldn't require external services would be to just record performance benchmarks into a file and report the difference on GitHub before overwriting it.
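A minimal sketch of what that could look like with hyperfine (the --warmup and --export-json options are hyperfine's own; the file name and benchmarked command here are placeholders, not the original proof of concept):

# Benchmark prompt rendering and write the results to a JSON file
# that CI can store, diff against the previous run, and then overwrite.
hyperfine --warmup 3 \
    --export-json benchmark.json \
    'fish -c fish_prompt'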

matchai added the Type: Maintenance label Dec 30, 2018
@yozlet

yozlet commented Jan 4, 2019

It's a great idea, but a note of warning: Apparently there's enough noise in performance data from cloud CI services (at least, from Travis) that additional mitigation steps would be necessary. See https://bheisler.github.io/post/benchmarking-in-the-cloud/

(I've not tried this kind of thing myself, but I'd love to find a reliable way to do it.)

@Snuggle
Collaborator Author

Snuggle commented Jan 9, 2019

To quote a paragraph from that page:

One way to reduce noise in this system would be to execute each benchmark suite two or more times with each version of the code and accept the one with the smallest mean or variance before comparing the two. In this case, it would be best to run each benchmark suite to completion before running it again rather than running each test twice consecutively, to reduce the chance that some external influence affects a single test twice.
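With hyperfine, that mitigation could look roughly like this (file names are placeholders, and this assumes hyperfine's JSON export, which reports each command's mean under .results[].mean):

# Run the full suite twice, then keep whichever run has the smaller mean
# before comparing against the stored baseline.
hyperfine --warmup 3 --export-json run1.json 'fish -c fish_prompt'
hyperfine --warmup 3 --export-json run2.json 'fish -c fish_prompt'

set mean1 (jq '.results[0].mean' run1.json)
set mean2 (jq '.results[0].mean' run2.json)
if awk "BEGIN { exit !($mean1 < $mean2) }"
    cp run1.json best.json
else
    cp run2.json best.json
end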

@Snuggle
Collaborator Author

Snuggle commented Mar 25, 2019

@matchai Could I please see this proof-of-concept of yours?

@matchai
Owner

matchai commented Mar 25, 2019

Unfortunately, that proof-of-concept is long gone and was just a small test.
I have since found what I think would be a better way to extract benchmark values from a fish script:

fish -p trace_file -c "fish_prompt"

This will execute the fish_prompt function and write a profile to trace_file, showing how much time was spent in each function. Skip to the line ending with > fish_prompt to find the entry for the prompt itself.
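For example, something like this could pull out the total prompt time in CI (a rough sketch: it assumes the profile is tab-separated Time/Sum/Command columns with times in microseconds, which may differ between fish versions):

fish -p trace_file -c "fish_prompt"
# Print the cumulative time (Sum column) for the top-level fish_prompt call.
awk -F'\t' '/> fish_prompt$/ { print $2 }' trace_file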
We could potentially make a tool similar to next-stats-action which generates per-PR stats, like this: vercel/next.js#6752 (comment).
