
Text instead of String throughout Tidal #873

Open
dktr0 opened this issue Nov 26, 2021 · 10 comments

dktr0 (Contributor) commented Nov 26, 2021

If we could use Text instead of String wherever possible throughout the Tidal codebase, that would translate into performance advantages when Tidal code is run in the browser (i.e. in Estuary). Happy to help do this refactor if there's support for it! (I already did the same thing a while ago in the evolution of the Estuary codebase...)
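For illustration, here's a minimal sketch of the kind of change this implies at a call site. The helper name stripSampleIndex is hypothetical, not something from the Tidal codebase:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import           Data.Text (Text)
import qualified Data.Text as T

-- Before: stripSampleIndex :: String -> String
-- After: the same logic over Text (hypothetical helper, not Tidal code).
stripSampleIndex :: Text -> Text
stripSampleIndex = T.takeWhile (/= ':')

main :: IO ()
main = print (stripSampleIndex "bd:3")  -- "bd"
```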

yaxu (Member) commented Nov 26, 2021

Yes, sounds good.
Have a look at the existing benchmarking suite before/after to see what difference it makes.

yaxu (Member) commented Nov 26, 2021

It would be good to do some profiling too. This looks useful:
https://hackage.haskell.org/package/profiterole

When you say help refactor, do you mean do the whole job? :) That would be great if you already have a handle on what is involved. With OverloadedStrings already on for everyone, I expect there won't be any changes to the UI.
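For example (a minimal sketch, not Tidal code): with the extension on, a literal is desugared via fromString, so the same literal can stand for a String or a Text depending on the expected type, and user-facing code shouldn't need to change:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)

-- The same literal, used at two different types.
asString :: String
asString = "bd sn"

asText :: Text
asText = "bd sn"

main :: IO ()
main = print (asString, asText)
```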
I'm interested in optimisation at the moment as I'm working on Tidal on the low-powered Pi Zero. Are there particular concerns for Estuary beyond a general need for optimisation due to running in a JavaScript VM?

dktr0 (Contributor, Author) commented Nov 26, 2021

Yeah, I can take a look at "the whole thing". It might not be such a big job - the thorniest parts will probably be those connected to parsing (e.g. mini-notation), where the gains/benefits are also somewhat smaller (lists are not terrible when you are only looking ahead a few characters at a time). The String vs. Text thing is low-hanging fruit in terms of optimizing for Estuary - Strings in GHCJS (as in GHC) are linked lists (one node per character...), so even something like "Tidal" becomes a whole nested structure of objects within objects, while Text is mapped directly onto an underlying JS string.
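On the parsing side, here's a minimal sketch assuming the mini-notation parser stays on parsec: parsec can consume Text directly via Text.Parsec.Text, so the parser input need not stay a String. The token parser below is a toy, not the actual mini-notation grammar:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import           Data.Text (Text)
import qualified Data.Text as T
import           Text.Parsec (digit, letter, many1, parse, (<|>))
import           Text.Parsec.Text (Parser)  -- parsec's Parser specialised to Text input

-- Toy token parser: individual tokens still come back as [Char] and get packed.
sampleToken :: Parser Text
sampleToken = T.pack <$> many1 (letter <|> digit)

main :: IO ()
main = print (parse sampleToken "" ("bd3" :: Text))  -- Right "bd3"
```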

dktr0 (Contributor, Author) commented Nov 26, 2021

As to other concerns... not really. I think the performance-challenged environment of the browser definitely exposes performance costs that might seem negligible otherwise. I have the impression that one general source of slowdown is situations where "variants" of underlying patterns are calculated and then combined (e.g. in a way which discards much of the information from one or the other of the variants/paths), which adds up to exponential costs when there are several layers' worth of such operations. I'm not sure how much possibility for optimization there is there, though... it might just be the unavoidable cost of such things!

dktr0 (Contributor, Author) commented Nov 26, 2021

I'm starting another project with PureScript... I am curious how much of a difference that will make to the performance of otherwise comparable code. Nothing to report yet, though...

jwaldmann (Contributor) commented Dec 1, 2021

Yes. Put on the iron Shirt, and chase Strings out of Earth.

I was looking into Haskell profiling (and visualising the profile) recently.
https://hackage.haskell.org/package/ghc-prof-flamegraph makes nice pictures.

What is the use case that we want to profile (i.e., where we suspect that Text vs String has an impact)? Can we add this to the performance benchmark?

I was doing this experiment:

```
stack bench tidal:bench:bench-speed --resolver=lts --profile
ghc-flame-graph bench-speed.prof  # produces .svg
profiteur bench-speed.prof  # produces .html
```

and the flame graph is... ah, GitHub does not want to inline SVG here, so I'm dropping the files at https://www.imn.htwk-leipzig.de/~waldmann/etc/Tidal/ (using several options for generating the graph).

[EDIT] I only realize now that you wrote profiterole while I thought profiteur, having read about it in this thread (two answers): https://mail.haskell.org/pipermail/haskell-cafe/2021-November/134902.html. So I applied profiterole as well and copied the files.

dktr0 (Contributor, Author) commented Dec 1, 2021

For me, the use case I mostly care about is Estuary, which has not hitherto been so easy to profile... In any case, one place there where String to Text will probably make a difference is at the transition from Estuary's render engine (which, for Tidal, samples ControlPatterns to produce events) to WebDirt. If every one of those generated maps has Strings in it rather than Text, I think there's a fair bit of memory churn for the browser there. With the caveat that performance will be different in GHCJS in the browser, one could profile the case of sampling ControlPatterns and converting sample names to Text, both in the case where s patterns are fundamentally Pattern String (which needs to be pack-ed) and the case where they are fundamentally Pattern Text (no need to pack). This is a cost that is paid again and again in each moment that patterns are being played (unlike parsing, which is only paid "once" per evaluation).
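As a rough sketch of that kind of measurement with criterion (outside GHCJS, so only indicative; sampleNames is made-up stand-in data, not the output of an actual ControlPattern query):

```haskell
import           Criterion.Main (bench, defaultMain, nf)
import           Data.Text (Text)
import qualified Data.Text as T

-- Made-up stand-in for sample names coming out of a pattern query.
sampleNames :: [String]
sampleNames = concat (replicate 1000 ["bd", "sn", "hh", "arpy", "superpiano"])

sampleNamesT :: [Text]
sampleNamesT = map T.pack sampleNames

main :: IO ()
main = defaultMain
  [ bench "Pattern String (pack each name)" $ nf (T.concat . map T.pack) sampleNames
  , bench "Pattern Text (no pack needed)"   $ nf T.concat sampleNamesT
  ]
```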

yaxu (Member) commented Dec 2, 2021

Yes, beautiful images! I didn't look closely, but it doesn't seem to tell us very much, perhaps because we need more benchmark cases? Or perhaps we need to 'instrument' the code more somehow to give the benchmarking more information...
It would be nice to benchmark based on a 'real world' use of Tidal. Feedforward can do keystroke playback, but I suppose we just need code evaluations + time.
By the way, I tried to run the benchmarks on my laptop via cabal, and at some point it just froze and I had to reboot. I'm not sure if it was memory or CPU that was locked. I'll look again later.

jwaldmann (Contributor) commented Dec 2, 2021

> it doesn't seem to tell us very much, perhaps because we need more benchmark cases?

Yes, that's my impression too. At the top of the profile there's compressArc, but it's not necessarily inefficient - it might just be called very often, by design of the benchmark.

> Or perhaps we need to 'instrument' the code more somehow to give the benchmarking more information...

Automatic instrumentation happens with ghc -prof -auto-all (or similar):
https://downloads.haskell.org/ghc/latest/docs/html/users_guide/profiling.html?highlight=profiling#ghc-flag--fprof-auto
I am not sure whether that's the exact option that was used when stack called ghc to build the executable that produced the above graphs. I will re-check.

> I tried to run the benchmarks on my laptop via cabal, and at some point it just froze

Same here. They eat a lot of memory, and perhaps profiling increases the footprint. I can run them on a machine with 32 GB, but not less. "Just froze" - that's the dreaded "OOM killer" (someone should really use this as the name for their band): when the system runs out of memory, it will kill some process. https://wiki.archlinux.org/title/Improving_performance#RAM,_swap_and_OOM_handling

A mitigation for that is to add +RTS -M15G -RTS options when calling the benchmark executable. If the benchmark eats too much RAM, it will be killed by GHC's RTS before the OS knows about it. I'm not sure how to pass that via stack.
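One way that might work (untested here, and assuming the executable was built with -rtsopts, as in the commands below) is stack's --benchmark-arguments flag:

```
stack bench tidal:bench:bench-speed --profile --benchmark-arguments '+RTS -M15G -RTS'
```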

These are detailed commands to build and run the benchmarks with explicit options:

```
stack --resolver=lts build --bench --no-run-benchmarks --profile --ghc-options '-fprof-auto -rtsopts -O2'
.stack-work/dist/x86_64-linux/Cabal-3.2.1.0/build/bench-speed/bench-speed +RTS -P -H -M32G -RTS
```

The benchmark executable accepts some options (to put after -RTS), e.g. -m glob for selecting a subset of benchmarks, but I can't get this working (cf. haskell/criterion#101 (comment)).

Back to the original topic (Text vs. String), meant to be run "in the browser": it says here that stack can be configured to use GHCJS (https://docs.haskellstack.org/en/v1.4.0/ghcjs/), so the same command for running the benchmarks (on node) should work. Did anyone try? I gave up on installing GHCJS, cf. ghcjs/ghcjs#819.

yaxu (Member) commented Sep 29, 2022

I'm not sure if this will make a huge difference, as there isn't much concatenation of strings in Tidal... But I'm engaged in a fairly broad rewrite of Tidal at the moment, so this could be a good moment to switch to Text.
