Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

optimisation: some things seem a bit slower since 1.25 #2122

Open
simonmichael opened this issue Dec 2, 2023 · 8 comments
Open

optimisation: some things seem a bit slower since 1.25 #2122

simonmichael opened this issue Dec 2, 2023 · 8 comments
Labels
A-BUG Something wrong, confusing or sub-standard in the software, docs, or user experience. impact4 Affects more than a few users. performance Anything performance-related (run time, memory usage, disk space..) severity2 Minor to moderate usability/doc bug, reasonably easy to avoid or tolerate. tools hledger developer tools, scripts, processes..

Comments

@simonmichael
Copy link
Owner

simonmichael commented Dec 2, 2023

I tend to see this (register much faster, but print and balance a bit slower) in recent releases:

$ make bench BENCHEXES=hledger-1.25,hledger-1.26,hledger-1.31,hledger-1.32 

run quick performance benchmarks in bench.sh (approximate, can be skewed):
Running 5 tests 1 times with 4 executables at 2023-12-02 10:11:59 HST:

Best times:
+----------------------------------------------------++--------------+--------------+--------------+--------------+
|                                                    || hledger-1.25 | hledger-1.26 | hledger-1.31 | hledger-1.32 |
+====================================================++==============+==============+==============+==============+
| -f examples/10000x1000x10.journal print            ||         0.62 |         0.69 |         0.81 |         0.77 |
| -f examples/10000x1000x10.journal register         ||         9.37 |         3.21 |         3.44 |         2.74 |
| -f examples/10000x1000x10.journal balance          ||         0.47 |         0.50 |         0.73 |         0.74 |
| -f examples/1000x1000x10.journal balance --weekly  ||         0.66 |         0.76 |         1.02 |         0.81 |
| -f examples/10000x1000x10.journal balance --weekly ||         9.31 |         9.04 |        10.34 |         9.63 |
+----------------------------------------------------++--------------+--------------+--------------+--------------+

More notably, right now on this machine I can see ~25k TPS with hledger 1.25 stats, but only ~15k TPS with 1.32. A new round of profiling could be in order.

Related:

@simonmichael simonmichael added A-WISH Some kind of improvement request, hare-brained proposal, or plea. performance Anything performance-related (run time, memory usage, disk space..) labels Dec 2, 2023
@simonmichael
Copy link
Owner Author

Are those all compiled with the same GHC version ? Actually no, those are built with 9.0.2, 9.2.2, 9.4.6 and 9.6.3 respectively. So it's not quite a fair test.

They are all ARM binaries, and all optimised builds (I'm fairly certain).

@simonmichael
Copy link
Owner Author

Profiling and some performance fixes were done as part of #2153. Summary: 1.32.3 is much faster than 1.29-1.32, but still slower than 1.25 (which is somewhat expected since it is providing more powerful features in the journal format).

$ for v in 1.25 1.26 1.28 1.29 1.30 1.31 1.32 1.32.3; do printf "$v:\t"; hledger-$v -f examples/10ktxns-10kaccts.journal stats | grep throughput; done
1.25:	Run time (throughput)    : 0.93s (10735 txns/s)
1.26:	Run time (throughput)    : 1.18s (8501 txns/s)
1.28:	Run time (throughput)    : 1.16s (8607 txns/s)
1.29:	Run time (throughput)    : 3.73s (2684 txns/s)
1.30:	Run time (throughput)    : 6.13s (1631 txns/s)
1.31:	Run time (throughput)    : 6.77s (1478 txns/s)
1.32:	Run time (throughput)    : 5.95s (1680 txns/s)
1.32.3:	Run time (throughput)    : 1.60s (6239 txns/s)

@simonmichael
Copy link
Owner Author

See #2153 for some recent benchmarking data. That issue fixed a major performance drop for unusually large numbers of accounts from 1.29 onward. On this one we'll continue discussing the remaining, lesser performance drop since 1.25. This one can be argued as a wish rather than a bug or regression since absence of (reasonable) performance fluctuations has never been promised. I think as part of resolving this issue though we should implement performance regression testing, and try to avoid or at least document any performance drops going forward.

@markokocic
Copy link
Collaborator

Well, it's true that performance was never guaranteed, but I'd still consider it a regression, since hledger became slower, not faster since 1.25. With my ledger file, with 17k transaction and 470 accounts, I get similar results as in your last comment, runtime for balance is somewhere around ~1,5 times slower in git head than in 1.25. It may not be much, but it's enough to cross the subjective threshold of annoyance.

I know that there have been a lot of features added to hledger since 1.25 that explain the general slowdown. But then, there is this philosophical question should a feature add its weight in terms of performance even if not used? Could I get a much faster stripped down version with only basic features for a daily use, and have slower execution only when I actually use some advanced or heavier functions?

As far as I'm concerned, I can keep using 1.25 for the next few years until my journal file grows enough for me to reach the annoyance threshold and have to revisit again. Or split the file and get on with the life.

@simonmichael
Copy link
Owner Author

simonmichael commented Mar 1, 2024

Noted @markokocic, thanks for your input as one of the (few?) users running large >10k-txn files. I agree it's arguable, and it's at least a bug in our process that it took so long to discover. But I feel like letting this one escape the dreaded black label on a technicality. There's no bounty involved since I reported it first so I'm assuming it's harmless to do so. Certainly it'll be considered a regression if it happens again, and certainly I hope we'll find ways to bring performance back to 1.25 level and beyond.

@simonmichael simonmichael added A-BUG Something wrong, confusing or sub-standard in the software, docs, or user experience. tools hledger developer tools, scripts, processes.. and removed A-WISH Some kind of improvement request, hare-brained proposal, or plea. labels Mar 1, 2024
@simonmichael
Copy link
Owner Author

PS any ideas for simple robust automated performance testing ? I'm thinking about:

  • a functional test that saves elapsed time or throughput in a hidden file and fails if it's 10% slower than last time (and passes if there's no save file)
  • also running this as part of CI on github, assuming worker performance variance there isn't a problem
  • also running it on all platforms there as part of preparing release binaries, or on demand
  • possibly saving multiple runs as a log, which could be charted

@simonmichael
Copy link
Owner Author

PS I recently added memory usage to the stats output, that could be interesting to test also.

I experimented with printing CPU time rather than elapsed time but it seems unlikely to be needed.

@simonmichael simonmichael added severity2 Minor to moderate usability/doc bug, reasonably easy to avoid or tolerate. impact4 Affects more than a few users. labels Mar 2, 2024
@markokocic
Copy link
Collaborator

I don't think saving results to file would help, since the run results may vary across runs. Maybe have a script that will do benchmark against ledger, hledger-1.25 and previous release which can be run on demand or on a ci.

Btw, I just checked a new stats command. Maybe you want to split memory data to a separate line for a better readability.

Regarding speed, I finally managed to fix my journal to its format can be read directly by toht ledger and hledger, and here's the quick comparison with the latest from git.

marko@fine ~/s/cvstree> time hledger bal > /dev/null

________________________________________________________
Executed in    3.89 secs    fish           external
   usr time    3.60 secs    1.21 millis    3.60 secs
   sys time    0.31 secs    0.00 millis    0.31 secs

marko@fine ~/s/cvstree> time ledger bal > /dev/null

________________________________________________________
Executed in  382.45 millis    fish           external
   usr time  309.59 millis    1.46 millis  308.13 millis
   sys time   72.52 millis    0.49 millis   72.03 millis

I like hledger usability more, but wish it would be possible to approach ledger balance performance as close as possible :)

Also, I don't care if the ticket is a regression or whoever reported it, at least not in a sense of bounty, I'd only wish some performance back.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-BUG Something wrong, confusing or sub-standard in the software, docs, or user experience. impact4 Affects more than a few users. performance Anything performance-related (run time, memory usage, disk space..) severity2 Minor to moderate usability/doc bug, reasonably easy to avoid or tolerate. tools hledger developer tools, scripts, processes..
Projects
None yet
Development

No branches or pull requests

2 participants