-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Running time exponential in number of operators on stream #127
Comments
Why not just You are calculating There is also moving window and moving sum (plus other simple stats) if you need this and not just lag. #108 |
It is a rather contrived example. We just want to see how speed scales with applying operators repeatedly on a sequence. Effectively building a large but simple graph of operations that resembles, in number of operations, a more complex real-world strategy. |
Even chained It is natural that the system takes more time to performs more operations. I would be surprise if performance drop is really exponential, could you share your numbers please? Or maybe the simple case |
Actually lagged cursor is quite complex now and keeps track of current and lagged position, instead of just an offset by N, so it require two cursor moves. Nesting such lags will lead to exponential time growth by current construction. It should be simple to create a Lag projection that just uses a single cursor with offset by N, I believe I had such implementation but refactored and optimized most of the cursor code logic and reimplemented projections in generic way at first. There are many optimization opportunities in terms of cursor moves design. I was focusing on allocations and method inlining first. |
It seems to no just be lagged, getting the same exponential slowdown when implementing lag in terms of scan thus:
Again, I know this is a really silly way of doing this, but just want to see how things scale with graph size. We have currently compared this against building on top of Reactive.jl, which seems to behave non-exponentially. |
Do you have absolute numbers for your benchmarks? |
I really do not understand from your code what are you trying to achieve? How comes that Scan replaces Lag? Also, since you are using the built-in F# Option type then performance is probably your last concern... Option is reference type, if you using it for numerous data points its overhead and allocations will be more visible than Spreads' ones. |
Times are: As mentioned above we are "stupidly" implementing simple strategies that have the same graph size as eventual more complex strategies to see how Spreads performs and scale. Initially we wanted to test using the |
Window is by far the most useful operator, have you tried it? I optimized
it carefully first. Also Spreads is not only Rx, it has unique Zip feature
I haven't seen in other places yet. Zip is the most useful thing for real
irregular data with different time frames.
…On Jan 24, 2018 9:53 PM, "TomNicholson" ***@***.***> wrote:
Times are:
lag_size,time_in_seconds
1,0.065s
2,0.064s
3,0.064s
4,0.064s
5,0.064s
6,0.065s
7,0.066s
8,0.066s
9,0.070s
10,0.077s
11,0.087s
12,0.124s
13,0.193s
14,0.377s
15,0.752s
16,1.706s
17,3.488s
18,7.744s
19,17.289s
20,36.141s
21,90.253s
As mentioned above we are "stupidly" implementing simple strategies that
have the same graph size as eventual more complex strategies to see how
Spreads performs and scale.s Initially we wanted to test the Lag operator
of Series, but encountering the exponential nature of it, we decided to
replace it with a Scan based version just for benchmarking.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#127 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAd5fK92LPIlWxmmFUHbmEyJcrNqdD7oks5tN3wngaJpZM4RqK1d>
.
|
I am currently evaluating Spreads (and therefore implicitly F#) for potential use in our core business logic.
In trying to see how runtime scales with the number of operations on a stream, I am simple applying the lag operator to a series. My code is roughly as follows:
For different values of
num_lag
I notice an exponential scaling of run time.Am I doing something stupid? Or is this an inherent problem with Spreads?
Thanks
The text was updated successfully, but these errors were encountered: