Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rolling window with select function not behaving correctly #140

Open
e-tang opened this issue Mar 2, 2022 · 4 comments
Open

Rolling window with select function not behaving correctly #140

e-tang opened this issue Mar 2, 2022 · 4 comments
Labels
version-2 Saving this for Data-Forge version 2.

Comments

@e-tang
Copy link

e-tang commented Mar 2, 2022

When using rollingwindow with the select function, the rolling windows are correct but select function calls the transformer more than it should.

Please see the following test code:

import { assert, expect } from 'chai';
import 'mocha';
import * as dataForge from './index';
import { DataFrame, Series, ISeries, Index } from './index';

var df = new DataFrame({ index: [10, 20, 30, 40, 50], values: [0, 1, 2, 3, 4] });
var counter = 0;
var windows = df
.rollingWindow(2);

var testSeries = windows.select((window) => {
    console.log("counter:" + counter);
    return [window.getIndex().last(), ++counter];
})   
.withIndex(pair => pair[0])
.inflate(pair => pair[1]);

The following are the outputs:

counter:0
counter:1
counter:2
counter:3
counter:4
counter:5
counter:6
counter:7
counter:8
__index__
---------
20       
30       
40       
50  

I think only 4 times should be the correct number.

@e-tang
Copy link
Author

e-tang commented Mar 2, 2022

Based on the testing, particularly the first window got called twice

@e-tang
Copy link
Author

e-tang commented Mar 2, 2022

Tried to use map function on Series directly it turned out the same, the function got called more than it should. So it should be the bug in the iterator.

@e-tang
Copy link
Author

e-tang commented Mar 2, 2022

OKay, figured out why this happened.

After going through the source code, find out whenever the series is baked, the internal iterators are called and trigger the selector function. During this process, the selector function can be call numerous times from getColumnNames(), toArray(), and toPairs().

@ashleydavis
Copy link
Member

I believe this is due to Data-Forge's laziness and you need to write the transformer function so that it can execute without side effects.

I do realise this is a problem and I'm considering removing the the laziness in Data-Forge v2.

I've labelled this issue and will return to it for v2.

@ashleydavis ashleydavis added the version-2 Saving this for Data-Forge version 2. label Mar 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
version-2 Saving this for Data-Forge version 2.
Projects
None yet
Development

No branches or pull requests

2 participants