Avoid ordered_gradients? #84

maartenbreddels · 2018-12-18T20:04:09Z

Again, I'm still digesting the code, but maybe you thought about this. Can ordered_gradients be avoided? Can we reorder in place in the gradient array, the tree grower can put it back in place at the end of .grow(). I see no place where the original order of the gradients matters except in gradient_boosting.py.

NicolasHug · 2018-12-18T20:31:53Z

Yes, it could be avoided. It's only used to increase the cache hit rate.

Can we reorder in place in the gradient array, the tree grower can put it back in place at the end of .grow().

Can you elaborate? I'm not sure I see how that would work out

One solution is to just never order the gradients and always use all_gradients.
We could just use the same code as in _build_histogram_root and pass sample_indices.
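
As a rough illustration of that option, here is a minimal NumPy sketch (not the project's actual code) of a gradient histogram built directly from all_gradients through sample_indices. The function name build_histogram_indirect, the binned_feature array, and n_bins are assumptions made for the example.

```python
import numpy as np

def build_histogram_indirect(n_bins, sample_indices, binned_feature, all_gradients):
    """Sum gradients per bin for the samples listed in sample_indices.

    Hypothetical helper: it reads all_gradients through sample_indices
    instead of relying on a pre-ordered copy of the gradients.
    """
    bins = binned_feature[sample_indices]
    return np.bincount(bins, weights=all_gradients[sample_indices], minlength=n_bins)

rng = np.random.default_rng(0)
n_samples, n_bins = 1000, 16
binned_feature = rng.integers(0, n_bins, size=n_samples).astype(np.uint8)
all_gradients = rng.normal(size=n_samples).astype(np.float32)
# Indices of the samples that belong to some node (made up for the example)
sample_indices = np.flatnonzero(rng.random(n_samples) < 0.3)

hist_indirect = build_histogram_indirect(
    n_bins, sample_indices, binned_feature, all_gradients
)

# Same histogram computed from an ordered copy of the gradients
ordered_gradients = all_gradients[sample_indices]
hist_ordered = np.bincount(
    binned_feature[sample_indices], weights=ordered_gradients, minlength=n_bins
)
```

Both versions give the same histogram. The difference is only in the memory access pattern: the indirect version reads all_gradients at scattered positions, which is the cache effect the ordered copy is meant to avoid.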

maartenbreddels · 2018-12-18T20:45:28Z

What I mean is that gradients would be reordered in place during the call to treegrower.grow, instead of keeping both gradients and ordered_gradients. It would give the same performance but require less memory. Putting it back in the original order adds some overhead, but that should be really small.

NicolasHug · 2018-12-18T21:06:25Z

But how precisely would you reorder the gradients array in place, and then put it back in its original order without any additional data structure?

maartenbreddels · 2018-12-18T21:17:20Z

Eventually, you'll have gradients = gradients[partition], right? You need to do the inverse of that. I'm not sure what the most elegant numpy way would be, but this would work:

import numpy as np
gradient_original = np.zeros_like(gradients)
gradient_original[partition] = gradients  # scatter: the inverse of gradients[partition]

But it could be solved efficiently and in place using numba, I guess.
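
One way this might look, as a sketch only: undo the permutation by following its cycles, so that no second gradient array is allocated. This assumes partition is a full permutation of range(n), and the name restore_order_inplace is made up for the example. It is written in plain Python with NumPy and is not tuned for speed. Note that it still needs one boolean per sample to remember which positions are done.

```python
import numpy as np

def restore_order_inplace(values, partition):
    """Undo values = original[partition] in place.

    Hypothetical helper. After the call, values[partition[i]] holds what
    values[i] held before the call. Assumes partition is a permutation
    of range(len(values)).
    """
    n = values.shape[0]
    visited = np.zeros(n, dtype=np.bool_)  # one flag per sample
    for start in range(n):
        if visited[start]:
            continue
        # Walk the cycle that contains `start`, carrying one value along
        carried = values[start]
        pos = start
        while True:
            target = partition[pos]
            # Drop the carried value at its destination, pick up what was there
            carried, values[target] = values[target], carried
            visited[target] = True
            pos = target
            if pos == start:
                break

rng = np.random.default_rng(1)
original = rng.normal(size=50)
partition = rng.permutation(50)

reordered = original[partition].copy()   # stands in for the in-place reordering
restore_order_inplace(reordered, partition)
```

The boolean mask is smaller than a full copy of the gradients, but it is still an extra data structure, so this only partly addresses the memory concern.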

NicolasHug · 2018-12-18T21:40:26Z

But you are creating this new gradient_original array right? I thought the point was precisely not to allocate any new array.

you'll have gradients = gradients[partition] right?

Maybe that's what you meant but more correctly, you have
ordered_gradients == gradients[samples_indices] where samples_indices is a view on a specific region in partition.

The ordering of the gradients array created in fit() and passed down to SplittingContext never changes.

maartenbreddels · 2018-12-18T21:48:02Z

(from mobile phone)

On Tue, 18 Dec 2018, 22:40 Nicolas Hug wrote:

But you are creating this new gradient_original array right? I thought the point was precisely not to allocate any new array.

Yes, it was more for explaining the idea, but it should all be done in place.

you'll have gradients = gradients[partition] right? Maybe that's what you meant but more correctly, you have ordered_gradients == gradients[samples_indices] where samples_indices is a view on a specific region in partition.

Yes, OK, then I understand.

The ordering of the gradients array created in fit() and passed down to SplittingContext never changes.

Exactly, and I propose that it does change order, so we don't need ordered_gradients. The only thing it requires is to change the order back.


NicolasHug · 2018-12-18T22:08:23Z

To be more exact, since ordered_gradients is of size n_samples and samples_indices.shape[0] <= n_samples:

ordered_gradients[:samples_indices.shape[0]] == gradients[samples_indices]

I understand what you want to do, I just do not understand how you want to do it!
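
For readers following along, the relation ordered_gradients[:samples_indices.shape[0]] == gradients[samples_indices] can be checked on toy data. This is a small NumPy sketch under assumed sizes and dtypes; the slice bounds and the random partition are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 10
gradients = rng.normal(size=n_samples).astype(np.float32)

# partition holds every sample index once (here: a random permutation)
partition = rng.permutation(n_samples).astype(np.uint32)

# A node owns a contiguous region of partition; slicing gives a view, not a copy
samples_indices = partition[2:7]

# Buffer of size n_samples; only the first entries are meaningful for this node
ordered_gradients = np.empty(n_samples, dtype=np.float32)
n_node_samples = samples_indices.shape[0]
ordered_gradients[:n_node_samples] = gradients[samples_indices]
```

The gradients array itself is never touched here, which matches the statement that its ordering does not change.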

maartenbreddels mentioned this issue Dec 19, 2018

Failed attempt at ordered gradients/hessians #87

Closed
