
sum_gradient and sum_hessians computation in find_node_split_subtraction #90

Open
NicolasHug opened this issue Jan 16, 2019 · 4 comments
Labels
perf Computational performance issue or improvement

Comments

@NicolasHug
Collaborator

NicolasHug commented Jan 16, 2019

Instead of summing over all the bins of an arbitrary feature histogram to compute context.sum_gradients and context.sum_hessians, we can directly pass those values to find_node_split_subtraction, since the parent's split_info already contains gradient_left/right and hessian_left/right.

We only sum over at most 255 bins, so the performance gain is minimal, but the change would be both clearer and more accurate.
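The equivalence this relies on can be illustrated with a small NumPy sketch (the array names below are illustrative, not pygbm's actual data structures): a node's histogram is the bin-wise sum of its children's histograms, so summing (parent - sibling) over all bins recovers exactly the total that was already stored in the parent's split_info when the parent was split.

```python
import numpy as np

rng = np.random.default_rng(0)
n_bins = 255

# Per-bin gradient sums for the two children of a node (illustrative data).
grad_left = rng.normal(size=n_bins).astype(np.float32)
grad_right = rng.normal(size=n_bins).astype(np.float32)

# The parent's histogram is the bin-wise sum of its children's histograms.
grad_parent = grad_left + grad_right

# Current approach: recover the left child's total gradient by summing
# the subtracted histogram (parent - sibling) over all bins.
summed = grad_parent.sum() - grad_right.sum()

# Proposed approach: this total was already computed and stored as
# gradient_left in the parent's split_info.
stored = grad_left.sum()

# Identical up to float32 rounding; passing the stored value avoids the
# extra accumulation error from summing over the bins.
assert np.isclose(summed, stored, rtol=1e-4)
```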

@NicolasHug NicolasHug added the perf Computational performance issue or improvement label Jan 16, 2019
@ogrisel
Owner

ogrisel commented Jan 28, 2019

I am not sure I understand. Feel free to submit a PR with the change to be able to directly see the diff.

@NicolasHug
Collaborator Author

I'm proposing what I have done in scikit-learn/scikit-learn#12807

grower._compute_spittability() will look like this:

    if node.hist_subtraction:
        if node is node.parent.right_child:
            sum_gradients = node.parent.split_info.gradient_right
            sum_hessians = node.parent.split_info.hessian_right
        else:
            sum_gradients = node.parent.split_info.gradient_left
            sum_hessians = node.parent.split_info.hessian_left
        split_info = self.splitter.find_node_split_subtraction(
            node.sample_indices,
            sum_gradients, sum_hessians, node.parent.histograms,
            node.sibling.histograms, histograms)
    else:
        split_info = self.splitter.find_node_split(
            node.sample_indices, histograms)

and we can remove these lines from find_node_split_subtraction:

    # We can pick any feature (here the first) in the histograms to
    # compute the gradients: they must be the same across all features
    # anyway, we have tests ensuring this. Maybe a more robust way would
    # be to compute an average but it's probably not worth it.
    context.sum_gradients = (parent_histograms[0]['sum_gradients'].sum() -
                             sibling_histograms[0]['sum_gradients'].sum())

    n_samples = sample_indices.shape[0]
    if context.constant_hessian:
        context.sum_hessians = \
            context.constant_hessian_value * float32(n_samples)
    else:
        context.sum_hessians = (parent_histograms[0]['sum_hessians'].sum() -
                                sibling_histograms[0]['sum_hessians'].sum())
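As a side note on the constant-hessian branch being removed above: when the loss has a constant hessian (e.g. least squares), the node's hessian total is just the constant times the node's sample count, so no histogram is consulted at all. A minimal sketch of that arithmetic (variable names mirror the snippet above but the values are illustrative):

```python
import numpy as np

# Illustrative values: a node covering 1000 samples, with a per-sample
# hessian that is constant (as for the least-squares loss).
constant_hessian_value = np.float32(1.0)
sample_indices = np.arange(1000, dtype=np.uint32)

# The hessian total for the node is simply constant * n_samples,
# computed in float32 to match the histogram dtype.
n_samples = sample_indices.shape[0]
sum_hessians = constant_hessian_value * np.float32(n_samples)
```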

@ogrisel
Owner

ogrisel commented Feb 18, 2019

Ok +1.

@NicolasHug
Collaborator Author

Actually, as we just discussed, we don't need to recompute sum_gradients and sum_hessians at all, regardless of which histogram computation method is used.
