Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SHOW PR] Submodulev2 #44

Open
wants to merge 64 commits into
base: main
Choose a base branch
from
Open

[SHOW PR] Submodulev2 #44

wants to merge 64 commits into from

Conversation

adam2392
Copy link
Collaborator

@adam2392 adam2392 commented Jun 8, 2023

Reference Issues/PRs

Supersedes: #42

Incorporates the latest changes in scikit-learn upstream main, where missing-value support for trees was added.

What does this implement/fix? Explain your changes.

Any other comments?

adam2392 and others added 22 commits March 28, 2023 19:58
…ecessary refactorings to enable downstream functionality (#32)

#### Reference Issues/PRs
This is the most up-to-date PR branch to consolidate all proposed
refactor changes that work with:

- unsupervised trees
- oblique trees
- no performance/runtime regressions against main

#### What does this implement/fix? Explain your changes.
Incorporates refactors to:

Internal Cython of scikit-learn's:
- criterion
- splitter
- tree

Internals of Python in scikit-learns:
- python Tree

Adds the basic implementation of oblique trees. The implementation of
oblique trees has been tested on all sklearn's `check_estimator` testing
function and has error-checking bounds for the new hyperparameter
introduced, which is `feature_combinations` that defaults to ``min(1.5,
n_features)``.

TODO:
1. [ ] ~Add honest support for trees (splitting the data at the Python
API level)~
2. [x] Build wheels
3. [ ] ~Brainstorm unit-tests, or weekly checks to determine when our
fork is out-of-date compared to upstream sklearn~
4. [x] Revamp README for the fork

#### Any other comments?

[cd build]

---------

Signed-off-by: Adam Li <adam2392@gmail.com>
Co-authored-by: Chester Huynh <chester.huynh924@gmail.com>
Co-authored-by: Parth Vora <pvora4@jhu.edu>
#### Reference Issues/PRs
Fixes README and wheel building


---------

Signed-off-by: Adam Li <adam2392@gmail.com>
<!--
Thanks for contributing a pull request! Please ensure you have taken a
look at
the contribution guidelines:
https://github.com/scikit-learn/scikit-learn/blob/main/CONTRIBUTING.md
-->

#### Reference Issues/PRs
<!--
Example: Fixes scikit-learn#1234. See also scikit-learn#3456.
Please use keywords (e.g., Fixes) to create link to the issues or pull
requests
you resolved, so that they will automatically be closed when your pull
request
is merged. See
https://github.com/blog/1506-closing-issues-via-pull-requests
-->


#### What does this implement/fix? Explain your changes.


#### Any other comments?


<!--
Please be aware that we are a loose team of volunteers so patience is
necessary; assistance handling other issues is very welcome. We value
all user contributions, no matter how minor they are. If we are slow to
review, either the pull request needs some benchmarking, tinkering,
convincing, etc. or more likely the reviewers are simply busy. In either
case, we ask for your understanding during the review process.
For more information, see our FAQ on this topic:

http://scikit-learn.org/dev/faq.html#why-is-my-pull-request-not-getting-any-attention.

Thanks for contributing!
-->

---------

Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
@adam2392 adam2392 changed the title Submodulev2 [SHOW PR] Submodulev2 Jun 8, 2023
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
@github-actions
Copy link

github-actions bot commented Jun 23, 2023

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 6ec023b. Link to the linter CI: here

Signed-off-by: Adam Li <adam2392@gmail.com>
@adam2392
Copy link
Collaborator Author

1994f15 introduces a regression where the missing values are not going to the node with the most nodes.

Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
<!--
Thanks for contributing a pull request! Please ensure you have taken a
look at
the contribution guidelines:
https://github.com/scikit-learn/scikit-learn/blob/main/CONTRIBUTING.md
-->

#### Reference Issues/PRs
<!--
Example: Fixes scikit-learn#1234. See also scikit-learn#3456.
Please use keywords (e.g., Fixes) to create link to the issues or pull
requests
you resolved, so that they will automatically be closed when your pull
request
is merged. See
https://github.com/blog/1506-closing-issues-via-pull-requests
-->


#### What does this implement/fix? Explain your changes.


#### Any other comments?


<!--
Please be aware that we are a loose team of volunteers so patience is
necessary; assistance handling other issues is very welcome. We value
all user contributions, no matter how minor they are. If we are slow to
review, either the pull request needs some benchmarking, tinkering,
convincing, etc. or more likely the reviewers are simply busy. In either
case, we ask for your understanding during the review process.
For more information, see our FAQ on this topic:

http://scikit-learn.org/dev/faq.html#why-is-my-pull-request-not-getting-any-attention.

Thanks for contributing!
-->

Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
@adam2392
Copy link
Collaborator Author

As of: 9a614f4

the categorical is a bit more complicated because the stuff in sklearn/tree/_splitter.pyx was not completed.

Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Merging latest changes from sklearn main

#### What does this implement/fix? Explain your changes.


#### Any other comments?


<!--
Please be aware that we are a loose team of volunteers so patience is
necessary; assistance handling other issues is very welcome. We value
all user contributions, no matter how minor they are. If we are slow to
review, either the pull request needs some benchmarking, tinkering,
convincing, etc. or more likely the reviewers are simply busy. In either
case, we ask for your understanding during the review process.
For more information, see our FAQ on this topic:

http://scikit-learn.org/dev/faq.html#why-is-my-pull-request-not-getting-any-attention.

Thanks for contributing!
-->

---------

Signed-off-by: Adam Li <adam2392@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant