Questions about modifiable local installation of scikit learn #233

Open
awyuan opened this issue Feb 27, 2024 · 4 comments
Comments

@awyuan

awyuan commented Feb 27, 2024

Hello,

I am trying to install a modifiable version of scikit-tree locally so that I can add some parameters to decision trees for a research project. I am having trouble installing scikit-tree locally in a way that lets me modify it and have the changes show up in other code that calls its functions. I'm unfamiliar with the process of modifying Python packages, so apologies if some of the questions are basic.

The first problem is importing scikit-tree internal modules such as _splitter or _criterion. When I follow the instructions in the guideline, everything runs fine, and I can see that I've installed the right version of scikit-tree. However, whenever I try `from sktree.tree import _splitter`, the module cannot be found (I've also tried sktree.tree._lib.sklearn.tree.splitter, but that isn't found either).

The second problem is actually maintaining the changes I make to the C files. Which command should I be running to compile the C code? It seems like whenever I run spin build, whatever changes I make locally are being overwritten.

Thanks for the help.

@adam2392
Collaborator

adam2392 commented Feb 27, 2024

The development guide discusses how to install: https://github.com/neurodata/scikit-tree/blob/main/DEVELOPING.md. If there is any confusion, feel free to post some details.

As for your other questions about importing _splitter and _criterion, I don't think that is supported, so I'm unsure what you're trying to do. Can you elaborate?

@awyuan
Author

awyuan commented Feb 27, 2024

Hi Adam, thanks for responding. I essentially want to add hyperparameters to DecisionTrees to support an application I am building. Some of the hyperparameters would live in the _splitter file and would add extra conditions to the splitting process. I found this repo because it abstracts these classes better than scikit-learn does natively, but I'm having trouble getting these three steps to work:

  1. Install scikit-learn fork locally
  2. Add hyperparameters to the decision tree classifier and the splitter
  3. Be able to see the changes in a downstream classification task when importing the new classifier.

I've followed the steps in both the README and DEVELOPING files, but while I can import sktree.tree, I can't import anything below that level without running into errors. I was hoping to get a better sense of whether I've done something fundamentally wrong while trying to get the steps above to work.

@adam2392
Collaborator

In order to modify the splitter, you need to use Cython. You should be able to import everything, and I can't replicate any error that you're describing unfortunately.
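One way to see what is actually importable under a package is to list its direct submodules. The sketch below is not from the thread: `list_submodules` is a hypothetical helper, and the standard-library `json` package is only a stand-in for the package being inspected. `pkgutil.iter_modules` reports pure-Python modules and compiled extension modules alike, so a Cython source that has not been compiled into an extension will not appear in the result.

```python
import importlib
import pkgutil


def list_submodules(package_name):
    """Return {submodule_name: is_package} for the direct children of a package.

    Hypothetical helper for illustration; it imports the package itself
    but none of its children.
    """
    package = importlib.import_module(package_name)
    # Only packages carry __path__; plain modules have no submodules to list.
    search_path = getattr(package, "__path__", None)
    if search_path is None:
        return {}
    return {info.name: info.ispkg for info in pkgutil.iter_modules(search_path)}


if __name__ == "__main__":
    # "json" is a stand-in; substitute the package you are debugging.
    for name, is_pkg in sorted(list_submodules("json").items()):
        print(name, "(package)" if is_pkg else "(module)")
```

If an expected private module is missing from this listing, that suggests the copy of the package Python found does not contain a built version of it.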

@awyuan
Author

awyuan commented Mar 25, 2024

The issue has been solved. We were trying to import from within the parent library, which was causing the problems. Installing it to a separate folder solved the issues we were having.
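A common cause of this kind of symptom with compiled packages is that Python picks up the source folder in the current working directory instead of the built, installed copy. Whether that is exactly what happened here is an assumption. The sketch below shows one way to check which location an import would resolve to; `package_origin` and `resolves_under` are hypothetical helper names, not part of scikit-tree.

```python
import importlib.util
import os


def package_origin(name):
    """Return the file a top-level import of `name` would load, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


def resolves_under(name, directory):
    """Return True if `name` would be imported from somewhere inside `directory`."""
    origin = package_origin(name)
    if origin is None or not os.path.isabs(origin):
        # Built-in, frozen, and namespace packages have no real file path.
        return False
    directory = os.path.realpath(directory)
    origin = os.path.realpath(origin)
    try:
        return os.path.commonpath([origin, directory]) == directory
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return False
```

For example, calling `resolves_under("sktree", os.getcwd())` from the directory where the script is run would return `True` if the local source folder is the one being imported. In that case, running from a different folder (as the author ended up doing) lets the installed build be found instead.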


2 participants