Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH Add streaming functionality to Synergistic Forest #537

Merged
merged 26 commits into from
May 19, 2022

Conversation

nhahn7
Copy link

@nhahn7 nhahn7 commented Mar 10, 2022

Reference issue

#34 #501

Type of change

Feature Request

What does this implement/fix?

Add streaming functionality for forests with update_task function.

Additional information

Copy link
Member

@PSSF23 PSSF23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remember to remove the extra print statements~

Copy link
Member

@PSSF23 PSSF23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

black formatting is not complete.

@codecov
Copy link

codecov bot commented Mar 11, 2022

Codecov Report

Merging #537 (87fab73) into staging (af84f50) will decrease coverage by 13.40%.
The diff coverage is 8.75%.

@@             Coverage Diff              @@
##           staging     #537       +/-   ##
============================================
- Coverage    92.36%   78.95%   -13.41%     
============================================
  Files            7        7               
  Lines          419      499       +80     
============================================
+ Hits           387      394        +7     
- Misses          32      105       +73     
Impacted Files Coverage Δ
proglearn/progressive_learner.py 66.52% <8.06%> (-20.48%) ⬇️
proglearn/forest.py 74.19% <11.11%> (-25.81%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ca15f3b...87fab73. Read the comment docs.

@PSSF23 PSSF23 mentioned this pull request Mar 11, 2022
Copy link
Member

@PSSF23 PSSF23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can add unit tests to fix the codecov issue. Also, replicate the change in 9aaa570 to solve the website rendering.

Copy link
Member

@PSSF23 PSSF23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can remove the network.py changes if you don't plan to include the complete network update_task in this PR. I would recommend another PR for that later.

Copy link
Member

@PSSF23 PSSF23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nhahn7 you can include the XOR experiments in this PR. Or you can open another one for those.

@nhahn7
Copy link
Author

nhahn7 commented Mar 16, 2022

@PSSF23 Sounds good, I’ll work on adding unit tests and remove the network.py changes today. For the XOR experiments, the notebook currently is just used for plotting (not generating the results). Should I make it in to a tutorial of how to use update_task? Or leave it as is where it just plots from existing results in .txt files?

@PSSF23
Copy link
Member

PSSF23 commented Mar 16, 2022

Tutorial format is needed. You can focus on unit tests first.

nhahn7 and others added 3 commits March 16, 2022 19:49
Optimized sphinx version for website rendering and removed network.py changes
@nhahn7 nhahn7 changed the title ENH Add streaming functionality to ProgLearn ENH Add streaming functionality to Synergistic Forest Mar 17, 2022
@nhahn7
Copy link
Author

nhahn7 commented Mar 17, 2022

@PSSF23 It looks like its failing due to not recognizing the partial_fit function from sci-kit learn fork. Can this be resolved with changing requirements.txt to specify https://github.com/PSSF23/scikit-learn-stream instead of scikit-learn>=0.22.0?

Copy link
Member

@PSSF23 PSSF23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nhahn7 That would be too much of a dependency. Instead, we can state that the feature is experimental & modify the CircleCI checks.

Copy link
Member

@PSSF23 PSSF23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If classes is not currently defined as a variable, I would prefer to have the same name for all classes and input_classes.

Copy link
Member

@PSSF23 PSSF23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update your CircleCI cache from v2 to v3 to resolve the error.

@nhahn7 nhahn7 requested a review from PSSF23 May 17, 2022 23:04
Copy link
Member

@PSSF23 PSSF23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As currently there's no way to pass the partial_fit test, just comment out the test for now.

The notebooks and functions should be in experiments. And you did not add the notebook to the rst list.

@PSSF23 PSSF23 changed the base branch from staging to stream May 19, 2022 13:31
Copy link
Member

@PSSF23 PSSF23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. We are not merging it to staging as the dependency on SDTF would decrease the code coverage. Hope that this will be resolved in the future.

@PSSF23 PSSF23 merged commit d44182c into neurodata:stream May 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants