Plot very large decision tree #140

Open
Arnold1 opened this issue Oct 21, 2022 · 9 comments

@Arnold1

Arnold1 commented Oct 21, 2022

Hi,

I have a decision tree with 20k nodes. How can I plot it?

I checked the d3.js code, but with SVG it's pretty slow to render 20k nodes and zoom around in them.

Is there a way to generate a Graphviz graph as well and convert it to a huge PNG so I can view it with https://leafletjs.com/?
Or is there a way to draw the decision tree with d3 and canvas instead of SVG?

@achoum
Collaborator

achoum commented Oct 22, 2022

Hi,

There is currently no integrated Graphviz export. However, this should be easy to put in place. The model inspector gives you access to the tree structures. The inspector (and related data structures) is used by the tree plotter and the tree printer (which prints the tree as text). What about calling the model inspector manually and populating a Graphviz graph accordingly? For example:

inspector = model.make_inspector()
# add_tree_to_graphviz_plot is a placeholder for your own export function.
for tree in inspector.extract_all_trees():
  add_tree_to_graphviz_plot(tree)

If you get something polished, don't hesitate to add it to TF-DF contribs.
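
As an illustration of what such an export function could look like, here is a minimal sketch that turns one tree into Graphviz DOT text. It is not part of TF-DF: `tree_to_dot` and its `max_depth` argument are hypothetical names, and the code assumes the nodes follow the `tfdf.py_tree` layout (non-leaf nodes expose `condition`, `pos_child` and `neg_child`; leaves expose `value`). The traversal is iterative, so very large trees do not hit Python's recursion limit.

```python
def tree_to_dot(root, max_depth=None):
    """Return Graphviz DOT text for a binary decision tree.

    Assumption: non-leaf nodes expose `condition`, `pos_child` and
    `neg_child`; leaf nodes expose `value` (as in tfdf.py_tree).
    """
    lines = ["digraph tree {", "  node [shape=box];"]
    stack = [(root, 0, 0)]  # (node, node id, depth)
    next_id = 1
    while stack:
        node, node_id, depth = stack.pop()
        is_leaf = not hasattr(node, "pos_child")
        # Subtrees below max_depth are collapsed into a "..." placeholder.
        truncated = max_depth is not None and depth >= max_depth
        if is_leaf:
            label = str(node.value)
        elif truncated:
            label = "..."
        else:
            label = str(node.condition)
        # Escape characters that would break a quoted DOT label.
        label = (label.replace("\\", "\\\\")
                      .replace('"', '\\"')
                      .replace("\n", "\\n"))
        lines.append(f'  n{node_id} [label="{label}"];')
        if is_leaf or truncated:
            continue
        for child, edge in ((node.pos_child, "yes"), (node.neg_child, "no")):
            lines.append(f'  n{node_id} -> n{next_id} [label="{edge}"];')
            stack.append((child, next_id, depth + 1))
            next_id += 1
    lines.append("}")
    return "\n".join(lines)
```

With a trained model, this could be called as `tree_to_dot(inspector.extract_tree(tree_idx=0).root)` and the result written to a `.dot` file for the Graphviz `dot` command to render. Whether Graphviz lays out 20k nodes in reasonable time is untested here; `max_depth` is one way to keep the graph small.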

@Arnold1
Author

Arnold1 commented Oct 22, 2022

Is it possible to display 20k nodes with Graphviz and add some zooming functionality in an HTML environment?

@rstz
Collaborator

rstz commented Oct 25, 2022

The great Dtreeviz decision tree plotting library very recently got support for TF-DF. They have an iPython Notebook demonstrating how to use it. Let us know if this works for you.

Pinging @tlapusan who has been working on this.

@Arnold1
Author

Arnold1 commented Oct 26, 2022

Hi @tlapusan, does Dtreeviz also work for tfdf.keras.RandomForestModel(task=tfdf.keras.Task.REGRESSION) with 25k nodes?

@tlapusan

Hi @Arnold1, it will definitely be a challenge :) I assume that for your big tree you also have a big training set.
One possible solution would be to use the parameter 'depth_range_to_display' to choose which tree levels you want to display, e.g. depth_range_to_display = (0, 10).

I'm just curious what insights you would like to get from such a big tree? IMO it is not very effective to look at a tree structure with so many nodes.

@achoum
Collaborator

achoum commented Oct 26, 2022

@tlapusan has a good point. It would be interesting to know more about the use case.

In the meantime, you could try some generic graph visualization software (e.g., Gephi).
Looking at the raw trees might be of limited interest, though (individual Random Forest trees are overfitted, and GBT trees cannot be understood individually). It is likely more interesting to look at some projections of those trees (some basic examples: feature interactions, proximity plots, cross-tree agreement, etc.).
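
To make the idea of a projection concrete, here is a rough sketch of one very simple feature-interaction summary: counting how often two features are tested on the same root-to-leaf path across all trees. This is an illustrative heuristic, not a TF-DF feature. `count_feature_pairs` and `features_of` are hypothetical names, and the node layout (`condition`, `pos_child`, `neg_child`) is assumed to match `tfdf.py_tree`.

```python
from collections import Counter


def count_feature_pairs(roots, features_of):
    """Count feature co-occurrences along root-to-leaf paths.

    For every split that introduces a feature not yet seen on its path,
    one co-occurrence is counted with each feature tested by an ancestor.

    Assumption: non-leaf nodes expose `condition`, `pos_child` and
    `neg_child`. `features_of(condition)` is supplied by the caller and
    returns the feature names a condition uses.
    """
    pairs = Counter()
    for root in roots:
        stack = [(root, frozenset())]  # (node, features seen above it)
        while stack:
            node, seen = stack.pop()
            if not hasattr(node, "pos_child"):
                continue  # leaf: nothing to record
            used = set(features_of(node.condition))
            for name in used - seen:
                pairs.update(tuple(sorted((name, other))) for other in seen)
            seen = seen | used
            stack.append((node.pos_child, seen))
            stack.append((node.neg_child, seen))
    return pairs
```

With TF-DF, the roots could come from `(t.root for t in inspector.extract_all_trees())`, and `features_of` could be something like `lambda c: [f.name for f in c.features()]`, assuming the condition objects expose `features()` as in the `py_tree` module. The most frequent pairs from `pairs.most_common(20)` are then far easier to inspect than 20k raw nodes.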

@Arnold1
Author

Arnold1 commented Feb 15, 2023

Hi team, I see an update here:
https://github.com/google/yggdrasil-decision-forests/releases/tag/1.3.0

Improve the display of decision tree structures.

How can I use that from the Python side with this repo?

@rstz
Collaborator

rstz commented Feb 16, 2023

Hi,
This refers to a change in how decision trees are displayed as ASCII text (see here), so it is probably not relevant to you.

@Arnold1
Author

Arnold1 commented Mar 5, 2023

Hi, is there currently a way to generate a layered violin plot with TensorFlow Decision Forests?
Example:
https://shap.readthedocs.io/en/latest/example_notebooks/tabular_examples/tree_based_models/Scatter%20Density%20vs.%20Violin%20Plot%20Comparison.html#Layered-violin-plot
