(General inquiry) library differentiators #90

Open
LukeWood opened this issue Apr 26, 2024 · 2 comments

Comments

@LukeWood

Hello! Great work on yggdrasil - love the reference in the name.

I’ve been playing around with the library a bit, and I was curious what its biggest feature differentiators are compared with XGBoost, LightGBM, and others!

Would you say it’s the ability to leverage the TensorFlow ecosystem? Is there anything in particular that you’ve found you “just get for free” by leveraging TF (maybe hardware acceleration is easy)?

Just generally curious, and hoping to learn more! Everything looks very cool, and I love the idea of making decision forests in TF more streamlined.

cheers!

@rlcauvin

One powerful use of the library is combining decision forest models with other Keras models. This documentation describes how you can "stack" a decision forest model on top of a pre-trained neural network model or combine several models (including a decision forest model) into an "ensemble" that averages their predictions.
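As a rough illustration of the averaging idea, here is a minimal pure-Python sketch. The two model functions are hypothetical stand-ins invented for this example; they are not YDF or Keras APIs. In a real pipeline they would be the prediction functions of a trained decision forest and a trained neural network.

```python
from typing import Callable, Sequence

# A "model" here is any callable mapping a feature vector to a probability.
Model = Callable[[Sequence[float]], float]


def average_ensemble(models: Sequence[Model]) -> Model:
    """Returns a model whose prediction is the mean of the given models' predictions."""
    if not models:
        raise ValueError("At least one model is required.")

    def predict(features: Sequence[float]) -> float:
        return sum(model(features) for model in models) / len(models)

    return predict


# Hypothetical stand-in for a decision forest: a single thresholded rule.
def forest_like(features: Sequence[float]) -> float:
    return 0.9 if features[0] > 0.5 else 0.1


# Hypothetical stand-in for a neural network: a clipped linear score.
def network_like(features: Sequence[float]) -> float:
    score = 0.5 * features[0] + 0.5 * features[1]
    return min(1.0, max(0.0, score))


ensemble = average_ensemble([forest_like, network_like])
print(ensemble([0.8, 0.4]))
```

Stacking differs from averaging in that one model consumes another model's output as an input feature, rather than the outputs being combined after the fact.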

@achoum
Collaborator

achoum commented May 3, 2024

Hi Luke,

Tl;dr:

  • Ease of use and minimizing the risk of errors.
  • Integration with other machine learning frameworks.
  • Unique functionalities.

As @rlcauvin mentioned, one of YDF's strengths is its tight integration with the TensorFlow (TF) ecosystem. YDF models can run in TF Serving or be imported into TensorFlow.js, which makes them easy to adopt if you already have a TensorFlow pipeline.

Moreover, YDF is composable and aims to work well with other ML tools. For example, YDF models can be combined with other models in TensorFlow, Keras 2, and Keras 3 (with the TF backend). Work is also underway to integrate with JAX and some other surfaces.

This modularity notably allows for the creation of hybrid neural network + decision forest models that can sometimes perform better than non-hybrid ones. For instance, the excellent sample efficiency of decision forests makes them suitable for merging signals from multiple models in complex pipelines. Fine-tuning decision forests alongside neural networks is another advanced technique being explored.

Regarding its unique features, YDF includes exact distributed training, oblique splits, example distance, and support for uplift modeling.
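To make "oblique splits" concrete, the sketch below contrasts a classic axis-aligned condition (one feature against a threshold) with an oblique condition (a weighted sum of several features against a threshold). The function names, weights, and points are assumptions chosen for illustration; this shows only the shape of the split condition, not how YDF learns the weights.

```python
from typing import Sequence


def axis_aligned_split(x: Sequence[float], feature: int, threshold: float) -> bool:
    """Classic tree condition: compares a single feature to a threshold."""
    return x[feature] >= threshold


def oblique_split(x: Sequence[float], weights: Sequence[float], threshold: float) -> bool:
    """Oblique condition: compares a linear combination of features to a threshold."""
    projection = sum(w * v for w, v in zip(weights, x))
    return projection >= threshold


# Toy points whose true class is "x0 + x1 >= 1" (a diagonal boundary).
points = [(0.9, 0.3), (0.2, 0.3), (0.9, 0.0), (0.3, 0.8)]
truth = [x0 + x1 >= 1.0 for x0, x1 in points]

# A single oblique split recovers the diagonal boundary exactly.
oblique = [oblique_split(p, weights=(1.0, 1.0), threshold=1.0) for p in points]

# A single axis-aligned split on x0 misclassifies some of the same points.
aligned = [axis_aligned_split(p, feature=0, threshold=0.5) for p in points]

print("truth:  ", truth)
print("oblique:", oblique)
print("aligned:", aligned)
```

An axis-aligned tree would need several nested splits to approximate the same diagonal boundary, which is why oblique splits can help on data with correlated numerical features.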

Finally, YDF simplifies development and productionization. Model evaluation and model understanding, two critical steps in decision forest productionization, are particularly easy with YDF. For example, calling "model.evaluate(dataset)" in Colab creates an interactive view with all the relevant metrics. Other methods such as "model.analyze", "model.benchmark", "model.to_cpp", and "model.describe" further simplify the developer's life. Other features that aim to simplify users' work and reduce the likelihood of mistakes are described in the YDF paper.
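A minimal sketch of how the methods named above might be used together, assuming the `ydf` Python package and NumPy are installed. The synthetic dataset, its column names, and the labelling rule are hypothetical and exist only to make the example self-contained; default hyperparameters are used throughout.

```python
import random


def make_toy_dataset(n, seed=0):
    """Builds a synthetic binary classification dataset as plain Python lists."""
    rng = random.Random(seed)
    f1 = [rng.random() for _ in range(n)]
    f2 = [rng.random() for _ in range(n)]
    # Hypothetical labelling rule, chosen only for illustration.
    label = [int(a + b > 1.0) for a, b in zip(f1, f2)]
    return {"f1": f1, "f2": f2, "label": label}


def train_and_inspect(train, test):
    """Trains a model with YDF and calls the inspection methods mentioned above."""
    # Imported lazily so the data helper above works without YDF installed.
    import numpy as np
    import ydf

    train_ds = {name: np.array(values) for name, values in train.items()}
    test_ds = {name: np.array(values) for name, values in test.items()}

    model = ydf.GradientBoostedTreesLearner(label="label").train(train_ds)

    evaluation = model.evaluate(test_ds)  # Metrics; shown as a rich view in notebooks.
    analysis = model.analyze(test_ds)     # Model understanding report.
    benchmark = model.benchmark(test_ds)  # Inference speed measurement.
    cpp_source = model.to_cpp()           # C++ serving code, returned as a string.
    description = model.describe()        # Summary of the model.
    return evaluation, analysis, benchmark, cpp_source, description
```

In a notebook, something like `train_and_inspect(make_toy_dataset(800), make_toy_dataset(200, seed=1))` would run the whole flow; displaying each returned object in its own cell shows the corresponding report.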
