Decision Trees

🌲 Decision Trees in scikit-learn with categorical and numerical data.

This work has two components:

  • The required component is an implementation for scikit-learn of a way to handle categorical data (and also mixed data, i.e. both categorical and numerical features); this implementation is hosted at this fork. Error rate, information gain and Gini index are provided with each resulting tree. A sketch of the stock scikit-learn workaround appears after this list.

  • The solution is then applied to the provided data sets.

  • To visualise the results, each tree is exported as a graphical output (a Graphviz dot file).
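
The fork's exact API is not shown here; as a point of reference, the following is a minimal sketch of how mixed categorical/numerical data can be handled with stock scikit-learn, which requires encoding the categorical columns first. The file name and column names are hypothetical.

# A minimal sketch (not the fork's API): training a decision tree on mixed
# categorical/numerical data with stock scikit-learn, where categorical
# columns must first be encoded. File and column names are hypothetical.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeClassifier

df = pd.read_csv("some_dataset.csv")                  # hypothetical data set
X, y = df.drop(columns=["label"]), df["label"]        # "label" is assumed

categorical = X.select_dtypes(include="object").columns
numerical = X.select_dtypes(exclude="object").columns

preprocess = ColumnTransformer([
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
    ("num", "passthrough", numerical),
])

# criterion="gini" uses the Gini index; criterion="entropy" uses information gain
model = Pipeline([
    ("prep", preprocess),
    ("tree", DecisionTreeClassifier(criterion="gini", random_state=0)),
])
model.fit(X, y)
print("Training error rate:", 1 - model.score(X, y))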

The numeric part of the user's FH Technikum matriculation number is used as the random seed for splitting the data into training and test sets.
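
For example, a sketch of that seed convention, with a hypothetical matriculation number and placeholder data:

# Sketch of the seed convention above; the matriculation number and the data
# are placeholders.
import numpy as np
from sklearn.model_selection import train_test_split

matriculation = "if19b123"                         # hypothetical value
seed = int("".join(ch for ch in matriculation if ch.isdigit()))

X = np.arange(20).reshape(10, 2)                   # placeholder features
y = np.array([0, 1] * 5)                           # placeholder labels
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=seed)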

Installation

On a macOS environment, at least, the following settings point Clang at Homebrew's libomp (for OpenMP support) before installing:

export CC=/usr/bin/clang        
export CXX=/usr/bin/clang++
export CPPFLAGS="$CPPFLAGS -Xpreprocessor -fopenmp"
export CFLAGS="$CFLAGS -I/usr/local/opt/libomp/include"
export CXXFLAGS="$CXXFLAGS -I/usr/local/opt/libomp/include"
export LDFLAGS="$LDFLAGS -L/usr/local/opt/libomp/lib -lomp"
export DYLD_LIBRARY_PATH=/usr/local/opt/libomp/lib
$  pip3 install -r requirements.txt

To render the generated dot files, install Graphviz (a short export-and-render sketch follows the install commands):

Linux:

$  sudo apt-get install graphviz

macOS: https://brewinstall.org/install-graphviz-on-mac-with-brew/
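
As an illustration of what the dot files are for, a minimal sketch that exports a fitted tree to dot format and renders it with the dot command (the iris data set and file names stand in for the project's actual data and output):

# Sketch only: export a fitted tree to Graphviz dot format and render it to PNG.
# The iris data set and file names are placeholders, not the project's output.
import subprocess
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_graphviz

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X, y)

export_graphviz(tree, out_file="tree.dot", filled=True, impurity=True)
subprocess.run(["dot", "-Tpng", "tree.dot", "-o", "tree.png"], check=True)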

Running

$  python3 Exercise2.py
