James Bergstra edited this page Mar 14, 2013 · 12 revisions

Welcome to the skdata (scikit-data) wiki!

This is not the main entry point for the project; that is the skdata project home page.

Goal of Skdata

The goal of the skdata project is to standardize the representation of community benchmark data sets (including large and awkward ones), and to facilitate the development of broadly applicable machine learning algorithm implementations. Skdata is meant to interoperate with other Python machine learning software (such as scikit-learn, PyBrain, or custom algorithms), but it does not aim to provide machine learning algorithms itself.
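To illustrate why a standardized representation helps, here is a minimal sketch in plain Python. It does not use skdata's actual API: the `ClassificationView` class, its field names, and the `majority_class_error` function are all hypothetical, and the toy data is made up. The point is only that an algorithm written against one agreed-upon structure can run on any data set presented in that structure.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class ClassificationView:
    """Hypothetical standardized view of a data set (not skdata's API)."""
    train_x: List[Sequence[float]]
    train_y: List[int]
    test_x: List[Sequence[float]]
    test_y: List[int]


def majority_class_error(view: ClassificationView) -> float:
    """Fit a majority-class baseline on the training split and
    return its error rate on the test split."""
    majority_label, _ = Counter(view.train_y).most_common(1)[0]
    mistakes = sum(1 for y in view.test_y if y != majority_label)
    return mistakes / len(view.test_y)


# Toy data standing in for a benchmark; the values are invented for illustration.
toy_view = ClassificationView(
    train_x=[[0.0], [0.1], [0.9]],
    train_y=[0, 0, 1],
    test_x=[[0.2], [0.8]],
    test_y=[0, 1],
)
toy_error = majority_class_error(toy_view)
```

Because `majority_class_error` depends only on the fields of the view, swapping in a different data set would require no change to the algorithm code, only a different `ClassificationView` instance.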

Status

  • The code of the library is currently usable (and frequently used).
  • The API should not be considered stable; it will probably remain a work in progress for some time.
  • Tests cover some but not all of the code (an estimated 50% of lines).
  • Some of the older data set modules do not yet use the newer "dataset.py" / "view.py" code layout; forward-porting them is on the TODO list.

Documentation