DSUtils.jl

A set of Data Science utilities in Julia for industry practitioners.

Installation

This package is not registered in Julia's general registry. You can add it via its URL:

] add https://github.com/DaymondLing/DSUtils.jl

Purpose

The intent of DSUtils is to make it easy for industry practitioners to get their work done. It provides functions for some routine parts of data science workflow so that people can focus on problem solving and not break their train of thought. The functions should just work and be reasonably performant on millions of rows of data.

There are no new capabilities per se, indeed, many of the functionalities are available in other packages. It is an end user package to get analysis done rather than a package for others to build packages with.

Current capabilities

Binary classifier performance evaluation
- kstest, 2 sample Kolmogorov-Smirnov point estimate and location
- auroc, Area Under Receiver Operatin Characteristics curve via concordance calculation rather than numeric integration
- bcdiag, wrapper for kstest and auroc that facilitates plotting the graphs below
- ksplot, plot of Kolmogorov-Smirnov separation
- rocplot, ROC plot
- biasplot, plot of actual response rate vs. predicted probability
- accuracyplot, plot of model accuracy given utility values for [TP, FN, FP, TN]
- liftcurve, actual and predicted lift curves
- cumliftcurve, cumulative actual and predicted lift curves
- liftable, actual and predicted lift tables
- cumliftable, cumulative actual and predicted lift tables
- infovalue, change in two frequency distributions
Binning
- ranks, equal density binning, tied values are always in the same bin
Dummy encoding
- onehot!, one hot encode a categorical variable, works with missing level and missing data, output indicator variables are in a DataFrame with named columns
Missing computation
- sumxm, sum a collection of scalars or vectors (element-wise) treating missing as 0

Project Status

This is work in progress, development is against the more recent versions of Julia, e.g., 1.4 and up.

Name		Name	Last commit message	Last commit date
Latest commit History 16 Commits
.github/workflows		.github/workflows
docs		docs
src		src
test		test
.gitignore		.gitignore
LICENSE		LICENSE
Manifest.toml		Manifest.toml
Project.toml		Project.toml
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

.github/workflows

.github/workflows

docs

docs

src

src

test

test

.gitignore

.gitignore

LICENSE

LICENSE

Manifest.toml

Manifest.toml

Project.toml

Project.toml

README.md

README.md

Repository files navigation

DSUtils.jl

Installation

Purpose

Current capabilities

Project Status

About

Releases 1

Packages

Contributors 2

Languages

License

DaymondLing/DSUtils.jl

Folders and files

Latest commit

History

Repository files navigation

DSUtils.jl

Installation

Purpose

Current capabilities

Project Status

About

Topics

Resources

License

Stars

Watchers

Forks

Languages