Add profile viewer method (#229)
* move viewer to within whylogs

* add profile viewer

* 📚 add documentation for viewer

* edits on profile viewer documentation

* remove dividers

* typos,and loom gif

* use thumbnail gif from loom

* Update README.md

Co-authored-by: Andy Dang <26821974+andyndang@users.noreply.github.com>

* move metric collection list

* Remove dots

* Updating whylogs demo gif

* Update README.md

* Update README.md

* Update README.md

* bump version 0.4.8

Co-authored-by: Andy Dang <26821974+andyndang@users.noreply.github.com>
Co-authored-by: Sam Gracie <4944259+samgracie@users.noreply.github.com>
3 people committed May 18, 2021
1 parent 6ac241a commit 0d16a84
Showing 49 changed files with 57 additions and 27 deletions.
2 changes: 1 addition & 1 deletion .bumpversion.cfg
@@ -1,5 +1,5 @@
[bumpversion]
current_version = 0.4.7-dev1
current_version = 0.4.8
tag = False
parse = (?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)(\-(?P<release>[a-z]+)(?P<build>\d+))?
serialize =
2 changes: 1 addition & 1 deletion Makefile
@@ -5,7 +5,7 @@ src.python.pyc := $(shell find ./src -type f -name "*.pyc")
src.proto.dir := ./proto/src
src.proto := $(shell find $(src.proto.dir) -type f -name "*.proto")

version := 0.4.7-dev1
version := 0.4.8

dist.dir := dist
egg.dir := .eggs
56 changes: 38 additions & 18 deletions README.md
@@ -1,5 +1,4 @@
# whylogs: A Data and Machine Learning Logging Standard
<img align="center" src="images/Whylabs-Dots-Light-Bg.png" width="250"><img align="center" src="images/Whylabs-Dots-Light-Bg.png" width="250"><img align="center" src="images/Whylabs-Dots-Light-Bg.png" width="250">


[![License](http://img.shields.io/:license-Apache%202-blue.svg)](https://github.com/whylabs/whylogs-python/blob/mainline/LICENSE)
@@ -27,9 +26,6 @@ This is a Python implementation of whylogs. The Java implementation can be found
If you have any questions, comments, or just want to hang out with us, please join [our Slack channel](http://join.slack.whylabs.ai/).


<img align="center" src="images/Whylabs-Dots-Light-Bg.png" width="250"><img align="center" src="images/Whylabs-Dots-Light-Bg.png" width="250"><img align="center" src="images/Whylabs-Dots-Light-Bg.png" width="250">


- [Getting started](#getting-started)
- [Features](#features)
- [Data Types](#data-types)
@@ -39,7 +35,6 @@ If you have any questions, comments, or just want to hang out with us, please jo
- [Roadmap](#roadmap)
- [Contribute](#contribute)

<img align="center" src="images/Whylabs-Dots-Light-Bg.png" width="250"><img align="center" src="images/Whylabs-Dots-Light-Bg.png" width="250"><img align="center" src="images/Whylabs-Dots-Light-Bg.png" width="250">

## Getting started<a name="getting-started" />

@@ -65,7 +60,6 @@ make install
make
```

<img align="center" src="images/Whylabs-Dots-Light-Bg.png" width="250"><img align="center" src="images/Whylabs-Dots-Light-Bg.png" width="250"><img align="center" src="images/Whylabs-Dots-Light-Bg.png" width="250">

## Quickly Logging Data

@@ -90,22 +84,53 @@ with session.logger(dataset_name="my_dataset") as logger:
    # images
    logger.log_images("path/to/image.png")
```
whyLogs collects approximate statistics and sketches of data on a column-basis into a statistical profile. These metrics include:

whylogs collects approximate statistics and sketches of data on a column-basis into a statistical profile. These metrics include:

- Simple counters: boolean, null values, data types.
- Summary statistics: sum, min, max, variance.
- Summary statistics: sum, min, max, median, variance.
- Unique value counter or cardinality: tracks the approximate number of unique values of your feature using the HyperLogLog algorithm.
- Histograms for numerical features. whylogs binary output can be queried with dynamic binning based on the shape of your data.
- Top frequent items (default is 128). Note that this configuration affects the memory footprint, especially for text features.

Check the examples below for visualization and other use cases.

<img align="center" src="images/Whylabs-Dots-Light-Bg.png" width="250"><img align="center" src="images/Whylabs-Dots-Light-Bg.png" width="250"><img align="center" src="images/Whylabs-Dots-Light-Bg.png" width="250">
### Multiple Profile Plots

To view your logger profiles, you can use methods within `whylogs.viz`:

```python
from whylogs.viz import ProfileVisualizer

visualization = ProfileVisualizer()
visualization.set_profiles([profile_day_1, profile_day_2])
figure = visualization.plot_distribution("<feature_name>")
figure.savefig("/my/image/path.png")
```

Individual profiles are saved to disk, AWS S3, or the WhyLabs API automatically when loggers are closed, according to the Session configuration.

Current profiles from active loggers can be loaded from memory with:
```python
profile = logger.profile()
```

### Profile Viewer

You can also open a local profile viewer and load the JSON summary file into it. The default path for the JSON files is `output/{dataset_name}/{session_id}/json/dataset_profile.json`.

```python
from whylogs.viz import profile_viewer
profile_viewer()
```

This opens a viewer in your default browser, where you can load a profile JSON summary using the `Select JSON profile` button.
Once the JSON file is selected, you can view your profile's features and their associated statistics.

<img src="https://whylabs-public.s3-us-west-2.amazonaws.com/assets/whylogs-viewer.gif" title="whylogs HTML viewer demo">
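
The default output layout described above can also be searched programmatically to find the file to load into the viewer. The following is a minimal sketch, assuming profiles were written under a local `output` directory with that default layout; `find_profile_json` is a hypothetical helper and not part of whylogs.

```python
from pathlib import Path
from typing import Optional


def find_profile_json(dataset_name: str, output_root: str = "output") -> Optional[Path]:
    """Return the newest dataset_profile.json for a dataset, or None.

    Hypothetical helper: it only assumes the default layout
    output/{dataset_name}/{session_id}/json/dataset_profile.json.
    """
    dataset_dir = Path(output_root) / dataset_name
    # One candidate per session directory.
    candidates = list(dataset_dir.glob("*/json/dataset_profile.json"))
    if not candidates:
        return None
    # Pick the most recently modified summary.
    return max(candidates, key=lambda path: path.stat().st_mtime)
```

Calling `find_profile_json("my_dataset")` would return the path to select with the `Select JSON profile` button, or `None` if no profile has been written yet.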

## Documentation

The [documentation](https://docs.whylabs.ai/docs/) of this package is generated automatically.

<img align="center" src="images/Whylabs-Dots-Light-Bg.png" width="250"><img align="center" src="images/Whylabs-Dots-Light-Bg.png" width="250"><img align="center" src="images/Whylabs-Dots-Light-Bg.png" width="250">
## Features

- Accurate data profiling: whylogs calculates statistics from 100% of the data, never requiring sampling, ensuring an accurate representation of data distributions
@@ -115,7 +140,7 @@ The [documentation](https://docs.whylabs.ai/docs/) of this package is generated
- Tiny storage footprint: whylogs turns data batches and streams into statistical fingerprints, 10-100MB uncompressed
- Unlimited metrics: whylogs collects all possible statistical metrics about structured or unstructured data

<img align="center" src="images/Whylabs-Dots-Light-Bg.png" width="250"><img align="center" src="images/Whylabs-Dots-Light-Bg.png" width="250"><img align="center" src="images/Whylabs-Dots-Light-Bg.png" width="250">

## Data Types<a name="data-types" />
Whylogs supports both structured and unstructured data, specifically:

@@ -128,7 +153,7 @@ Whylogs supports both structured and unstructured data, specifically:
| Text | top k values, counts, cardinality (more in development) | [GitHub Issue #213](https://github.com/whylabs/whylogs/issues/213) |
| Audio | In development | [GitHub Issue #212](https://github.com/whylabs/whylogs/issues/212) |

<img align="center" src="images/Whylabs-Dots-Light-Bg.png" width="250"><img align="center" src="images/Whylabs-Dots-Light-Bg.png" width="250"><img align="center" src="images/Whylabs-Dots-Light-Bg.png" width="250">

## Integrations

![current integration](images/integrations.001.png)
@@ -144,8 +169,6 @@ Whylogs supports both structured and unstructured data, specifically:
| Docker | Run whylogs in Docker | <ul><li>[Rest Container](https://docs.whylabs.ai/docs/integrations-rest-container)</li></ul>|
| AWS S3 | Store whylogs profiles in S3 | <ul><li>[S3 example](https://github.com/whylabs/whylogs-examples/blob/mainline/python/S3%20example.ipynb)</li></ul>

<img align="center" src="images/Whylabs-Dots-Light-Bg.png" width="250"><img align="center" src="images/Whylabs-Dots-Light-Bg.png" width="250"><img align="center" src="images/Whylabs-Dots-Light-Bg.png" width="250">

## Examples
For a full set of our examples, please check out [whylogs-examples](https://github.com/whylabs/whylogs-examples).

@@ -160,13 +183,10 @@ Check out our example notebooks with Binder: [![Binder](https://mybinder.org/bad

whylogs is maintained by [WhyLabs](https://whylabs.ai).

<img align="center" src="images/Whylabs-Dots-Light-Bg.png" width="250"><img align="center" src="images/Whylabs-Dots-Light-Bg.png" width="250"><img align="center" src="images/Whylabs-Dots-Light-Bg.png" width="250">
## Community

If you have any questions, comments, or just want to hang out with us, please join [our Slack channel](http://join.slack.whylabs.ai/).

<img align="center" src="images/Whylabs-Dots-Light-Bg.png" width="250"><img align="center" src="images/Whylabs-Dots-Light-Bg.png" width="250"><img align="center" src="images/Whylabs-Dots-Light-Bg.png" width="250">

## Contribute

We welcome contributions to whylogs. Please see our [contribution guide](https://github.com/whylabs/whylogs/blob/mainline/CONTRIBUTING.md) and our [development guide](https://github.com/whylabs/whylogs/blob/mainline/DEVELOPMENT.md) for details.
2 changes: 1 addition & 1 deletion docs/conf.py
@@ -101,7 +101,7 @@
# built documents.
#
# The short X.Y version.
version = "0.4.7-dev1"
version = "0.4.8"
# The full version, including alpha/beta/rc tags.
release = "" # Is set by calling `setup.py docs`

Binary file added images/html_viewer.png
Binary file added images/html_viewer_loaded.png
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -1,6 +1,6 @@
[tool.poetry]
name = "whylogs"
version = "0.4.7-dev1"
version = "0.4.8"
description = "Profile and monitor your ML data pipeline end-to-end"
authors = ["WhyLabs.ai <support@whylabs.ai>"]
license = "Apache-2.0"
2 changes: 1 addition & 1 deletion src/whylogs/_version.py
@@ -1,3 +1,3 @@
"""WhyLabs version number."""

__version__ = "0.4.7-dev1"
__version__ = "0.4.8"
File renamed without changes. (36 files)
6 changes: 2 additions & 4 deletions src/whylogs/viz/__init__.py
@@ -1,6 +1,4 @@
from .browser_viz import profile_viewer
from .visualizer import BaseProfileVisualizer, ProfileVisualizer

__ALL__ = [
ProfileVisualizer,
BaseProfileVisualizer,
]
__ALL__ = [ProfileVisualizer, BaseProfileVisualizer, profile_viewer]
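
Note that Python only honors a lowercase `__all__` containing name strings when resolving `from package import *`; an uppercase `__ALL__` is an ordinary variable with no special meaning. A minimal sketch of the conventional form follows, using a hypothetical stand-in module `demo_viz` rather than whylogs itself:

```python
import sys
import types

# Hypothetical stand-in module; "demo_viz" is not part of whylogs.
demo = types.ModuleType("demo_viz")
exec(
    "def profile_viewer():\n"
    "    return 'viewer opened'\n"
    "def internal_helper():\n"
    "    return 'not exported'\n"
    "__all__ = ['profile_viewer']\n",  # lowercase name, list of strings
    demo.__dict__,
)
sys.modules["demo_viz"] = demo

# A star import copies only the names listed in __all__.
namespace = {}
exec("from demo_viz import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))
print(exported)
```

Applied to this package, the equivalent would presumably be `__all__ = ["ProfileVisualizer", "BaseProfileVisualizer", "profile_viewer"]`.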
12 changes: 12 additions & 0 deletions src/whylogs/viz/browser_viz.py
@@ -0,0 +1,12 @@
import os
import webbrowser

_MY_DIR = os.path.realpath(os.path.dirname(__file__))


def profile_viewer():
    """
    open a profile viewer loader on your default browser
    """
    index_path = os.path.abspath(os.path.join(_MY_DIR, os.pardir, "viewer", "index.html"))
    return webbrowser.open_new_tab(f"file:{index_path}")
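
The `file:` URL above is built by plain string formatting. One possible alternative, sketched here under the assumption that the viewer's `index.html` sits in a sibling `viewer` directory as in this commit, is to let `pathlib` build the URI so that spaces and other special characters in the install path are percent-encoded. `viewer_uri` is a hypothetical helper, not part of whylogs.

```python
import os
from pathlib import Path


def viewer_uri(viz_dir: str) -> str:
    """Build a file:// URI for ../viewer/index.html relative to viz_dir."""
    index_path = os.path.abspath(os.path.join(viz_dir, os.pardir, "viewer", "index.html"))
    # Path.as_uri() requires an absolute path and percent-encodes
    # characters such as spaces.
    return Path(index_path).as_uri()
```

The result could then be passed to `webbrowser.open_new_tab` in place of the formatted string.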
