Project directory structure

├── README.md          Add some info about your project here
├── data               Your data files (ignored in version control) and dataset specifications live here
│   ├── processed      Never modify your raw data; do your preprocessing and store the final files here
│   ├── cache          Downloaded raw data files live here
│   └── raw            Do not download any files here
│       ├── dataset1
│       │   └── .toml  Specifications for downloading data
│       ├── dataset2
│       │   └── .csv   Actual dataset with links to binaries such as images and audio files. Use git-lfs if this file gets large
│       └── dataset3
│           └── .json  Actual dataset with links to binaries such as images and audio files. Use git-lfs if this file gets large
├── logs               Your model training logs live here, so training can be monitored with TensorBoard
├── notebooks          Your notebooks live here; their names indicate the order in which they should be run, e.g. 01-explore-... and 02-clean-...
├── src                Your python files live here
│   ├── __init__.py
│   └── utils
│       ├── __init__.py
│       └── utils.py
├── serving            Your model serving classes live here
│   └── example_model
│       ├── requirements.txt
│       └── user_model.py
├── scripts            Your shell scripts that perform various tasks live here 
│   └── cleanup.sh     Clear your cached data, logs, etc...
└── weights            gitignore this directory if you do not want to push your models to git
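
As a rough illustration of the layout above, the following Python sketch creates the same directories and empty placeholder files under a chosen root. The function name scaffold_workspace and the two path lists are hypothetical helpers written for this example and are not part of the workspace itself. The dataset1, dataset2 and dataset3 folders are left out because they are placeholders in the tree.

```python
from pathlib import Path

# Directories copied from the tree above. The dataset1..3 folders are
# placeholders in that tree, so they are intentionally not created here.
WORKSPACE_DIRS = [
    "data/processed",
    "data/cache",
    "data/raw",
    "logs",
    "notebooks",
    "src/utils",
    "serving/example_model",
    "scripts",
    "weights",
]

# Files named in the tree above; they are created empty.
WORKSPACE_FILES = [
    "README.md",
    "src/__init__.py",
    "src/utils/__init__.py",
    "src/utils/utils.py",
    "serving/example_model/requirements.txt",
    "serving/example_model/user_model.py",
    "scripts/cleanup.sh",
]


def scaffold_workspace(root):
    """Create the workspace directories and empty files under `root`.

    Hypothetical helper: existing directories and files are left untouched,
    so calling it twice is harmless.
    """
    root = Path(root)
    for rel in WORKSPACE_DIRS:
        (root / rel).mkdir(parents=True, exist_ok=True)
    for rel in WORKSPACE_FILES:
        (root / rel).touch(exist_ok=True)
    return root


if __name__ == "__main__":
    # Example: build the skeleton in a folder named "my-project".
    print(scaffold_workspace("my-project"))
```

Because the function only creates missing paths, it can be re-run on an existing project without overwriting anything.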

Using virtual environments

Conda and pipenv come pre-installed for quick use. We recommend conda.

Create a new environment

First, change your current directory to the one where you want your environment config to be tracked. If you want to use pipenv, initialize your environment by running pipenv lock. If you want to use conda, run conda create --name ENV_NAME python=3.7 -y.

Make Jupyter aware of the new environment

If you are using conda:

conda activate ENV_NAME                   

ipython kernel install --user --name=ENV_NAME

If you are using pipenv, virtualenv, etc., first install ipykernel in your new environment with pip install ipykernel, then run python -m ipykernel install --user --name=ENV_NAME.

Remove unused kernels

First, list all available kernel specs with jupyter kernelspec list, then remove the one you no longer need with jupyter kernelspec remove KERNEL_NAME.
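
If you would rather inspect the installed kernel specs from Python, for example to decide which ones to remove, a minimal sketch could look like the one below. It assumes that jupyter kernelspec list --json prints an object with a top-level "kernelspecs" key that maps each kernel name to an entry containing "resource_dir". The function names parse_kernelspecs and list_kernelspecs are hypothetical and exist only for this example.

```python
import json
import subprocess


def parse_kernelspecs(json_text):
    """Return {kernel_name: resource_dir} from the JSON listing.

    Assumption: the JSON has a top-level "kernelspecs" object keyed by
    kernel name, each entry carrying a "resource_dir" string.
    """
    data = json.loads(json_text)
    specs = data.get("kernelspecs", {})
    return {name: info.get("resource_dir", "") for name, info in specs.items()}


def list_kernelspecs():
    """Run the jupyter CLI and parse its JSON output (requires jupyter on PATH)."""
    result = subprocess.run(
        ["jupyter", "kernelspec", "list", "--json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_kernelspecs(result.stdout)


if __name__ == "__main__":
    # Print one kernel per line so unused ones are easy to spot.
    for name, path in sorted(list_kernelspecs().items()):
        print(f"{name}\t{path}")
```

Any name printed this way can then be passed to jupyter kernelspec remove.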

Access Kaggle datasets

First, install the Kaggle CLI with pip install kaggle --upgrade. Then generate an API key by going to https://www.kaggle.com//account and selecting 'Create API Token'. Finally, export the credentials from the token as environment variables:

export KAGGLE_USERNAME=username
export KAGGLE_KEY=xxxxxxxxx
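
Before running a download from a notebook or script, it can help to confirm that both variables are actually visible to the Python process. The sketch below is one way to do that; missing_kaggle_credentials is a hypothetical helper written for this example, not part of the kaggle package.

```python
import os

# The two variables exported above.
REQUIRED_VARS = ("KAGGLE_USERNAME", "KAGGLE_KEY")


def missing_kaggle_credentials(env=None):
    """Return the names of required variables that are unset or empty.

    `env` defaults to the current process environment; passing a plain
    dict makes the check easy to try out without touching os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_kaggle_credentials()
    if missing:
        print("Missing Kaggle credentials: " + ", ".join(missing))
    else:
        print("Kaggle credentials found in the environment.")
```

Note that variables exported in one terminal are not visible to a Jupyter server that was started from a different shell, so the check is most useful when run inside the notebook kernel itself.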
