
A Selenium-based tool for automatically downloading the output of different versions of a Kaggle kernel.


phuc16102001/kaggle-output-downloader


Kaggle output downloader

Introduction

Kaggle is a platform for building machine learning notebooks. Some people also use it to crawl data from websites and schedule the notebook to run repeatedly (e.g. daily or weekly).

However, the Kaggle API can only download the latest output of a notebook, not the output of a specific version. If a notebook is scheduled to run daily, downloading each output manually is tedious and costly. For that reason, this repository provides a tool that automatically fetches the output of every version of a Kaggle kernel (notebook).

Usage

Install required libraries

First, install the required libraries:

pip install -r src/requirements.txt

Run without credential file

Run the script as follows:

python src/kaggle-downloader.py \
    -u <username> \
    -e <email> \
    -p <password> \
    -n <notebook>

Run with credential file

Alternatively, you can provide the user's information in a file named credential.json with the following format:

{
  "username": "<username>",
  "password": "<password>",
  "email": "<email>"
}
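Before passing the file to the script, it can help to check that it parses and contains the three keys shown above. The following is a minimal sketch, not the tool's own code: the function name `load_credentials` and the constant `REQUIRED_KEYS` are hypothetical, and the only thing taken from this README is the file format.

```python
import json
from pathlib import Path

# Keys documented in the credential.json format above.
REQUIRED_KEYS = ("username", "password", "email")


def load_credentials(path):
    """Read a credential file and verify the three documented keys are present.

    Hypothetical helper for illustration; the downloader may validate differently.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        raise ValueError("credential file must contain a JSON object")
    # Treat absent keys and empty values the same way: both are unusable.
    missing = [key for key in REQUIRED_KEYS if not data.get(key)]
    if missing:
        raise ValueError(f"credential file is missing: {', '.join(missing)}")
    return {key: data[key] for key in REQUIRED_KEYS}
```

Calling `load_credentials("credential.json")` returns a dictionary with the three values, or raises `ValueError` naming the keys that are absent or empty.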

Then call the script as follows:

python src/kaggle-downloader.py \
    -c <credential_path> \
    -n <notebook>
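The two invocation styles can be pictured as one command line where either `-c` or the trio `-u`, `-e`, `-p` must be given. The sketch below illustrates that rule with `argparse`. It is an assumption about how such an interface could be written, not the script's actual implementation: the short flags come from this README, while the long option names and the functions `build_parser` and `credential_source` are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser accepting the short flags documented in this README."""
    parser = argparse.ArgumentParser(
        description="Illustrative sketch of the downloader's command line"
    )
    # Long option names below are assumptions; only the short flags are documented.
    parser.add_argument("-u", "--username")
    parser.add_argument("-e", "--email")
    parser.add_argument("-p", "--password")
    parser.add_argument("-c", "--credential", help="path to credential.json")
    parser.add_argument("-n", "--notebook", required=True)
    return parser


def credential_source(args):
    """Return "file" if -c was given, "flags" if -u, -e and -p were all given."""
    if args.credential:
        return "file"
    if args.username and args.email and args.password:
        return "flags"
    raise SystemExit("provide either -c <credential_path> or all of -u, -e and -p")
```

For example, `credential_source(build_parser().parse_args(["-c", "credential.json", "-n", "my-notebook"]))` selects the file-based path, where `my-notebook` is a placeholder name.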

Contribution

This repository is owned by phuc16102001.

Pull requests are welcome, but please discuss major changes with me first.

License

MIT
