Handy scripts for building and augmenting a machine learning image dataset

EdjeElectronics/Image-Dataset-Tools

Summary

This repository contains Python scripts for working with image datasets for machine learning object detection models. It contains the following scripts (more may be added later):

  1. PictureTaker
  2. FrameGrabber
  3. AutoLabeler

A brief description of each script is given below. The scripts themselves are in their own folders in this repository. The README.md file in each folder gives instructions on how to use the script.

PictureTaker

PictureTaker is a simple Python script for taking pictures with OpenCV and a connected camera. It makes it easy to collect images for training a machine learning vision model.
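A picture-taking loop of this kind could look roughly like the sketch below. It is not the repository's actual script: the function names, the `image_0001.jpg` naming scheme, and the `p`/`q` key bindings are assumptions made for illustration. The OpenCV calls used (`cv2.VideoCapture`, `cv2.imshow`, `cv2.waitKey`, `cv2.imwrite`) are standard.

```python
from pathlib import Path


def next_image_path(folder, prefix="image", ext=".jpg"):
    """Return the first path like folder/image_0001.jpg that does not exist yet.

    The naming scheme is hypothetical, chosen only for this sketch.
    """
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    index = 1
    while True:
        candidate = folder / f"{prefix}_{index:04d}{ext}"
        if not candidate.exists():
            return candidate
        index += 1


def take_pictures(folder="images", camera_index=0):
    """Show a live preview; press 'p' to save a frame, 'q' to quit.

    The key bindings are assumptions, not taken from the repository.
    """
    import cv2  # imported here so next_image_path works without OpenCV

    cap = cv2.VideoCapture(camera_index)
    if not cap.isOpened():
        raise RuntimeError(f"Could not open camera {camera_index}")
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break  # the camera stopped delivering frames
            cv2.imshow("preview", frame)
            key = cv2.waitKey(1) & 0xFF
            if key == ord("p"):
                path = next_image_path(folder)
                cv2.imwrite(str(path), frame)
                print(f"Saved {path}")
            elif key == ord("q"):
                break
    finally:
        cap.release()
        cv2.destroyAllWindows()
```

Calling `take_pictures("images")` opens the default camera and writes each saved frame to the next free filename, so repeated sessions do not overwrite earlier pictures.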

FrameGrabber

FrameGrabber is a tool for extracting individual frames from a video and saving them as images. It lets you quickly create training images from a video of the objects you want your model to detect.

This is useful when you build a training dataset by recording or finding videos of the objects you want to detect: you can extract hundreds or thousands of images from the videos and then label them for training.
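As a rough illustration of frame extraction, the sketch below reads a video with OpenCV and saves every Nth frame. It is not the repository's FrameGrabber script: the function names, the `every_n` parameter, and the `frame_000000.jpg` naming scheme are assumptions.

```python
from pathlib import Path


def should_save(frame_index, every_n):
    """Return True for frames 0, every_n, 2*every_n, and so on."""
    return frame_index % every_n == 0


def grab_frames(video_path, out_dir="frames", every_n=10):
    """Save every `every_n`-th frame of a video as a JPEG; return the count saved.

    Parameter names and the output naming scheme are hypothetical.
    """
    import cv2  # imported here so should_save works without OpenCV

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    cap = cv2.VideoCapture(str(video_path))
    if not cap.isOpened():
        raise RuntimeError(f"Could not open video {video_path}")

    saved = 0
    frame_index = 0
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break  # end of video (or a read error)
            if should_save(frame_index, every_n):
                out_path = out_dir / f"frame_{frame_index:06d}.jpg"
                cv2.imwrite(str(out_path), frame)
                saved += 1
            frame_index += 1
    finally:
        cap.release()
    return saved
```

Skipping frames matters because consecutive video frames are nearly identical; a larger `every_n` gives fewer but more varied training images.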

AutoLabeler

AutoLabeler is a tool for automatically labeling images using a model trained on only a small portion of your full dataset. You start by training a model on a subset of the images (say, 10% of them), and then use that model to label the remaining images. It automatically saves a label data file for each image. As it labels images, you supervise the labels to make sure they look correct. If a label is incorrect, you reject it and label the image manually.

While you still have to manually accept or reject each label, this saves significant time over labeling every image yourself. It can be a useful approach if you have over 1,000 images that need to be labeled. You can also stop and re-train the model partway through the process (say, after 25% of the images are labeled) to make it more accurate at labeling the rest.
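The accept-or-reject workflow described above can be sketched as a simple loop. This is only an illustration under stated assumptions, not AutoLabeler's implementation: `predict` and `review` are hypothetical callables that you would supply (a model wrapper and a human-review prompt), and the JSON label format is invented for the example. In this sketch a label file is written only for accepted predictions, and rejected images are returned for manual labeling.

```python
import json
from pathlib import Path


def review_labels(image_paths, predict, review, label_dir="labels"):
    """Label images with a model, keeping only the labels a reviewer accepts.

    predict(image_path) -> list of boxes (hypothetical model wrapper)
    review(image_path, boxes) -> True to accept, False to reject
    Returns the images whose labels were rejected and need manual labeling.
    """
    label_dir = Path(label_dir)
    label_dir.mkdir(parents=True, exist_ok=True)

    rejected = []
    for image_path in image_paths:
        image_path = Path(image_path)
        boxes = predict(image_path)
        if review(image_path, boxes):
            # Accepted: save one label file per image (format is an assumption).
            label_file = label_dir / f"{image_path.stem}.json"
            payload = {"image": image_path.name, "boxes": boxes}
            label_file.write_text(json.dumps(payload, indent=2))
        else:
            rejected.append(image_path)
    return rejected
```

Keeping the model and the reviewer behind plain callables makes it easy to swap in a re-trained model partway through: call `review_labels` on the first batch, re-train on the accepted labels plus your manual corrections, then call it again on the remaining images with the new `predict`.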
