
Train a Deep CNN using images acquired automatically from google search with Selenium

dimkastan/RaccoonsVsCats

RaccoonsVsCats: for those who are tired of watching cats vs dogs examples

In this repo, I will demonstrate how you can use a web automation tool named Selenium to download images from Google Image Search.


After setting up your machine, you will perform the following two steps:
a. Download images from Google Image Search using Selenium and the Chrome WebDriver.
b. Train a custom classifier using PyTorch and the example provided [here](http://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html).
In this example, I will show you how to download images and train a network to discriminate between cats and raccoons. However, you can easily change the categories.
*Please read the disclaimer before proceeding.
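
The transfer learning tutorial linked in step b loads its data from `train` and `val` folders that each contain one subfolder per class. As a minimal sketch, assuming the downloaded images sit in one folder per category, the helper below copies them into that layout. The function name `split_train_val`, the folder names, and the 20% validation share are illustrative assumptions, not taken from the repository's script.

```python
import random
import shutil
from pathlib import Path


def split_train_val(source_dir, target_dir, val_fraction=0.2, seed=0):
    """Copy images from source_dir/<class>/ into target_dir/{train,val}/<class>/.

    Returns a dict mapping (split, class_name) to the number of files copied.
    """
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    counts = {}
    for class_dir in sorted(p for p in source_dir.iterdir() if p.is_dir()):
        files = sorted(p for p in class_dir.iterdir() if p.is_file())
        rng.shuffle(files)
        n_val = int(len(files) * val_fraction)
        splits = {"val": files[:n_val], "train": files[n_val:]}
        for split, members in splits.items():
            out_dir = target_dir / split / class_dir.name
            out_dir.mkdir(parents=True, exist_ok=True)
            for f in members:
                shutil.copy2(f, out_dir / f.name)
            counts[(split, class_dir.name)] = len(members)
    return counts
```

For example, `split_train_val("downloads", "data")` would produce `data/train/Raccoon`, `data/val/Raccoon`, and so on, provided `downloads/Raccoon` and `downloads/Cat` exist.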

Prerequisites

OS Version:
Ubuntu 14.04 LTS

The Chrome browser must be installed, and the Chrome WebDriver must be downloaded so that it is available to the Python script.
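
Before running the script, it can help to confirm that the driver executable is reachable. The sketch below assumes the driver is named `chromedriver` and is expected on `PATH`; the repository's script may instead take an explicit path, in which case this check does not apply.

```python
import shutil


def find_chromedriver(name="chromedriver"):
    """Return the full path of the driver executable found on PATH, or None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_chromedriver()
    if path is None:
        print("chromedriver was not found on PATH")
    else:
        print("chromedriver found at", path)
```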

You need to install Anaconda for Python 3.
My version is:

Python 3.6.0 |Anaconda 4.3.1 (64-bit)| (default, Dec 23 2016, 12:22:00) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.

Also the following packages are required:

  • python-magic
  • selenium
  • datetime
  • urllib
  • json
  • PyTorch

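All packages except PyTorch are available via pip or conda install. Note that datetime, urllib, and json ship with the Python standard library and need no separate installation.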
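
A quick way to see which of the packages listed above are still missing is to ask Python whether it can locate them. This is a small sketch using only the standard library; the import names `magic` (for python-magic) and `torch` (for PyTorch) are the usual ones, and the helper name `missing_modules` is made up for illustration.

```python
import importlib.util

# Import names for the packages listed above:
# python-magic is imported as "magic", PyTorch as "torch".
REQUIRED_MODULES = ["magic", "selenium", "datetime", "urllib", "json", "torch"]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("Missing modules:", ", ".join(missing) if missing else "none")
```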

Installing PyTorch:
You can follow the instructions provided here: http://pytorch.org/
I installed the GPU version using:
conda install pytorch torchvision cuda80 -c soumith


## Run the application

Assuming that you have already installed Anaconda in Anaconda_path, run the following command:

${Anaconda_path}/python RaccoonVsCats.py

Enjoy!


Feel free to contact me with any comments or suggestions.

TODO List

 -- Add comments and improve the code
 -- Remove duplicate images as well as bad samples automatically
 -- Add TensorBoard with PyTorch
 -- Add a TensorFlow model
 -- Deploy on a web server
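
As a possible starting point for the duplicate-removal item in the list above, the sketch below flags files whose contents are byte-for-byte identical to an earlier file in the same folder. It is not part of the repository. It only catches exact copies, so resized or re-encoded versions of the same picture would need a different technique, and the function name is hypothetical.

```python
import hashlib
from pathlib import Path


def find_exact_duplicates(folder):
    """Return paths whose bytes match an earlier file in the same folder."""
    seen = {}
    duplicates = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if digest in seen:
            duplicates.append(path)  # same content as seen[digest]
        else:
            seen[digest] = path
    return duplicates
```

The returned paths can be reviewed and then deleted with `path.unlink()` once you are satisfied that they are true duplicates.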

Use it with any categories.

You can train your custom system by modifying the queries inside the script:

queries=['Raccoon','Cat' ]
Attributes=['running']
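
To make the role of the two lists concrete, here is a hedged sketch of how they might be combined into search terms. It assumes every query is paired with every attribute and that each term is sent to Google Image Search through the `tbm=isch` URL parameter. The actual script may combine them differently, and the helper names are illustrative.

```python
from itertools import product
from urllib.parse import urlencode

queries = ['Raccoon', 'Cat']
Attributes = ['running']


def build_search_terms(queries, attributes):
    """Pair every query with every attribute, e.g. 'Raccoon running'."""
    return [f"{q} {a}" for q, a in product(queries, attributes)]


def image_search_url(term):
    """Build an image search URL for one term (assumed URL format)."""
    return "https://www.google.com/search?" + urlencode({"q": term, "tbm": "isch"})


if __name__ == "__main__":
    for term in build_search_terms(queries, Attributes):
        print(term, "->", image_search_url(term))
```

Under this assumption, adding a category means appending a string to `queries`, and adding an attribute multiplies the number of searches by one more variant per category.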

Disclaimer:
1. Google images are subject to licensing. If you want to use these images, you should refer to the license of each particular image.
2. This tool does not allow you to bypass these licensing terms.
3. You will use this tool only for good purposes.
4. There is no guarantee that this project meets your needs, nor any guarantee of correctness.
