Grape Disease Detection

This project classifies diseases in grape plants using several machine learning classification algorithms. Grape plants are susceptible to various diseases. The classes covered in this project are:

Black rot
Black Measles (esca)
Powdery mildew
Leaf blight
Healthy

The machine learning classification models in this project include:

Random forest classification
Support vector machine classification
CNN - VGG16
CNN - Custom
Ensemble model - Majority voting
Ensemble model - Stacked prediction

Configuration of Project Environment

Clone the project.
Install packages required.
Download the data set.
Run the project.

Setup procedure

Clone project from GitHub.
Change to the directory Grape-Disease-Classification.
Install packages.
To reproduce the results, install the required packages in one of two ways:
1. Manually install the packages listed in the requirements.txt file, or run the command below.
```
 pip install -r requirements.txt
```
2. Install the packages using the setup.py file.
```
  python setup.py install
```
Adding the --user option directs setup.py to install the package in the user site-packages directory of the running Python. Alternatively, use the --home or --prefix option to install the package in a different location where you have the necessary permissions.
Download the required data set.
The data set used in this project is available here. It includes images from the Kaggle grape disease data set as well as images collected online and labelled with the LabelMe tool.
Download the zip file and extract its contents into the data/raw folder.

[OR]

Run the command below:
```
 ./wgetgdrive.sh <drive_id> <zip_name>.zip
```
Here, drive_id is 1gsUyWEkxz9H1-yn2ONx4scHg88kWU-38, and zip_name can be any file name of your choice.
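If you downloaded the archive manually, the extraction step can also be scripted. The following is a minimal sketch using only the Python standard library; the function name `extract_dataset` and the archive name in the usage note are assumptions for illustration and are not part of the project.

```python
import zipfile
from pathlib import Path


def extract_dataset(zip_path, target_dir="data/raw"):
    """Extract the downloaded archive into target_dir and return its file list.

    Hypothetical helper: the project itself only asks that the files end up
    in data/raw.
    """
    target = Path(target_dir)
    # Create data/raw (and any missing parents) if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
        return archive.namelist()
```

For example, `extract_dataset("grapes.zip")` would unpack a hypothetical `grapes.zip` into `data/raw` relative to the current directory.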
Run the project.
See the "Documentation for the code" section for further details.

Documentation for the code

Preprocessing
This folder contains:
1. Code to load the images and the JSON files that hold the labelling information. This is in preprocessing/001_load_data.py. To execute it, run the command below from within the preprocessing folder.
```
 python 001_load_data.py
```
2. Code to augment the data. This is in preprocessing/002_data_augmentation.py. To execute it, run the command below.
```
  python 002_data_augmentation.py
```
  The data augmentation techniques used are:
  - Horizontal flip
  - Vertical flip
  - Random rotation
  - Intensity scaling
  - Gamma correction
3. Code to extract histogram of oriented gradients (HOG) feature descriptors. These descriptors are used to train the random forest and SVM models only. This is in preprocessing/003_hog.py. To execute it, run the command below.
```
  python 003_hog.py
```
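The augmentation techniques listed in step 2 can be illustrated with a short NumPy sketch. This is not the code from 002_data_augmentation.py: the function names are hypothetical, images are assumed to be `uint8` arrays of shape (height, width, channels), and the random rotation is simplified to a random multiple of 90 degrees, whereas the project may rotate by arbitrary angles.

```python
import numpy as np


def horizontal_flip(image):
    # Mirror the image left to right (reverse the width axis).
    return image[:, ::-1]


def vertical_flip(image):
    # Mirror the image top to bottom (reverse the height axis).
    return image[::-1, :]


def random_rotation(image, rng):
    # Simplification: rotate by 90, 180, or 270 degrees chosen at random.
    k = int(rng.integers(1, 4))
    return np.rot90(image, k)


def intensity_scaling(image, factor):
    # Multiply pixel values by a factor and clip back to the valid range.
    scaled = image.astype(np.float32) * factor
    return np.clip(np.rint(scaled), 0, 255).astype(np.uint8)


def gamma_correction(image, gamma):
    # Apply the power-law transform out = 255 * (in / 255) ** gamma.
    normalized = image.astype(np.float32) / 255.0
    corrected = 255.0 * normalized ** gamma
    return np.clip(np.rint(corrected), 0, 255).astype(np.uint8)
```

With this formula, a gamma below 1 brightens the image and a gamma above 1 darkens it; a scaling factor above 1 brightens the image until values saturate at 255.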
Models
This folder contains the models used in this project, namely:
1. Random forest
2. Support vector machine
3. CNN - VGG16
4. CNN - Custom
5. Ensemble model - Majority voting
  In the majority voting technique, the output prediction is the class that receives more than half of the votes, or otherwise the largest number of votes. If no class receives more than half of the votes, or if there is a tie, the ensemble cannot make a stable prediction for that instance. In that situation, the prediction of the model with the highest accuracy is taken as the final output.
6. Ensemble model - Stacked prediction
  The network is trained on the array of class probabilities produced by all 4 models.
The ensemble models aggregate the random forest, SVM, CNN - Custom, and CNN - VGG16 models.
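The two ensemble strategies described above can be sketched in plain Python. This is an illustration under stated assumptions, not the project's implementation: the model keys and function names are hypothetical, the fallback uses the test accuracies reported in the Results section, and the voting rule is read as "accept a strict majority, otherwise defer to the most accurate model".

```python
from collections import Counter

# Test accuracies (%) from the Results section; the keys are hypothetical names.
ACCURACY = {
    "random_forest": 75.35,
    "svm": 82.89,
    "cnn_vgg16": 93.62,
    "cnn_custom": 98.76,
}


def majority_vote(predictions, accuracy=ACCURACY):
    """predictions maps a model name to the class label it predicted."""
    counts = Counter(predictions.values())
    label, votes = counts.most_common(1)[0]
    # Accept the top label only if it has more than half of the votes.
    if votes > len(predictions) / 2:
        return label
    # No stable majority: fall back to the most accurate model's prediction.
    best_model = max(predictions, key=lambda name: accuracy[name])
    return predictions[best_model]


def stack_features(probabilities, order=tuple(ACCURACY)):
    """Concatenate per-model class probabilities into one meta-model input."""
    features = []
    for name in order:
        features.extend(probabilities[name])
    return features
```

With four base models and five classes, `stack_features` yields a vector of 20 values per image, which is the kind of input a stacked network could be trained on.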

The models can be trained by running the command below from within the models folder.
```
   python <model_name>.py
```
visualization.py
This file contains all the visualization techniques used in this project:
1. Confusion matrix, drawn as a seaborn heat map modified to display details within each cell.
2. Loss and accuracy curves for the neural networks.
3. Tree representation for the random forest.
4. ROC-AUC curves using Yellowbrick.
Usage is as follows:
```
  python visualization.py -m <model_name> -t <one_visualization_technique>
```
For help on the available models and visualization techniques:
```
  python visualization.py --help
```
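As an illustration of the first technique, the sketch below builds a confusion matrix and per-cell text labels in plain Python. It is not taken from visualization.py: the function names are hypothetical, and showing a count plus a row percentage in each cell is only one assumed interpretation of "details within each cell".

```python
# Class names as listed at the top of this README.
CLASSES = ["Black rot", "Black Measles (esca)", "Powdery mildew", "Leaf blight", "Healthy"]


def confusion_matrix(y_true, y_pred, classes=CLASSES):
    """Rows are true classes, columns are predicted classes."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[index[true_label]][index[predicted_label]] += 1
    return matrix


def cell_labels(matrix):
    """Build one text label per cell: the count and its share of the row."""
    labels = []
    for row in matrix:
        total = sum(row)
        labels.append(
            [f"{count}\n{(100 * count / total if total else 0):.1f}%" for count in row]
        )
    return labels
```

Such labels could then be passed to a heat map, for instance through the `annot` argument of seaborn's `heatmap` together with `fmt=""`.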
app.py
This file predicts the disease shown in the input image. Usage is as follows:
```
  python app.py -m <model_name> -i <test_image_index>
```
For help on usage:
```
  python app.py --help
```
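A command-line interface with the `-m` and `-i` options shown above could be defined with `argparse` roughly as follows. This is a sketch only: the long option names `--model` and `--index`, the help strings, and the model name in the example are assumptions, and the real app.py may define its arguments differently.

```python
import argparse


def build_parser():
    """Return a parser accepting the -m and -i options shown in the usage line."""
    parser = argparse.ArgumentParser(
        description="Predict the disease shown in a test image."
    )
    # The long option names below are hypothetical; only -m and -i are documented.
    parser.add_argument("-m", "--model", required=True,
                        help="name of the trained model to use")
    parser.add_argument("-i", "--index", type=int, required=True,
                        help="index of the test image")
    return parser


# Example: parse an explicit argument list instead of sys.argv.
args = build_parser().parse_args(["-m", "svm", "-i", "3"])
print(args.model, args.index)
```

Because `-i` is declared with `type=int`, a non-numeric index is rejected by the parser before any model is loaded.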

Results

The results below were obtained on the test set for the models trained in this project.

NOTE
The results are system specific. Because of differing combinations of cuDNN and NVIDIA driver versions, they can vary slightly. To the best of my knowledge, results from a reproduced environment should be close to the numbers reported here.

| Model | Accuracy (%) |
|---|---|
| Random forest | 75.35 |
| SVM | 82.89 |
| CNN - VGG16 | 93.62 |
| Ensemble - Majority voting | 98.05 |
| Ensemble - Stacked prediction | 98.23 |
| CNN - Custom | 98.76 |

Sanjana7395/Grape-disease-classification