cwong690/Birdex

Birds Collage

BIRDEX

A web-based Flask app that predicts the family group of a bird from an image using transfer learning.

Overview

The data was pulled from the Cornell Lab of Ornithology.
It is a collection of about 48,000 images covering more than 400 species of birds observed in North America. Birds are separated into male, female or juvenile, since these can look quite different. Text files containing the image file names and their corresponding labels are also included.

So why identify birds?

Bird conservation is becoming increasingly crucial as Earth changes. Birds are an indicator species because they are highly sensitive to their surroundings, and even the slightest change in temperature can alter their behaviour. Different species show different behaviours and patterns.

Due to the nature of birds, it is extremely difficult to track them all! This is why there are birders, using citizen science to help monitor them. Being able to identify the type of birds that are passing through certain areas can greatly help track their migratory patterns.

So why is bird conservation important? Check out this post by the American Bird Conservancy:

Why Bird Conservation is Important

Also, they're basically modern dinosaurs.

birdfam

This Berkeley article explains why birds are dinosaurs (and also presents the skeptical side): Are Birds Really Dinosaurs?

Data Preparation

Since there are many images, Amazon S3 came into play. The images are loaded into a bucket and stored in separate folders by bird species. For this project, 21,129 images are used, covering 39 family groups of birds.

A function retrieves the images from the S3 bucket, resizes them, converts them to arrays, and appends them to a list. This is needed because the neural network expects its input as NumPy arrays.

Load and Resize Image Code
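from io import BytesIO

import boto3
import numpy as np
from PIL import Image
from tqdm import tqdm


def resize_images_array(img_dir, folders, bucket):
    # pixel arrays and their labels, kept in the same order
    img_arrays = []
    labels = []

    # create the S3 client once instead of once per folder
    s3 = boto3.client('s3')

    # loop through the folders (one per label) so every image is paired with its label
    for folder in tqdm(folders):
        # note: list_objects_v2 returns at most 1,000 keys per call
        enter_folder = s3.list_objects_v2(Bucket=bucket, Prefix=f'{img_dir}/{folder}')
        # the first two keys under each prefix are skipped (assumed to be non-image entries)
        for i in enter_folder.get('Contents', [])[2:]:
            filepath = i['Key']
            try:
                obj = s3.get_object(Bucket=bucket, Key=filepath)
                img_bytes = BytesIO(obj['Body'].read())
                open_img = Image.open(img_bytes)
                # force 3 channels so every array has shape (299, 299, 3)
                arr = np.array(open_img.convert('RGB').resize((299, 299)))
                img_arrays.append(arr)
                labels.append(folder)
            except Exception:
                print(filepath)  # report the key of any image that fails to load
                continue

    return np.array(img_arrays), np.array(labels)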

Birds

First, a brief explanation of the taxonomic hierarchy:

taxonomic hierarchy

Every species belongs in a category of this hierarchy. It classifies species and groups them from broad categories down to extremely specific groups. For example, humans belong to the Family Hominidae, the Genus Homo and the Species sapiens. "Human" is our common name!

The exploratory data analysis began with looking at the number of species in the Order group of the birds.

order countplot

As seen in the plot, if the model were to predict the Order of the birds, there would be a huge imbalance in the dataset. Unfortunately, there are simply more birds in the Passeriformes Order (perching birds, the largest order of birds). For example, the Leptosomiformes Order contains only one type of bird: Cuckoo Rollers!

Family is the next, more specific level in the taxonomic hierarchy. The model predicts birds by family group, so a count plot of the number of species in each family group is created.

fam countplot

Family Plot Legend

The images have 3 color channels (red, green and blue), which combine to make up the colors of the full image. The shape of each image is (299, 299, 3): height, width, and the number of channels. For a greyscale image, the last value would be 1.
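As a minimal sketch of what the channel axis means, the snippet below builds a synthetic (299, 299, 3) array (a random stand-in, not an image from the dataset) and splits it into its red, green and blue planes. The greyscale conversion is a plain channel average chosen only for illustration.

```python
import numpy as np

# Synthetic stand-in for one resized RGB image: height x width x channels.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(299, 299, 3), dtype=np.uint8)

# The last axis indexes the colour channels: 0 = red, 1 = green, 2 = blue.
red, green, blue = image[..., 0], image[..., 1], image[..., 2]

# Averaging the channels gives a single-channel (greyscale-like) image.
# keepdims=True keeps the trailing axis, so the result still has 3 dimensions.
grey = image.mean(axis=-1, keepdims=True)

print(image.shape, red.shape, grey.shape)
```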

Let's check out some of the contestants within the data!

Contestant 1: Waterfowl Contestant 2: Grosbeak Contestant 3: Hawk

Here are the RGB Channels of three classes of birds seen in this dataset:


RGB images

Convolutional Neural Network

This first model was trained on a small subset (~3,000) of the total images (~40,000), mainly to test that the input features and labels were correct. Errors did occur on the very first run.

Shape of the training and testing sets:

data shapes

The model is pretty weak:

weak model metrics

This is what the CNN layers generally look like:

CNN Code

After the first awful run, a simpler model was created using only 3 types of birds: ducks, finches and hawks. This was to see whether the number of classes was causing the model to do so poorly. It was later expanded to more classes.
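Narrowing the data down to a few classes can be sketched as below. This is an illustration under assumptions, not the project's code: `select_classes` is a hypothetical helper, and the tiny arrays and label strings are made up to stand in for the image and label arrays returned by the loading function.

```python
import numpy as np

def select_classes(images, labels, keep):
    """Return only the samples whose label is in `keep`, plus integer class ids."""
    mask = np.isin(labels, keep)
    kept_labels = labels[mask]
    # np.unique sorts the class names; `ids` maps each sample to its class index.
    class_names, ids = np.unique(kept_labels, return_inverse=True)
    return images[mask], ids, class_names

# Tiny synthetic stand-in: six 2x2 RGB "images" with hypothetical label names.
images = np.zeros((6, 2, 2, 3), dtype=np.uint8)
labels = np.array(["ducks", "owls", "finches", "hawks", "ducks", "gulls"])

x_small, y_small, names = select_classes(images, labels, ["ducks", "finches", "hawks"])
print(x_small.shape, y_small, names)
```

The integer ids can then be one-hot encoded or used directly, depending on the loss function the model is compiled with.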

Simple CNN Model

CNN Model Epochs
CNN Model Accuracy/Loss Plots
CNN Model Confusion Matrix

After a few runs, it finally captured the finches!

Transfer Learning using Xception Model

Model Summary
Model Epochs

Results:

Model Accuracy/Loss Plots

The model is able to reach about 80% accuracy! However, accuracy can be misleading when the label groups are imbalanced. In this case, the classes are somewhat imbalanced because some family groups contain fewer images than others.

Therefore, precision, recall, and F1-score are calculated as more informative metrics.

model metrics


Precision: number of true positives / total number of predicted positives, TP/(TP+FP)

Recall: number of true positives / total number of actual positives, TP/(TP+FN)

F1-Score: harmonic mean of precision and recall, 2 * (precision*recall)/(precision + recall)
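The three definitions above can be computed per class directly from a confusion matrix. The sketch below assumes rows are true classes and columns are predicted classes; `per_class_metrics` is a hypothetical helper, and the 3-class matrix is made up to show how one dominant class can keep accuracy high while the smaller classes have poor recall.

```python
import numpy as np

def per_class_metrics(conf_mat):
    """Precision, recall and F1 per class from a confusion matrix.

    Rows are true classes, columns are predicted classes.
    """
    conf_mat = np.asarray(conf_mat, dtype=float)
    tp = np.diag(conf_mat)
    fp = conf_mat.sum(axis=0) - tp   # predicted as the class, but wrong
    fn = conf_mat.sum(axis=1) - tp   # belong to the class, but missed
    # Guard against division by zero for classes never predicted or never seen.
    precision = np.divide(tp, tp + fp, out=np.zeros_like(tp), where=(tp + fp) > 0)
    recall = np.divide(tp, tp + fn, out=np.zeros_like(tp), where=(tp + fn) > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=np.zeros_like(tp), where=denom > 0)
    return precision, recall, f1

# Made-up 3-class example with one dominant class (not results from the model).
cm = [[90, 5, 5],
      [ 8, 2, 0],
      [ 6, 0, 4]]
precision, recall, f1 = per_class_metrics(cm)
accuracy = np.trace(np.asarray(cm)) / np.sum(cm)
print("accuracy:", accuracy)
print("recall per class:", recall)
```

In this toy matrix the overall accuracy is driven almost entirely by the first class, while the per-class recall exposes how rarely the two small classes are found.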


Precision, Recall, F1-Score per Family Group

Confusion Matrix

Model Confusion Matrix

There are a few birds the model seems to predict poorly! Let's take a look at some of them and at which birds the model tends to confuse them with.
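One way to pull such pairs out of a confusion matrix is to normalise each row and look for the largest off-diagonal entry. The sketch below is an illustration only: `most_confused` is a hypothetical helper, and the class names and counts are invented rather than taken from the model's results.

```python
import numpy as np

def most_confused(conf_mat, class_names):
    """For each true class, report its recall and the wrong class it is most often predicted as."""
    cm = np.asarray(conf_mat, dtype=float)
    # Normalise each row so entries are fractions of that true class.
    row_totals = cm.sum(axis=1, keepdims=True)
    rates = np.divide(cm, row_totals, out=np.zeros_like(cm), where=row_totals > 0)
    report = []
    for i, name in enumerate(class_names):
        off_diag = rates[i].copy()
        off_diag[i] = -1.0            # exclude the correct prediction
        j = int(off_diag.argmax())
        report.append((name, rates[i, i], class_names[j], rates[i, j]))
    return report

# Made-up counts for three hypothetical classes (not the model's results).
names = ["sparrow", "warbler", "hawk"]
cm = [[41, 50, 9],
      [10, 85, 5],
      [ 2,  3, 95]]
for true_name, recall, wrong_name, rate in most_confused(cm, names):
    print(f"{true_name}: {recall:.0%} correct, most often mistaken for {wrong_name} ({rate:.0%})")
```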

1. New World Sparrows(4) 41% vs Wood-Warblers(2) 15%

New World Sparrow Wood-Warbler
sparrow woodwarb

New World Sparrows and Wood-Warblers have clear similarities: both are small birds with whitish bellies and brown stripes.


2. Kingfishers(35) 24% vs Tits, Chickadees, Titmice(20) 35%

Kingfisher Titmouse Chickadee
kingfisher titmouse chickadee

The model confuses quite a few Kingfishers with the other group. Kingfishers and Titmice do resemble each other in a way (a spiked-up crest, like a mohawk). Kingfishers look less like the Titmouse's relative, the Chickadee, although the color patterns do have SOME resemblance.


3. Wagtails and Pipits(33) 31% vs Nuthatches(37) 38%

Pipit Pipit 2 Nuthatch
pipit pipit2 nuthatch

The model is a bit confused here as well, labeling more Wagtails and Pipits as Nuthatches than as their actual family. Pipits and Nuthatches look quite different in the first image: Pipits are longer and have a smaller head. However, some images of Pipits could resemble a Nuthatch more closely. The backgrounds of the images are also similarly colored.


4. Cormorants and Anhingas(16) 11% vs Nuthatches(37) 48%

Anhingas Nuthatch
anhinga nuthatch

Last but not least, the family groups with the biggest discrepancy. Cormorants and Anhingas are both tall birds with long necks. Nuthatches are pretty much the opposite. There is definitely room for improvement here.


Improvements

  1. One of the issues encountered with this dataset is the imbalanced number of images per family group. As seen in the bar plot above, the number of species within each family group is not evenly spread. This can cause the model to recognize some family groups better than others.
  2. Based on the birds the model confuses, it may be relying on features of the image other than the birds themselves. SHAP or LIME can help determine which features or parts of the images the model uses to predict the family groups.
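One possible mitigation for the imbalance in item 1, not something this project is stated to have used, is to weight each class inversely to its frequency during training. The sketch below computes such weights with the common `n_samples / (n_classes * class_count)` formula; `balanced_class_weights` is a hypothetical helper and the label counts are invented.

```python
import numpy as np

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    classes, counts = np.unique(labels, return_counts=True)
    weights = len(labels) / (len(classes) * counts)
    return {int(c): float(w) for c, w in zip(classes, weights)}

# Hypothetical integer-encoded labels with a 6:3:1 imbalance.
y = np.array([0] * 60 + [1] * 30 + [2] * 10)
print(balanced_class_weights(y))
```

If the model is trained with Keras (an assumption here), a dictionary in this form can be passed as the `class_weight` argument of `Model.fit`, so that mistakes on rare family groups count for more in the loss.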

Birdex: Flask App

The web app allows users to upload images of birds and receive a prediction of which family group the bird belongs to! Other images were also tested to see what the model would predict.

Note on the design: You may have noticed an image of a duck with an oddly bright blue background on the page. I used the little fellow as a test image and got attached to them. This duck was with me through all my troubleshooting and successes, so they earned their place on my app!

Bird Flask gif

Test a person in a bird costume

hawk costume

As you can see here, that is NOT a bird. However, given that it is supposed to be a person in a hawk costume, the model is not too far off!

Test a non-bird: CAT

The model was not built to predict a label outside the 39 family groups. Thus, it predicts the family group that the image most resembles. The soft white belly and the surrounding brown fur can certainly make the model believe this cat is an owl!

Future Work

  • Better Model
  • Transfer Learning
  • SHAP/LIME
  • Clean up files
  • Object Detection
