- Overview
- Data Preparation
- Convolutional Neural Network
- Results
- Improvements
- Birdex: Flask App
- Future Work
The data was pulled from the Cornell Lab of Ornithology.
It is a collection of about 48,000 images covering more than 400 species of birds observed in North America. Birds are separated into male, female, or juvenile, since these can look quite different. Text files that contain the image file names and their corresponding labels are also included.
So why identify birds?
Bird conservation is becoming increasingly crucial as Earth changes. Birds are an indicator species because they are highly sensitive to their surroundings, and even a slight change in temperature can alter their behaviour. Different species also show different behaviours and patterns.
Due to the nature of birds, it is extremely difficult to track them all! This is why there are birders, using citizen science to help monitor them. Being able to identify the type of birds that are passing through certain areas can greatly help track their migratory patterns.
So why is bird conservation important? Check out this post by the American Bird Conservancy:
Why Bird Conservation is Important
Also, they're basically modern dinosaurs.
This Berkeley article explains why birds are dinosaurs (and also shows the skeptical side): Are Birds Really Dinosaurs?
Since there are many images, Amazon S3 is used for storage. The images are loaded into a bucket and stored in separate folders by bird species. This project uses 21,129 images, which cover 39 family groups of birds.
A function retrieves the images from the S3 bucket, resizes them, converts them to arrays, and appends them to a list. This is needed because the neural network expects numpy arrays as input.
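Load and Resize Image Code

    from io import BytesIO

    import boto3
    import numpy as np
    from PIL import Image
    from tqdm import tqdm


    def resize_images_array(img_dir, folders, bucket):
        """Load every image under img_dir/folder from S3, resized to 299x299."""
        img_arrays = []  # arrays of image pixels
        labels = []      # folder name for each image
        s3 = boto3.client('s3')  # create the client once, not once per folder
        # loop through the folders so that images and labels stay in the same order
        for folder in tqdm(folders):
            # note: list_objects_v2 returns at most 1,000 keys per call
            enter_folder = s3.list_objects_v2(Bucket=bucket, Prefix=f'{img_dir}/{folder}')
            # skip the first two listed keys, which are assumed not to be images
            for i in enter_folder['Contents'][2:]:
                filepath = i['Key']
                try:
                    obj = s3.get_object(Bucket=bucket, Key=filepath)
                    img_bytes = BytesIO(obj['Body'].read())
                    # force 3 channels so every array has the same shape
                    open_img = Image.open(img_bytes).convert('RGB')
                    arr = np.array(open_img.resize((299, 299)))
                    img_arrays.append(arr)
                    labels.append(folder)
                except Exception:
                    print(filepath)  # file path of an image that failed to load
                    continue
        return np.array(img_arrays), np.array(labels)
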
First, a brief explanation of the taxonomic hierarchy:
Every species belongs to a category at each level of this hierarchy. It classifies species and groups them from broad categories down to extremely specific ones. For example, humans belong to the Family Hominidae, the Genus Homo, and the Species sapiens. "Human" is our common name!
The exploratory data analysis began with looking at the number of species in the Order group of the birds.
As seen in the plot, if the model were to predict the Order of the birds, there would be a huge imbalance in the dataset. Unfortunately, there are simply more birds within the Passeriformes Order (perching birds, the largest order of birds). For example, the Leptosomiformes Order contains only one type of bird: Cuckoo Rollers!
Family is the next, more specific group in the taxonomic hierarchy. The model predicts birds by family group, so a count plot of the number of species in each family group is created.
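The counts behind such a plot can be sketched in a few lines. This is only an illustration: the `species_to_family` list below is a tiny hand-written stand-in, whereas the real pairs would come from the label text files shipped with the dataset, and `species_per_family` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical (species, family) pairs standing in for the dataset's label files.
species_to_family = [
    ("Mallard", "Anatidae"),
    ("Wood Duck", "Anatidae"),
    ("Red-tailed Hawk", "Accipitridae"),
]


def species_per_family(pairs):
    """Count the distinct species that fall into each family."""
    unique_pairs = set(pairs)  # guard against a species listed twice
    return Counter(family for _, family in unique_pairs)


counts = species_per_family(species_to_family)
for family, n_species in counts.most_common():
    print(family, n_species)
```

The resulting counts can then be handed to any plotting library to draw the bar chart.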
The images have 3 color channels, which together make up the colors in the full image. The shape of each image is (299, 299, 3), where the third value is the number of channels. For greyscale, it would be 1.
Let's check out some of the contestants within the data!
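To make the shape concrete, here is a minimal sketch that assumes the image is already a numpy array. The random array is a stand-in for a resized bird photo, and averaging the channels is just one simple way to get a greyscale version.

```python
import numpy as np

# Stand-in for a loaded, resized image: random pixels, not a real bird photo.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(299, 299, 3), dtype=np.uint8)

height, width, channels = image.shape

# Slice out each color channel; each one is a 2-D (299, 299) array.
red, green, blue = image[..., 0], image[..., 1], image[..., 2]

# A simple greyscale version: average the channels and keep a channel axis of 1.
grey = image.mean(axis=2, keepdims=True).astype(np.uint8)

print(image.shape, red.shape, grey.shape)
```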
Contestant 1: Waterfowl | Contestant 2: Grosbeak | Contestant 3: Hawk |
---|---|---|
Here are the RGB Channels of three classes of birds seen in this dataset:
This first model was trained on a small subset (~3,000) of the total images (~40,000). This was mainly to test that the feature and label inputs were correct. Errors did occur on the very first run.
This is what the CNN layers look like generally:
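For readers who want to see what such layers compute, the sketch below walks one toy input through a convolution, a ReLU activation, and a max-pooling step in plain numpy. It is not the architecture used in this project. The filter values, input size, and function names are arbitrary choices for illustration, and a real model would use a deep learning framework with many learned filters.

```python
import numpy as np


def conv2d(image, kernel):
    """Valid (no padding), stride-1 2-D cross-correlation of a single channel."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def relu(x):
    """Set negative activations to zero."""
    return np.maximum(x, 0)


def max_pool(x, size=2):
    """Non-overlapping max pooling; rows/columns that do not fit are dropped."""
    h, w = x.shape[0] // size, x.shape[1] // size
    trimmed = x[:h * size, :w * size]
    return trimmed.reshape(h, size, w, size).max(axis=(1, 3))


image = np.arange(36, dtype=float).reshape(6, 6)  # toy 6x6 single-channel input
kernel = np.array([[-1.0, 0.0], [0.0, 1.0]])      # arbitrary 2x2 filter
features = max_pool(relu(conv2d(image, kernel)))
print(features.shape)
```

Stacking several of these conv, activation, and pooling steps, followed by dense layers, gives the general shape of a CNN classifier.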
After the awful first run, a simpler model was created using 3 types of birds: ducks, finches, and hawks. This was to see whether the number of classes was causing the model to do so poorly. It was later expanded to more classes.
The model is able to reach about 80% accuracy! However, accuracy can be misleading when the label groups are imbalanced. In this case, the classes are somewhat imbalanced because some family groups do not contain as many images as others.
Therefore, precision, recall, and F1-score are calculated as better metrics.
Precision: number of correct positives / total number of positives predicted, TP/(TP+FP)
Recall: number of correct positives / total number of positives actual, TP/(TP+FN)
F1-Score: harmonic mean of precision and recall, 2 * (precision*recall)/(precision + recall)
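As a small worked example of these three formulas, the sketch below computes them for one class from lists of true and predicted labels. The labels are made up and deliberately imbalanced, and `precision_recall_f1` is a hypothetical helper, not code from this project.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall and F1 for one class, treating it as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    fn = sum(1 for t, p in pairs if p != positive and t == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy labels, imbalanced on purpose: 8 ducks and only 2 hawks.
y_true = ["duck"] * 8 + ["hawk"] * 2
y_pred = ["duck"] * 8 + ["duck", "hawk"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
hawk_precision, hawk_recall, hawk_f1 = precision_recall_f1(y_true, y_pred, "hawk")
print(accuracy, hawk_precision, hawk_recall, hawk_f1)
```

Here overall accuracy is 90%, yet only half of the hawks are found, which is the kind of gap that accuracy alone hides.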
There are a few birds the model seems to have predicted poorly! Let's take a look at some of them and at the birds the model tends to confuse them with.
1. New World Sparrows(4) 41% vs Wood-Warblers(2) 15%
New World Sparrow | Wood-Warbler |
---|---|
New World Sparrows and Wood-Warblers seem to have their similarities: small bird, whitish belly, and brown stripes.
2. Kingfishers(35) 24% vs Tits, Chickadees, Titmice(20) 35%
Kingfisher | Titmouse | Chickadee |
---|---|---|
The model confuses quite a few Kingfishers with the other group. Kingfishers and Titmice do resemble each other in a way (the spiked-up mohawk). However, they do not look much like the Titmouse's ally, the Chickadee, although the color patterns do have SOME resemblance.
3. Wagtails and Pipits(33) 31% vs Nuthatches(37) 38%
Pipit | Pipit 2 | Nuthatch |
---|---|---|
The model is a bit confused here as well, labeling more Wagtails and Pipits as Nuthatches than as the actual birds. Pipits and Nuthatches look quite different in the first image: Pipits are longer and have a tinier head. However, there are some images of Pipits that resemble a Nuthatch more closely. The settings the birds are in are also similarly colored.
4. Cormorants and Anhingas(16) 11% vs Nuthatches(37) 48%
Anhingas | Nuthatch |
---|---|
Last but not least, the family groups with the biggest discrepancy. The Cormorants and Anhingas are both tall birds with a long neck. The Nuthatches are pretty much the opposite. There can definitely be more improvement here.
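Pairs like these can be pulled out of a confusion matrix. The sketch below assumes the percentages above are row-normalized shares (the fraction of a true class's images given each predicted label). It uses toy integer labels rather than this project's predictions, and the helper names are hypothetical.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm


def top_confusion(cm, true_class):
    """Most common wrong label for a true class, with its share of that class."""
    row = cm[true_class].astype(float)
    shares = row / row.sum()   # assumes the class has at least one image
    shares[true_class] = -1.0  # ignore the correct predictions
    wrong = int(np.argmax(shares))
    return wrong, float(shares[wrong])


# Toy integer labels, not the project's results.
y_true = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [0, 0, 1, 1, 1, 1, 1, 2, 0, 2]

cm = confusion_matrix(y_true, y_pred, n_classes=3)
wrong_class, share = top_confusion(cm, true_class=0)
print(wrong_class, share)
```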
- One of the issues encountered with this dataset is the imbalanced number of images per family group. As seen in the bar plot above, the number of species within each family group is not evenly spread. This causes the model to recognize some species better than others.
- Based on the birds that the model is getting confused with, there might be other features of the image the model is predicting on besides the birds themselves. The usage of SHAP or LIME can help determine what features/parts of the images the model is using to predict the family groups of the birds.
The web app allows users to upload images of birds and receive a prediction on which family group the bird belongs to! Other images were tested to see what it would turn out to be.
Note on the design: You may have noticed an image of a duck with oddly bright blue colored background on the page. I used the little fellow as a test image and got attached to them. This duck was with me through all my trouble shooting and successes, so they earned their place on my app!
Test a person in a bird costume
As you can see here, that is NOT a bird. However, given it is supposed to be a person in a hawk costume, the model is not too far off!
Test a non-bird: CAT
The model was not built to predict a label outside the 39 family groups. Thus, it predicts the type of bird that the image most resembles. The soft white belly and the surrounding brown fur can certainly make the model believe this cat is an owl!
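A short sketch shows why a non-bird still gets a bird label, assuming the classifier ends in a softmax layer (the source does not state this). The three family names and the scores are invented for illustration.

```python
import numpy as np


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()


# Hypothetical subset of family groups and made-up scores for a cat photo.
families = ["Owls", "Hawks", "Waterfowl"]
logits = np.array([1.2, 0.4, 0.1])

probs = softmax(logits)
prediction = families[int(np.argmax(probs))]
print(prediction, probs)
```

Because the probabilities always sum to 1 across the known classes, taking the largest one returns some family group no matter what the image shows.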
- Better Model
- Transfer Learning
- SHAP/LIME
- Clean up files
- Object Detection