Toy Application

This is a simple exercise that demonstrates serving RESTful web APIs for a machine learning model, in this case a simple Haar cascade algorithm for object detection.

Requirements

Background:

The software team is currently focused on developing applications for machine vision related products. The applications are meant to be deployed at client sites. However, as part of marketing efforts, parallel versions of the applications are deployed on the cloud to help the company reach a larger audience and expand the client base.

The challenge:

At the heart of many machine vision solutions is a software package named OpenCV. In order to test applicant adaptability to one of many software packages used by the team, this challenge involves a toy backend application.

Create a simple toy backend that can be tested using curl or Postman and that integrates a simple OpenCV application. The API endpoint or endpoints must handle all the inputs required by the application. A link to the OpenCV application can be found here: https://www.geeksforgeeks.org/detect-an-object-with-opencv-python/

Submission Guidelines:

Package the submission as a container project.
Submit this toy application marked as a private repo in a git repository.
Be ready to show a good working demo.
Feel free to discuss any detail related to the task prior to submission.

Stretch Goals (optional):
- Expose a swagger UI of your backend endpoints.
- Design it to scale for hundreds of requests per second.

Questions:
- Please walk us through your solution and explain your approach
- Why did you take this approach?
- If you were to complete the stretch goals, what would be your approach and what steps would you take?
- What are the issues you think the team would encounter in relation to the development of machine vision products?
- How would you solve these issues?

Design

This solution uses a microservices architecture rather than a serverless one for better control and flexibility: it can run anywhere, on any cloud, and is not tied to AWS. Currently, there are only 2 containers: a REST API microservice and an NGINX microservice.

It uses the following tech stack:
- Flask - for API development
- Gunicorn - for WSGI server; refer to toy_application\restapi\src\wsgi.py
- Docker - for microservice containerization
- Docker-compose - for building and running the docker containers; refer to toy_application\docker-compose.yml
- Nginx - web server for API server, SSL offloading (and load balancing when using multiple nodes); refer to toy_application\nginx
- unittest - for API unit testing
- curl - for API system testing
- Swagger OpenAPI - for API documentation
- AWS EC2 - for running the containers, uses the Amazon Linux 2 AMI
- AWS Route53 - for routing richmondu.com to the EC2 instance
- GoDaddy - for certificates for richmondu.com

Automated build and deployment is set up using a Jenkins pipeline:
- Jenkins - for automated build and deployment (CI/CD); refer to toy_application\Jenkinsfile
- GitHub - for source code repository
- Jenkins has been set up to pull the code from GitHub and then build and deploy it to AWS EC2.

Below are the APIs:
- Upload image POST /api/v1/objectdetection/image
- Download image GET /api/v1/objectdetection/image/{id}
- Delete image DELETE /api/v1/objectdetection/image/{id}
- Download processed image GET /api/v1/objectdetection/image/{id}/processed
- Delete processed image DELETE /api/v1/objectdetection/image/{id}/processed
- To view using Swagger UI, go to https://petstore.swagger.io/, then use https://hacarustoyapplication.s3.amazonaws.com/swagger_openapi.json and click Explore.
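
The endpoints above could be wired up in Flask roughly as follows. This is a minimal sketch, not the repository's code: the in-memory dictionaries, the `detect_objects` placeholder, and the `{"status": "ok"}` response body are assumptions made for illustration, and the real service stores files on disk and runs the Haar cascade detection.

```python
import io

from flask import Flask, abort, jsonify, request, send_file

app = Flask(__name__)

BASE = "/api/v1/objectdetection/image"

# Hypothetical in-memory stores standing in for the file system.
originals = {}  # image id -> bytes of the uploaded image
processed = {}  # image id -> bytes of the image after detection


def detect_objects(data):
    """Placeholder for the detection step; returns the bytes unchanged."""
    return data


def _send(store, image_id):
    """Return the stored bytes as a JPEG response, or 404 if the id is unknown."""
    if image_id not in store:
        abort(404)
    return send_file(io.BytesIO(store[image_id]), mimetype="image/jpeg")


def _delete(store, image_id):
    """Remove the stored bytes, or return 404 if the id is unknown."""
    if store.pop(image_id, None) is None:
        abort(404)
    return jsonify(status="ok")


@app.route(BASE, methods=["POST"])
def upload_image():
    file = request.files.get("image")  # matches curl -F "image=@image.jpg"
    if file is None or not file.filename:
        abort(400)
    image_id = file.filename  # the filename doubles as the id
    data = file.read()
    originals[image_id] = data
    processed[image_id] = detect_objects(data)
    return jsonify(status="ok", id=image_id)


@app.route(BASE + "/<image_id>", methods=["GET"])
def download_image(image_id):
    return _send(originals, image_id)


@app.route(BASE + "/<image_id>", methods=["DELETE"])
def delete_image(image_id):
    return _delete(originals, image_id)


@app.route(BASE + "/<image_id>/processed", methods=["GET"])
def download_processed_image(image_id):
    return _send(processed, image_id)


@app.route(BASE + "/<image_id>/processed", methods=["DELETE"])
def delete_processed_image(image_id):
    return _delete(processed, image_id)
```

Flask's built-in `app.test_client()` can exercise these routes without starting a server, which is convenient for API unit tests.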

Testing

Data Augmentation

In order to produce more test data, it is necessary to do data augmentation. Data augmentation means deriving new images from the test images via rotation, blurring, transformation, scaling, etc. This technique is often useful in computer vision and machine learning projects as collecting data is often challenging.
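
As an illustration of the idea, the sketch below derives flipped, rotated, and blurred variants from one image using NumPy only. It is not the tool that produced the test images in this repository: the function names are hypothetical, the suffixes merely echo file names such as image__fliph.jpg, and the image is assumed to be a height x width x channels array.

```python
import numpy as np


def flip_horizontal(image):
    """Mirror the image left to right."""
    return image[:, ::-1]


def rotate_180(image):
    """Rotate the image by 180 degrees."""
    return image[::-1, ::-1]


def box_blur(image, radius=1):
    """Average each pixel with its neighbours in a (2 * radius + 1) square window."""
    size = 2 * radius + 1
    height, width = image.shape[:2]
    # Repeat the border pixels so the output keeps the input size.
    padded = np.pad(
        image.astype(np.float64),
        ((radius, radius), (radius, radius), (0, 0)),
        mode="edge",
    )
    total = np.zeros(image.shape, dtype=np.float64)
    for dy in range(size):
        for dx in range(size):
            total += padded[dy:dy + height, dx:dx + width]
    return (total / (size * size)).round().astype(image.dtype)


def augment(image):
    """Return a dict mapping a filename suffix to the derived image."""
    return {
        "fliph": flip_horizontal(image),
        "rot180": rotate_180(image),
        "blur": box_blur(image),
    }
```

Each derived image can then be saved next to the original, for example as `image__<suffix>.jpg`, and fed through the same upload and download tests.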

System testing

via Swagger Hub

Go to https://app.swaggerhub.com/apis-docs/richmondu/toy-application/1.0.0
Select the server: http://127.0.0.1:8000 (localhost) or https://richmondu.com
Test Upload image

(Click Try it out -> Click Choose File button -> Select file to upload ex. image.jpg -> Click Execute -> Response should be ok)

Test Download image

(Click Try it out -> Enter the id of an uploaded image ex. image.jpg -> Click Execute -> Response should be ok)

Test Download processed image

(Click Try it out -> Enter the id of an uploaded image ex. image.jpg -> Click Execute -> Response should be ok)

via Swagger UI (https://petstore.swagger.io/)

Copy https://hacarustoyapplication.s3.amazonaws.com/swagger_openapi.json to the Swagger UI Explore input then click on Explore button

via Swagger Editor (https://editor.swagger.io/)

Copy https://hacarustoyapplication.s3.amazonaws.com/swagger_openapi.yaml to the Swagger Editor

via test_curl_upload.bat

This uses the images in test_images\input
curl -X POST http://127.0.0.1:8000/api/v1/objectdetection/image -F "image=@image.jpg"
curl -X POST http://127.0.0.1:8000/api/v1/objectdetection/image -F "image=@image__blur4.0.jpg"
curl -X POST http://127.0.0.1:8000/api/v1/objectdetection/image -F "image=@image__fliph.jpg"
curl -X POST http://127.0.0.1:8000/api/v1/objectdetection/image -F "image=@image__rot180.jpg"
curl -X POST http://127.0.0.1:8000/api/v1/objectdetection/image -F "image=@image__zoom200_0_300_300.jpg"

via test_curl_download.bat

This downloads the processed images to test_images\output
curl -o image.jpg http://127.0.0.1:8000/api/v1/objectdetection/image/image.jpg/processed
curl -o image__blur4.jpg http://127.0.0.1:8000/api/v1/objectdetection/image/image__blur4.jpg/processed
curl -o image__fliph.jpg http://127.0.0.1:8000/api/v1/objectdetection/image/image__fliph.jpg/processed
curl -o image__rot180.jpg http://127.0.0.1:8000/api/v1/objectdetection/image/image__rot180.jpg/processed
curl -o image__zoom200_0_300_300.jpg http://127.0.0.1:8000/api/v1/objectdetection/image/image__zoom200_0_300_300.jpg/processed

Unit testing

test.py

- Tests the logic
- Uses test_images/input/*
  - Augmentation was done on image.jpg to produce several images (rotated, blurred, flipped, transformed, etc.)
- Outputs results to test_images/output/*
  - Same filenames as in the input
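
A test along these lines could look like the sketch below. It is not the repository's test.py: `process_image` is a hypothetical stand-in for the detection logic, and temporary directories replace test_images/input and test_images/output so the example is self-contained.

```python
import tempfile
import unittest
from pathlib import Path


def process_image(data):
    """Hypothetical stand-in for the detection step; returns the bytes unchanged."""
    return data


def process_directory(input_dir, output_dir):
    """Process every .jpg in input_dir and write it to output_dir under the same name."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(Path(input_dir).glob("*.jpg")):
        dst = output_dir / src.name
        dst.write_bytes(process_image(src.read_bytes()))
        written.append(dst.name)
    return written


class ProcessDirectoryTest(unittest.TestCase):
    def test_output_keeps_input_filenames(self):
        with tempfile.TemporaryDirectory() as tmp:
            input_dir = Path(tmp) / "input"
            output_dir = Path(tmp) / "output"
            input_dir.mkdir()
            for name in ("image.jpg", "image__fliph.jpg"):
                (input_dir / name).write_bytes(b"fake image bytes")

            written = process_directory(input_dir, output_dir)

            self.assertEqual(written, ["image.jpg", "image__fliph.jpg"])
            self.assertEqual(sorted(p.name for p in output_dir.iterdir()), written)
```

Saved as a module, this can be run with `python -m unittest`.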

test_api.py

Test the APIs

Points for improvement:

File name conflicts

Generating ids, instead of using the filename as the id, will prevent conflicts between multiple users. Adding user sessions will also prevent that issue from occurring. Saving the src and dst paths of images in a database will also fix that problem.
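
A minimal sketch of generated ids, assuming a UUID per upload and a plain dictionary in place of the database. The names `new_image_id`, `register_upload`, and `registry`, as well as the `uploads` directory layout, are hypothetical.

```python
import uuid
from pathlib import Path

# Hypothetical registry standing in for a database table.
registry = {}


def new_image_id(filename):
    """Return a unique id that keeps the original file extension."""
    suffix = Path(filename).suffix.lower()
    return uuid.uuid4().hex + suffix


def register_upload(filename, storage_dir="uploads"):
    """Record the src and dst paths of an upload under a freshly generated id."""
    image_id = new_image_id(filename)
    registry[image_id] = {
        "original_name": filename,
        "src": str(Path(storage_dir) / "input" / image_id),
        "dst": str(Path(storage_dir) / "output" / image_id),
    }
    return image_id
```

Two users uploading image.jpg would then receive different ids, and the original name remains available from the registry for display.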

File storage

Store data in Amazon S3 instead of the local file system. Currently, everything is stored in the file system.

Performance

Use FastAPI with Uvicorn (instead of Flask with Gunicorn) for better performance, with async/await for concurrency. Caching processed images in a Redis database will also help, so that images that were already processed do not need to be processed again.
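
The caching idea can be sketched as follows, keyed on a hash of the image content so that identical uploads are processed only once. This is an illustration under assumptions: a plain dictionary stands in for Redis, and the class name `ProcessedImageCache` is hypothetical.

```python
import hashlib


class ProcessedImageCache:
    """Cache processed images by the SHA-256 digest of the input bytes."""

    def __init__(self, process):
        self._process = process  # function that turns input bytes into processed bytes
        self._store = {}         # stand-in for a Redis database
        self.misses = 0          # number of times the image had to be processed

    def get(self, data):
        key = hashlib.sha256(data).hexdigest()
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._process(data)
        return self._store[key]
```

Keying on content rather than filename also means a renamed copy of the same image still hits the cache.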

Reliability and robustness

Move the actual detection into a separate container microservice to handle big files that may require more time to process, and use a message broker like RabbitMQ to pass information between the services. For this demo, adding bounding boxes to an image takes less than 35 milliseconds, so adding a broker is currently overkill, but it will be needed when the requirements become more complex.
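
The producer and worker pattern behind this suggestion can be sketched with the standard library. Here `queue.Queue` and a thread stand in for RabbitMQ and the separate detection container, and `detect`, `submit`, and `start_worker` are hypothetical names, so this shows only the shape of the design.

```python
import queue
import threading

jobs = queue.Queue()  # stand-in for a RabbitMQ queue
results = {}          # image id -> processed bytes


def detect(data):
    """Placeholder for the detection step; returns the bytes unchanged."""
    return data


def worker():
    """Consume jobs until a None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:
            jobs.task_done()
            break
        image_id, data = job
        results[image_id] = detect(data)
        jobs.task_done()


def submit(image_id, data):
    """Called by the API side: enqueue the image and return immediately."""
    jobs.put((image_id, data))


def start_worker():
    """Start the detection worker in a background thread."""
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

With this split, the upload endpoint only enqueues the image and responds, while the client fetches the processed image once the worker has finished.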

Scalability and High-availability

Use an AWS Elastic Load Balancer (ELB) that points to an Auto Scaling Group (ASG) of more than one EC2 instance spread across multiple availability zones (multi-AZ) for scalability and high availability. Docker Swarm or Kubernetes can be used alternatively, but this is not recommended for now since there are only 2 containers.
