yuri is a program that analyzes an image to find the objects in it and the color of each one. It does this with the Mask-RCNN object detection method. For each detected object, Mask-RCNN outputs a mask, the object's location, and its label. The mask (a 4D array) is then passed to a function that uses the KMeans method to determine the most prominent color within the mask. That color is then used to determine the name of the color.
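The color step described above can be sketched roughly as follows. This is a minimal illustration and not yuri's actual code: it uses only the standard library with a small hand-written k-means loop, whereas the project presumably calls a library KMeans implementation. It also assumes a simplified 2D boolean mask rather than the 4D array Mask-RCNN returns. The names `masked_pixels`, `dominant_color`, `color_name`, and the `NAMED_COLORS` palette are all hypothetical.

```python
import random

# Hypothetical palette; a real one would hold many more named colors.
NAMED_COLORS = {
    "black": (0, 0, 0),
    "white": (255, 255, 255),
    "red": (255, 0, 0),
    "green": (0, 255, 0),
    "blue": (0, 0, 255),
}


def squared_distance(a, b):
    """Squared Euclidean distance between two RGB triples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def masked_pixels(image, mask):
    """Collect the RGB pixels of `image` at positions where `mask` is True."""
    return [
        pixel
        for image_row, mask_row in zip(image, mask)
        for pixel, inside in zip(image_row, mask_row)
        if inside
    ]


def dominant_color(pixels, k=3, iterations=10, seed=0):
    """Return the center of the largest k-means cluster among `pixels`."""
    if not pixels:
        raise ValueError("no pixels inside the mask")
    # Seed the centers with distinct colors so no two centers coincide.
    unique = sorted(set(pixels))
    k = min(k, len(unique))
    centers = random.Random(seed).sample(unique, k)
    for _ in range(iterations):
        # Assignment step: attach every pixel to its nearest center.
        clusters = [[] for _ in range(k)]
        for pixel in pixels:
            nearest = min(range(k), key=lambda i: squared_distance(pixel, centers[i]))
            clusters[nearest].append(pixel)
        # Update step: move each center to the mean of its members;
        # a center with no members stays where it is.
        centers = [
            tuple(sum(channel) / len(members) for channel in zip(*members))
            if members else centers[i]
            for i, members in enumerate(clusters)
        ]
    largest = max(range(k), key=lambda i: len(clusters[i]))
    return tuple(round(value) for value in centers[largest])


def color_name(rgb):
    """Return the palette name closest to `rgb`."""
    return min(NAMED_COLORS, key=lambda name: squared_distance(rgb, NAMED_COLORS[name]))


# Tiny 2x3 "image": the mask covers three red pixels and one blue pixel.
image = [
    [(250, 0, 0), (250, 0, 0), (0, 0, 250)],
    [(250, 0, 0), (0, 250, 0), (0, 250, 0)],
]
mask = [
    [True, True, True],
    [True, False, False],
]
color = dominant_color(masked_pixels(image, mask), k=2)
name = color_name(color)
```

Taking the largest cluster rather than the plain average keeps a minority color (the single blue pixel here) from tinting the result.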
yuri is implemented as a Flask server. Each object detection job runs asynchronously, so the server can handle several jobs at the same time.
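The idea of running each job asynchronously can be illustrated with a short sketch. It assumes a thread pool from the standard library and leaves Flask out entirely; `detect_objects` and `submit_job` are hypothetical stand-ins, and yuri's own scheduling code may look quite different.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# A shared pool lets several detection jobs run at the same time.
executor = ThreadPoolExecutor(max_workers=4)


def detect_objects(path):
    """Hypothetical stand-in for a slow Mask-RCNN detection job."""
    time.sleep(0.05)  # simulate the time spent on inference
    return f"{path}: done"


def submit_job(path):
    """Start a job in the background and return immediately with a Future."""
    return executor.submit(detect_objects, path)


# Three uploads are submitted back to back; none waits for the others to finish.
futures = [submit_job(path) for path in ["a.jpg", "b.jpg", "c.jpg"]]
results = [future.result() for future in futures]
executor.shutdown()
```

In a web server, the request handler would call something like `submit_job` and return right away, which is what keeps one long detection from blocking other uploads.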
yuri is designed with ease of use in mind. There are a few steps to running yuri on your system. NOTE: yuri requires a relatively powerful CPU or GPU to work. In addition, it requires at least 8 GB of RAM to function.
- Install dependencies. The dependencies are listed in requirements.txt. Once they are installed, there should not be any issues.
- Configure the preferred computation target. yuri is preconfigured to run on the CPU. If you want the GPU to do the computation instead, change this line in src/mask_rcnn.py:
      self.net.setPreferableTarget(cv.dnn.DNN_TARGET_CPU)
  to:
      self.net.setPreferableTarget(cv.dnn.DNN_TARGET_OPENCL)
- Launch the Flask server. Open a terminal and, in the yuri folder, run:
      python src/controller.py
  This starts a Flask server that you can view at localhost:5000. To change the host, go to line 105 of controller.py and, in app.run(), add the optional parameter host, for example host="255.255.255.255".
- Use yuri. Once you have navigated to the website, click the "Drop file here" box and select an image or video. The two dropdown menus let you select which objects yuri detects. Finally, pressing "Upload" launches the object detection job in an asynchronous thread. When the job is complete, you are redirected to the resulting image.
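As an alternative to editing src/mask_rcnn.py by hand each time, the computation target could be chosen from a setting. The sketch below is only an illustration of that idea and is not part of yuri: the `YURI_TARGET` environment variable and the `preferred_target_name` helper are hypothetical. The two constant names it returns are the ones quoted in the configuration step.

```python
import os

# Maps a user-facing setting to the name of the cv.dnn target constant.
TARGET_NAMES = {
    "cpu": "DNN_TARGET_CPU",
    "gpu": "DNN_TARGET_OPENCL",
}


def preferred_target_name(env=os.environ):
    """Return the cv.dnn constant name selected by YURI_TARGET (default: cpu)."""
    choice = env.get("YURI_TARGET", "cpu").lower()
    if choice not in TARGET_NAMES:
        raise ValueError(f"YURI_TARGET must be 'cpu' or 'gpu', got {choice!r}")
    return TARGET_NAMES[choice]


# Inside src/mask_rcnn.py, the hard-coded line could then become:
#   self.net.setPreferableTarget(getattr(cv.dnn, preferred_target_name()))
```

With this in place, the GPU would be selected by setting `YURI_TARGET=gpu` before launching the server, and leaving the variable unset would keep the CPU default.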
- David Gurevich - Team Lead/Machine Learning Engineer
- Kenan Liu - Back-end Software Engineer
- AJ Heft - Project Manager
- Daniel Madan - Web Developer
This project is licensed under the GNU Lesser General Public License v3.0.