This repository has been archived by the owner on Jan 18, 2019. It is now read-only.

My final project of EE208. HTML search and face search.


izackwu/QianMo


QianMo

What's this?

QianMo is my final project for the course EE208. It is a search engine for certain information about SJTU, and it supports two types of search: HTML and face. For HTML search, you enter a query string and get related web pages; for face search, you upload a photo of someone and get other photos of the same person.

How does it work?

The website is built with web.py.

For HTML search, this project uses ElasticSearch to index and retrieve web pages. First we extract information such as the title and content from the HTML files and index it with ElasticSearch. Then, given a query string, ElasticSearch handles (almost) everything for us.
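As a rough illustration of the extraction step, here is a minimal sketch that pulls the title and visible text out of an HTML page using only the Python standard library. The names `PageExtractor` and `html_to_document`, and the `url`/`title`/`content` field names, are hypothetical and not taken from this project; the real extraction code may work quite differently.

```python
from html.parser import HTMLParser


class PageExtractor(HTMLParser):
    """Collects the <title> text and the visible text of an HTML page."""

    # Text inside these tags is not visible content, so it is skipped.
    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.title_parts = []
        self.content_parts = []
        self._in_title = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if not text or self._skip_depth:
            return
        if self._in_title:
            self.title_parts.append(text)
        else:
            self.content_parts.append(text)


def html_to_document(html, url):
    """Return a dict with hypothetical fields that could be indexed."""
    parser = PageExtractor()
    parser.feed(html)
    parser.close()
    return {
        "url": url,
        "title": " ".join(parser.title_parts),
        "content": " ".join(parser.content_parts),
    }
```

A dictionary like the one returned here could then be handed to an ElasticSearch client as the document body for indexing, after which full-text queries over `title` and `content` are left to ElasticSearch.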

For face search, this project uses OpenCV, MTCNN and FaceNet. More concretely, OpenCV is used to roughly filter out images that don't contain any face. After that, MTCNN detects and crops the faces in every remaining image, and FaceNet embeds each face into a 512-dimensional vector. To search, we compute the feature vector of the given image and run a brute-force search to find the faces most similar to it.
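The brute-force step can be sketched as below. This is only an illustration under assumptions: it uses Euclidean distance as the similarity measure (the text above does not say which metric is used), and the function name `nearest_faces` and the dictionary layout of the gallery are hypothetical. The embeddings are assumed to have been computed beforehand.

```python
import math


def nearest_faces(query, gallery, k=5):
    """Return the k gallery entries closest to the query embedding.

    query   -- embedding of the uploaded face (a sequence of floats)
    gallery -- dict mapping an image name to its precomputed embedding
    The distance metric (Euclidean) is an assumption of this sketch.
    """
    # Compare the query against every stored embedding: O(N) per search.
    scored = [(math.dist(query, vec), name) for name, vec in gallery.items()]
    scored.sort()  # smallest distance first
    return [(name, dist) for dist, name in scored[:k]]
```

For example, with toy 3-dimensional vectors standing in for the 512-dimensional embeddings:

```python
gallery = {
    "a.jpg": [0.0, 0.0, 1.0],
    "b.jpg": [1.0, 0.0, 0.0],
    "c.jpg": [0.1, 0.0, 0.9],
}
matches = nearest_faces([0.0, 0.0, 1.0], gallery, k=2)
```

A linear scan like this is simple and exact, and its cost grows in proportion to the number of indexed faces.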

By the way, when handling image uploading, I use Uppy, which is really awesome.

Where are the data from?

All data were collected from several SJTU websites with a crawler.

What does it look like?

Here are some screenshots.

HTML search

Face search

More details?

You can find more details in my project report, which is in Chinese.