What's this?

QianMo is my final project for the course EE208. It is a search engine for certain information about SJTU, and it supports two types of search: HTML and face. For HTML search, you enter a query string and get related web pages; for face search, you upload a photo of a person and get other photos of the same person.

How does it work?

The website is built with web.py.

For HTML search, this project uses ElasticSearch to index and retrieve web pages. First, we extract information such as the title and content from the HTML files and index it with ElasticSearch. Then, given a query string, ElasticSearch handles (almost) everything for us.
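
The extraction step might look roughly like the sketch below. It is not the project's actual preprocessor: it uses only Python's standard `html.parser`, and the names `PageExtractor` and `html_to_document`, as well as the `url`/`title`/`content` field layout, are assumptions made for illustration.

```python
from html.parser import HTMLParser


class PageExtractor(HTMLParser):
    """Collect the <title> text and the visible text of an HTML page."""

    # Text inside these tags is not visible content, so it is ignored.
    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.title_parts = []
        self.content_parts = []
        self._in_title = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._in_title:
            self.title_parts.append(text)
        elif self._skip_depth == 0:
            self.content_parts.append(text)


def html_to_document(html, url):
    """Build a dictionary of fields to index (hypothetical schema)."""
    parser = PageExtractor()
    parser.feed(html)
    parser.close()
    return {
        "url": url,
        "title": " ".join(parser.title_parts),
        "content": " ".join(parser.content_parts),
    }
```

A dictionary like this is the kind of document that could then be handed to an ElasticSearch client for indexing; the exact index name and mapping are up to the project.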

For face search, this project basically uses OpenCV, MTCNN and FaceNet. More concretely, OpenCV is used to roughly filter out images that don't contain any face. After that, we use MTCNN to detect and crop the faces in every image, and FaceNet to embed each face into a 512-dimensional vector. To search, we compute the feature vector of the given image and run a brute-force search to find the faces most similar to it.
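
The brute-force search step can be sketched as follows. This is only an illustration: it assumes the embeddings are already computed and compared by Euclidean distance (the metric is an assumption, not something stated above), and the names `brute_force_search` and `gallery` are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force_search(query, gallery, top_k=5):
    """Return the top_k (distance, image_id) pairs closest to the query.

    gallery maps a hypothetical image identifier to its embedding.
    Every stored embedding is compared with the query, so the cost
    grows linearly with the number of indexed faces.
    """
    scored = [
        (euclidean_distance(query, embedding), image_id)
        for image_id, embedding in gallery.items()
    ]
    scored.sort(key=lambda pair: pair[0])
    return scored[:top_k]
```

The example works for vectors of any length, including the 512-dimensional ones mentioned above. A linear scan is simple and exact, which is usually acceptable for a small collection of faces.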

By the way, image uploading is handled with Uppy, which is really awesome.

Where are the data from?

All data were collected from several SJTU websites with a crawler.

What does it look like?

Here are some screenshots.

HTML search

Face search

More details?

You can find more details in my project report, which is in Chinese.

