Building a search engine from the basics. I've always wanted to do this.

ChengSashankh/search

Deeno Search Engine

The search project I've always wanted to work on.

Deeno is a search engine project that I am just beginning to undertake. The goal is simply to understand what it takes to build search, so I'm building everything from scratch: the UI, the microservices, and the data pipelines.

When it is complete, I picture it offering search over the entire Wikipedia dataset, powered by microservices and Spark jobs written from scratch.

This project is built to have three components:

  1. Web interface built with Angular 15.
  2. Microservices built with the Spring framework.
  3. Indexers built using Apache Spark that update a Redis cluster.

I work on this when I have time off from classes and work, so it can get quiet here at times, but it's one step at a time.

The plan:

  1. Deploy a simple inverted-index-based retrieval system to the cloud.
  2. Graduate to ranked retrieval.
  3. Move to vector space retrieval using deep learning representations.
  4. Integrate question answering ability using language models.
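
To make the first two steps concrete, here is a minimal sketch of an in-memory inverted index with boolean (AND) retrieval and a simple TF-IDF ranking. It is only an illustration under assumptions, not the project's code: the function names, the toy corpus, and the scoring formula are all hypothetical, and a plain Python dict stands in for the Redis cluster.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to {doc_id: term frequency in that document}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][doc_id] = count
    return index


def boolean_search(index, query):
    """Step 1: return the ids of documents containing every query term."""
    terms = tokenize(query)
    if not terms:
        return set()
    postings = [set(index.get(term, {})) for term in terms]
    return set.intersection(*postings)


def ranked_search(index, query, num_docs):
    """Step 2: score documents by summed TF-IDF, highest score first."""
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue  # unknown terms contribute nothing
        idf = math.log(num_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    # Break ties by document id so the ordering is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical toy corpus standing in for Wikipedia articles.
DOCS = {
    1: "Redis is an in-memory key-value store",
    2: "Apache Spark runs distributed data pipelines",
    3: "Spark jobs can write an index to a Redis store",
}
INDEX = build_index(DOCS)
```

Calling `boolean_search(INDEX, "redis store")` returns the set of matching document ids, while `ranked_search(INDEX, "spark pipelines", len(DOCS))` returns `(doc_id, score)` pairs ordered by relevance. In a real deployment the postings would presumably be written by the Spark indexers and read by the microservices, which this sketch does not attempt to model.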

I'm at step 1 now, and once the infrastructure is up and running, things should really accelerate. Stay tuned!

Setting up the infrastructure

I've now started configuring the infrastructure for this project on Google Cloud as follows:

Deeno Architecture.png

So I can now build individual containers like so:

gcloud builds submit --tag [IMAGE] /Users/cksash/Documents/proj/search/api/flask-aisearch

I can run them individually, if I wish, like so:

gcloud run deploy flask-aisearch --image [IMAGE]

And I can run the entire project in the correct order (defined by dependencies) like so:

gcloud run services replace service.yaml
