techx/pigeon

Pigeon

Overview • Setup • Development

Pigeon is HackMIT's RAG (retrieval-augmented generation) email assistant. It helps automate the help email workflow.

Overview

Pigeon uses Flask for its backend and React for its frontend. It is built on an in-memory Redis database that stores embedding data for documents and emails, with Postgres for longer-term storage. The emailing service is automated with AWS; see the AWS section below for details.
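To make the role of the embedding data concrete, here is a minimal sketch of the retrieval step in a RAG workflow: rank stored document embeddings by cosine similarity to a query embedding. It is an illustration only, not Pigeon's implementation. The dictionary stands in for Redis, and the document IDs, the toy 3-dimensional vectors, and the function names are all assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_embedding, stored, k=2):
    """Return the k document IDs whose embeddings are closest to the query."""
    ranked = sorted(
        stored,
        key=lambda doc_id: cosine_similarity(query_embedding, stored[doc_id]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical data: real embeddings would come from an embedding model
# and be stored in Redis rather than in a Python dict.
stored_embeddings = {
    "faq:parking": [0.9, 0.1, 0.0],
    "faq:travel": [0.7, 0.3, 0.1],
    "faq:food": [0.0, 0.2, 0.9],
}
query = [1.0, 0.0, 0.0]
closest = top_k(query, stored_embeddings, k=2)
```

The closest documents would then be passed to the language model as context when drafting a reply.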

Directory Overview

pigeon/
├── .devcontainer                 # Dev container configuration
├── README.md
├── requirements.txt              # Python dependencies list for backend
├── .env                          # Stores secrets that are not in VCS
├── wsgi.py                       # Flask entry point
├── scripts
│   ├── deploy.sh                 # Deploy to production server
│   └── devcontainer_setup.sh     # Run once when the dev container initializes
├── client
│   ├── public
│   │   └── pigeon.png
│   └── src
│       ├── App.tsx
│       ├── main.css
│       ├── main.tsx
│       ├── routes
│       │   ├── documents.module.css
│       │   ├── documents.tsx
│       │   ├── inbox.module.css
│       │   ├── inbox.tsx
│       │   ├── index.module.css
│       │   └── index.tsx
│       ├── shell.tsx
│       └── vite-env.d.ts
└── server
    ├── config.py
    ├── controllers
    │   ├── admin.py
    │   ├── emails.py
    │   └── faq.py
    ├── email_template
    │   └── template.html
    ├── models
    │   ├── document.py
    │   ├── email.py
    │   ├── response.py
    │   └── thread.py
    └── nlp
        ├── corpus_harvard.json
        ├── corpus_mit.json
        ├── embeddings.json
        ├── embeddings.py
        └── responses.py

Requirements

Docker. See setup instructions below.

Setup

Env

Copy .env.sample to .env and fill in the necessary values. You should be able to find them on Slack.
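If you want to confirm that every value has been filled in, a small check along these lines may help. It is a sketch that assumes both files use plain KEY=VALUE lines; the key names in the example are hypothetical placeholders, since the real ones are listed in .env.sample.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, ignoring blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def unfilled_keys(sample_text, env_text):
    """Keys listed in the sample that are missing or empty in the real file."""
    env = parse_env(env_text)
    return [key for key in parse_env(sample_text) if not env.get(key)]


# Hypothetical contents; in practice, read .env.sample and .env from disk.
sample = "EXAMPLE_API_KEY=\nEXAMPLE_DB_URL=\n"
env = "# local secrets\nEXAMPLE_API_KEY=abc123\n"
missing = unfilled_keys(sample, env)
```

An empty `missing` list means every key from the sample has a non-empty value in .env.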

Docker

Using Dev Containers is strongly recommended when doing development work on Pigeon. Containers provision identical environments from computer to computer and remove much of the headache of installing dependencies, setting up Postgres, and so on.

To use Dev Containers, you must first install Visual Studio Code and Docker. Then install the Dev Containers extension (formerly Remote - Containers) for Visual Studio Code.

Install Docker by following the official installation instructions. To check that your installation is working, run

docker run hello-world

If you get a message on your screen that starts with "Hello from Docker!", your installation is working.

After that, follow this tutorial to get your environment set up. Make sure you open this repository as a dev container in VSCode.

Note: It can take a few minutes to provision the container if it is your first time starting it up.

Development

To start the server, run

python3 wsgi.py

To start the client, in a different terminal, run

cd client
npm run dev

The Postgres and Redis services run on the same Docker network as the dev container and are reachable there only by their designated service names, which are database and redis respectively. To connect to these services from inside the dev container, use the following commands:

# postgres
PGPASSWORD='password' psql -h database -U postgres

# redis
redis-cli -h redis

Inside redis-cli, list all Redis keys with

keys *
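The same service names apply when connecting from Python code inside the dev container. The sketch below builds connection URLs from the values used in the commands above. The ports (5432 and 6379) are the Postgres and Redis defaults and are assumed here, as is the postgres database name, which is what psql falls back to for the postgres user. The helper names are hypothetical.

```python
from urllib.parse import quote


def postgres_url(host="database", user="postgres", password="password",
                 port=5432, dbname="postgres"):
    """Build a Postgres URL; user and password are percent-encoded."""
    credentials = f"{quote(user, safe='')}:{quote(password, safe='')}"
    return f"postgresql://{credentials}@{host}:{port}/{dbname}"


def redis_url(host="redis", port=6379, db=0):
    """Build a Redis URL for the given host, port, and database index."""
    return f"redis://{host}:{port}/{db}"


pg = postgres_url()
rd = redis_url()
```

URLs of this form are accepted by common clients, for example SQLAlchemy's `create_engine` and redis-py's `Redis.from_url`, if those happen to be the libraries in use.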

Alternatively, you can access these services from your local machine, i.e., outside of the dev container, by connecting directly to the Docker containers. To do this, run

docker container ls

and note the container ID or name of your desired instance (e.g., pigeon_devcontainer-database-1). Then, run

docker exec -it <container_id> /bin/bash

to enter the container, from which you should be able to run psql or redis-cli directly.

Installing Python Dependencies

Put all direct dependencies (i.e., packages we directly import) in requirements.in. pigar can be used to automate part of this process. Then, run

pip-compile

to generate requirements.txt, which contains pinned versions of subdependencies as well.

AWS

All emails are forwarded to Pigeon through AWS. More specifically, incoming emails are matched by an SES receipt rule and stored in an S3 bucket; a Lambda function then processes them and forwards them to the API. Both receiving and sending are handled by SES.
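As a rough illustration of the Lambda step, the sketch below takes an S3 event, parses the stored raw email, and hands a small payload to the API. It is not the deployed function. The payload fields and the `fetch_object` and `post_to_api` hooks are hypothetical and are injected so the example runs without AWS. A real Lambda would more likely read the object with boto3's `get_object` and send the payload with an HTTP client, to an endpoint this README does not specify.

```python
import email
from email import policy
from urllib.parse import unquote_plus


def s3_location(record):
    """Extract (bucket, key) from one S3 event record."""
    s3 = record["s3"]
    # Object keys arrive URL-encoded in S3 event notifications.
    return s3["bucket"]["name"], unquote_plus(s3["object"]["key"])


def email_payload(raw_bytes):
    """Parse a raw MIME message into a small JSON-friendly dict."""
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    body = msg.get_body(preferencelist=("plain", "html"))
    return {
        "from": str(msg["From"] or ""),
        "subject": str(msg["Subject"] or ""),
        "message_id": str(msg["Message-ID"] or ""),
        "body": body.get_content() if body is not None else "",
    }


def handle(event, fetch_object, post_to_api):
    """Process every record in an S3 event and return how many were handled.

    fetch_object(bucket, key) -> bytes and post_to_api(payload) are
    hypothetical hooks standing in for S3 access and the API call.
    """
    count = 0
    for record in event["Records"]:
        bucket, key = s3_location(record)
        post_to_api(email_payload(fetch_object(bucket, key)))
        count += 1
    return count


# Example run with fake S3 and API hooks (all values are made up).
raw = (b"From: hacker@example.com\r\nSubject: Travel question\r\n"
       b"Message-ID: <1@example.com>\r\n\r\nIs parking available?\r\n")
event = {"Records": [{"s3": {"bucket": {"name": "example-bucket"},
                             "object": {"key": "inbound/msg+1"}}}]}
sent = []
handled = handle(event, lambda bucket, key: raw, sent.append)
```

Injecting the two hooks keeps the parsing logic testable on its own, separate from AWS credentials and network access.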

For instructions on setting up locally, see go/pigeon-aws.