
dannyshmueli/bringg-help-gpt


This is a Next.js project bootstrapped with create-next-app.

Getting Started

First, create a new .env file from .env.example and add your OpenAI API key (available from your OpenAI account's API keys page).

cp .env.example .env
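The resulting .env file holds the key. The exact variable name comes from .env.example; OPENAI_API_KEY is the conventional name and is shown here as an assumption:

```
# .env — variable name assumed to match .env.example
OPENAI_API_KEY=sk-...
```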

Prerequisites

  • Node.js (v16 or higher)
  • Yarn
  • wget (on macOS, you can install this with brew install wget)

Next, we'll need to load our data source.

Data Ingestion

Data ingestion happens in two steps.

First, you should run

sh download.sh

This will download all of the pages under https://help.bringg.com/v1/docs/.
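The script presumably mirrors those pages with wget (which is why wget is a prerequisite). A minimal sketch of such a mirror command, where the flags are an assumption and not necessarily the script's exact invocation:

```shell
# Hypothetical mirror command — check download.sh for the actual flags used.
wget --recursive --no-parent --level=2 \
     --directory-prefix=. \
     https://help.bringg.com/v1/docs/
```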

Next, install dependencies:

yarn

Before ingesting the data, consider cleaning up the downloaded pages by converting the HTML to plain text:

yarn cleanu-data && find help.bringg.com -type f \( -name "*[^.]*" ! -name "*.*" \) -exec rm {} \;
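The find expression above deletes files whose names contain no dot, i.e. the raw extension-less pages, leaving only files with an extension. A small isolated demonstration of that behavior (using a temp directory rather than help.bringg.com):

```shell
# Demonstrate the find expression on throwaway files:
# "page-one" (no extension) is deleted, "page-two.txt" is kept.
tmp=$(mktemp -d)
touch "$tmp/page-one" "$tmp/page-two.txt"
find "$tmp" -type f \( -name "*[^.]*" ! -name "*.*" \) -exec rm {} \;
ls "$tmp"   # only page-two.txt remains
rm -rf "$tmp"
```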

and then run the ingestion script:

yarn ingest

This can take around 5 minutes. Consider cleaning up the help.bringg.com folder beforehand.

Note: If on Node v16, use NODE_OPTIONS='--experimental-fetch' yarn ingest

This will parse the data, split text, create embeddings, store them in a vectorstore, and then save it to the data/ directory.
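The "split text" stage can be sketched as a fixed-size chunker with overlap, so that embeddings preserve context across chunk boundaries. This is a simplified stand-in for the splitter the ingest script actually uses, and the size/overlap values are illustrative, not the repo's settings:

```typescript
// Simplified stand-in for the text-splitting stage of `yarn ingest`.
// chunkSize/overlap are illustrative defaults, not the repo's settings.
function splitText(text: string, chunkSize = 1000, overlap = 200): string[] {
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break;
    // step forward by less than chunkSize so consecutive chunks overlap
    start += chunkSize - overlap;
  }
  return chunks;
}

const doc = "a".repeat(2500);
const chunks = splitText(doc);
console.log(chunks.length); // 3 chunks: [0,1000), [800,1800), [1600,2500)
```

Each chunk is then embedded and written to the vectorstore, which is what ends up in data/.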

We save it to a directory because we only want to run the (expensive) data ingestion process once.

The Next.js server relies on the presence of the data/ directory. Please make sure to run this before moving on to the next step.

Running the Server

Then, run the development server:

yarn dev

Open http://localhost:3000 with your browser to see the result.

Deploying the server

To deploy your own server on Fly, you can use the provided fly.toml and Dockerfile as a starting point.

Note: As a Next.js app, Vercel seems like a natural place to host this site. Unfortunately, there are limitations to secure WebSockets using ws with Next.js, which require a custom server that cannot be hosted on Vercel. Even with server-sent events, Vercel's serverless functions appear to prohibit streaming responses (e.g. see here).

Inspirations

This repo borrows heavily from

How To Run on Your Example

If you'd like to chat with your own data, you need to:

  1. Set up your own ingestion pipeline, and create a similar data/ directory with a vectorstore in it.
  2. Change the prompt used in pages/api/util.ts. Right now it tells the chatbot to respond only to questions about LangChain, so to get it to work on your data you'll need to update it accordingly.

The server should work just the same 😄
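For step 2 above, a replacement prompt might look something like the sketch below. The wording, the "Acme" product name, and the {context}/{question} placeholder names are all hypothetical; match whatever template variables the existing prompt in pages/api/util.ts actually uses:

```typescript
// Hypothetical replacement prompt for pages/api/util.ts, retargeted
// from LangChain questions to your own data source.
// {context} and {question} are placeholder names — match the template
// variables used by the existing prompt.
const QA_PROMPT = `You are a helpful assistant for Acme's documentation.
Answer the question using only the context below. If the answer is not
in the context, say "I don't know" rather than guessing.

Context: {context}

Question: {question}
Answer:`;

console.log(QA_PROMPT.includes("{context}")); // true
```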

About

A Next.js interface with a LangChain backend that uses data from help.bringg.com to answer questions.
