different-ai/embedbase-internet-search

embedbase-internet-search

embedbase-internet-search - internet search extension for Embedbase

⚠️ Status: Alpha release ⚠️

The point of internet search in Embedbase is to combine your private information with the latest public information.

Also remember that AIs like ChatGPT only know about events up to a certain date. For example, try asking ChatGPT about GPT-4 or about Sam Altman's talk with the Senate (which happened a few days ago): it will not know about either.

Please check examples for usage or keep reading.

Quick tour

Here is an example that answers general-purpose questions using hosted Embedbase.

The recommended workflow is:

  1. Search for your question using the internet search endpoint.
  2. (Optional) Add the results to Embedbase.
  3. (Optional) Search Embedbase with the question.
  4. Use .generate() to get your question answered.

// npm i embedbase-js
import { createClient } from 'embedbase-js'

const formatInternetResultsInPrompt = (internetResult: any) =>
    `Name: ${internetResult.title}
Snippet: ${internetResult.snippet}
Url: ${internetResult.url}`


const system = `You are an AI assistant that can answer questions.
When a user sends a question, we will answer it following these steps:
1. we will search the internet with the user's question.
2. we will ask you to answer the question based on the internet results.`

const fn = async () => {
    const embedbase = createClient('https://api.embedbase.xyz', process.env.EMBEDBASE_API_KEY)

    // get question from process.argv
    const question = process.argv[2]

    if (!question) {
        console.log('Please provide a question as argument, for example "What is GPT4?"')
        return
    }

    const results = await embedbase.internetSearch(question)

    const answerQuestionPrompt = `Based on the following internet search results:
${results.map(formatInternetResultsInPrompt).join('\n')}
\n
Please answer the question: ${question}`

    for await (const result of embedbase.generate(answerQuestionPrompt, {
        history: [{
            role: 'system',
            content: system
        }],
    })) {
        console.log(result)
    }
}

fn()

Save the script as answer.ts and run it with:

EMBEDBASE_API_KEY="<get me here https://app.embedbase.xyz>" npx tsx answer.ts "What did Sam Altman say to the US Senate lately?"
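
The example above skips the optional steps 2 and 3 of the workflow. A minimal sketch of those two steps could look like the following. The `DatasetLike` interface and the `storeAndRetrieve` helper are hypothetical names introduced only for illustration. The sketch assumes that the SDK's dataset object offers `batchAdd` and `search` methods with roughly these shapes, so please check the SDK documentation before relying on it.

```typescript
// Hypothetical shape: only the fields of an internet result used in this README.
interface InternetResult {
    title: string
    snippet: string
    url: string
}

// Hypothetical interface: assumed to mirror the SDK's dataset methods.
// Verify the real signatures in the SDK documentation.
interface DatasetLike {
    batchAdd(documents: { data: string }[]): Promise<unknown>
    search(query: string, options?: { limit?: number }): Promise<{ data: string }[]>
}

// Steps 2 and 3: store the internet results, then retrieve the
// stored passages that are closest to the question.
async function storeAndRetrieve(
    dataset: DatasetLike,
    question: string,
    results: InternetResult[],
    limit = 3
): Promise<string[]> {
    // Step 2: one document per result, keeping the source URL in the text.
    const documents = results.map((r) => ({
        data: `${r.title}\n${r.snippet}\n${r.url}`,
    }))
    if (documents.length > 0) {
        await dataset.batchAdd(documents)
    }

    // Step 3: search the dataset with the original question.
    const matches = await dataset.search(question, { limit })
    return matches.map((m) => m.data)
}
```

If the assumption holds, the returned passages can be joined into the prompt in place of (or next to) the raw internet results before calling .generate(). Taking the dataset as a parameter also lets you try the helper with an in-memory fake before using real API calls.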

Self-hosted usage

Just add two lines to your original embedbase entrypoint:

import os
import uvicorn
from embedbase import get_app
from embedbase.database.memory_db import MemoryDatabase
from embedbase.embedding.openai import OpenAI
# import this
from embedbase_internet_search import internet_search

app = get_app().use_db(MemoryDatabase()).use_embedder(OpenAI(os.environ["OPENAI_API_KEY"])).run()
# add the new endpoint
app.add_api_route("/internet-search", internet_search, methods=["POST"])

if __name__ == "__main__":
    uvicorn.run(app)

See how it is used in hosted Embedbase.

If you have any feedback or issues, please let us know by opening an issue or contacting us on Discord.

Regarding the SDK, please refer to the documentation.