knap-ai/knapsack

Knapsack 🎒 - Data connectors for fast, private AI.

Title and Description 📝

Knapsack 🎒 is an open-source service that hosts and runs fast, private data connectors for AI projects. Much like Glean or Perplexity, Knapsack 🎒 powers intelligent search and next-gen AI applications, but with an emphasis on community, privacy, and security.

Installation and Setup ⚙️

Knapsack connectors fetch data, transform it, and load it into a VectorDB backend. Efficient, secure, and easy data handling is our bread and butter. To this end, Knapsack 🎒 provides a simple, easy-to-use API for data connectors, and the service can be launched via Docker.
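To make the fetch, transform, and load flow concrete, here is a minimal sketch in plain Python. It does not use the Knapsack API: `Record`, `StaticConnector`, `InMemoryStore`, and `run_connector` are hypothetical names invented for illustration, the in-memory store stands in for a real VectorDB, and the embedding step a real backend would need is left out.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable


@dataclass
class Record:
    """One document-sized unit of data moving through the pipeline."""
    id: str
    text: str


class InMemoryStore:
    """Hypothetical stand-in for a VectorDB backend; keeps rows in a dict."""

    def __init__(self) -> None:
        self.rows: dict[str, Record] = {}

    def upsert(self, records: list[Record]) -> None:
        # Insert new ids and overwrite existing ones.
        for record in records:
            self.rows[record.id] = record


class StaticConnector:
    """Hypothetical connector that serves a fixed list instead of calling an API."""

    def __init__(self, raw_items: list[dict]) -> None:
        self.raw_items = raw_items

    def fetch(self) -> Iterable[dict]:
        # A real connector would page through a remote source here.
        return iter(self.raw_items)

    def transform(self, item: dict) -> Record:
        # Join title and abstract, then collapse runs of whitespace.
        text = " ".join(f"{item['title']} {item['abstract']}".split())
        return Record(id=item["id"], text=text)


def run_connector(connector: StaticConnector, store: InMemoryStore) -> int:
    """Fetch every item, transform it, load it, and return the record count."""
    records = [connector.transform(item) for item in connector.fetch()]
    store.upsert(records)
    return len(records)
```

Keeping `fetch` and `transform` as separate methods means a new data source only has to supply those two pieces, while the loading step stays shared.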

To get started with Knapsack 🎒, clone the repository and then run it in one of the ways listed below. If you plan to use the Docker Compose option, ensure you have Docker installed on your machine.

  1. Clone the repository to your local machine:
    git clone https://github.com/knap-ai/knapsack.git
    cd knapsack
  2. Run it as a FastAPI server on your local machine:
    python -m knapsack.cli deploy --port 8888
  3. Or run it as a FastAPI server in a Docker container:
    docker-compose up
  4. Or use it directly as a library:
    from knapsack import Knapsack
    ks = Knapsack()
    ks.run()

Note that the main_url property in the knapsack.toml database configuration must be set to the database image name (e.g., qdrant if using Docker Compose), or to localhost if running in the local environment.
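The reason for the two values is that containers on a Docker Compose network reach each other by name, while a process running directly on your machine reaches the database through localhost. The helper below is only a sketch of that rule. `pick_main_url` is a hypothetical function, not part of Knapsack, and whether main_url also expects a scheme or port is not stated here, so the sketch returns the bare host only.

```python
def pick_main_url(inside_compose: bool, database_name: str = "qdrant") -> str:
    """Return the host value to put in main_url for the given runtime."""
    # Inside the Compose network, the database is addressed by its name.
    # On the host machine, it is addressed as localhost.
    return database_name if inside_compose else "localhost"
```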

Roadmap 🔨

  • ArXiv, Base connector
  • Qdrant integration
  • Caching of certain APIs
  • Smart upsert to vector DB (hashed values, only upsert on change)
  • Scheduling
  • GSuite
  • BioArXiv
  • PubMed
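The "smart upsert" item above (hashed values, only upsert on change) can be illustrated with a short sketch. This is one possible approach under stated assumptions, not Knapsack's implementation: `content_hash` and `smart_upsert` are hypothetical names, SHA-256 is an arbitrary choice of hash, and a plain dict stands in for wherever the hashes would actually be stored.

```python
import hashlib


def content_hash(text: str) -> str:
    """Return a stable hex digest of the document text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def smart_upsert(seen_hashes: dict, docs: dict) -> list:
    """Return the ids that are new or changed, updating seen_hashes in place.

    seen_hashes maps document id to the digest stored on the previous run;
    docs maps document id to the current text.
    """
    changed = []
    for doc_id, text in docs.items():
        digest = content_hash(text)
        if seen_hashes.get(doc_id) != digest:
            # New or modified document: record the digest and mark it for
            # upsert. A real implementation would write to the vector DB here.
            seen_hashes[doc_id] = digest
            changed.append(doc_id)
    return changed
```

On a second run with unchanged text, nothing is returned, so no embedding or database write would be needed for those documents.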

VectorDB Integrations

  • Qdrant
  • Milvus
  • Weaviate
  • Chroma

How to Contribute 🤝

We welcome contributions from the community! Currently, we are particularly interested in adding more connectors. If you have developed a connector that could be useful to others, please consider submitting a pull request.

For those interested in public data, Knapsack 🎒 hosts publicly accessible datasets, such as data derived from ArXiv, available for search and GPT chat via the Knapsack Desktop application. If you want to contribute to Knapsack 🎒, please reach out via our GitHub issues or file a pull request. Knap will host any new connectors that connect public data so that all users can benefit from LLM chat and search.

License Information 📄

Knapsack 🎒 is released under the GNU General Public License v3.0. For more information, please refer to the LICENSE file in the repository.

Feel free to explore, modify, and distribute any part of the Knapsack 🎒 codebase. If you use Knapsack 🎒 in your research or projects, please consider citing it.