NOTE: this is a prototype that will evolve heavily to become more generic, more robust, and to offer much more functionality.
This API is currently in development. Read the Search-a-licious roadmap architecture notes to understand where we are headed.
The main file is api.py, and the schema is in models/product.py.
A CLI is available to perform common tasks.
Note: the Makefile will align the user id with your own uid for a smooth editing experience.
Before running the services, you need to make sure that your system's mmap count is high enough for Elasticsearch to run. You can do this by running:
sudo sysctl -w vm.max_map_count=262144
To make the setting persist across reboots, add vm.max_map_count=262144 to /etc/sysctl.conf.
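To verify the current value before starting, you can read it from procfs; a minimal check, assuming a Linux host:

```python
# Read the current vm.max_map_count from procfs (Linux only).
# Elasticsearch requires at least 262144.
with open("/proc/sys/vm/max_map_count") as f:
    current = int(f.read().strip())

print("vm.max_map_count =", current)
if current < 262144:
    print("Too low for Elasticsearch; raise it with sysctl as shown above.")
```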
Then build the services with:
make build
Start the containers:
docker compose up -d
Note
You may encounter a permission error if your user is not part of the docker
group; in that case, either add your user to the group or modify the Makefile
to prefix all docker and docker compose commands with sudo.
Docker spins up:
- Two Elasticsearch nodes
- Elasticvue
- The search service on port 8000
- Redis on port 6379
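Once the stack is up, you can check that the published ports are reachable. The port numbers below come from the list above; probing localhost is an assumption for a default local setup. A minimal stdlib sketch:

```python
import socket

def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Ports published by the compose stack (per the list above)
for name, port in [("search service", 8000), ("redis", 6379)]:
    print(name, "up" if port_open("localhost", port) else "down")
```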
You will then need to import from a JSONL dump (see instructions below).
For development, you have two options for running the service:
- Docker
- Locally
To develop on Docker, make the changes you need, then rebuild the image and restart the stack by running:
make up
However, this tends to be slower than developing locally.
To develop locally, create a virtual environment, install the dependencies, then run the service:
virtualenv venv
source venv/bin/activate
pip install -r requirements.txt
uvicorn app.api:app --reload --port=8001 --workers=4
Note that it's important to use port 8001, as port 8000 will be used by the docker version of the search service.
This repo uses pre-commit to enforce code styling, etc. To use it:
pre-commit install
To run the checks on staged files without committing:
pre-commit run
To import data from the JSONL export, download the dataset in the data
folder, then run:
make import-dataset filepath='products.jsonl.gz'
If you get errors, try adding more RAM (12GB works well if you have it to spare), or slow down the indexing process by setting num_processes to 1 in the command above.
Typical import time is 45-60 minutes.
If you want to skip updates (e.g. because you don't have Redis installed),
use make import-dataset filepath='products.jsonl.gz' args="--skip-updates"
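For reference, the dump is JSONL: one JSON document per line, gzip-compressed. A minimal sketch of iterating over it with the standard library (the file name matches the command above; the fields of each document are whatever the dump contains):

```python
import gzip
import json

def iter_products(path: str):
    """Yield one product dict per non-empty line of a gzip-compressed JSONL file."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)

# Example: count documents without loading the whole dump into memory
# total = sum(1 for _ in iter_products("products.jsonl.gz"))
```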
This project has received financial support from the NGI Search (Next Generation Internet) program, funded by the European Commission.