
(Demo) Elasticsearch with ML node and ingest pipeline for hybrid search (Lexical + Semantic)


pakio/EsBM25SemanticHybridComparison


Elasticsearch BM25 vs KNN vs Hybrid comparison system

This is a ready-to-use demo repository for testing Elasticsearch semantic search with a transformer embedding model that runs inside an ingest pipeline.

Demo

  • Data
    • wikimedia enwiki 20221201 dump url
  • Model

Prerequisites

This repository uses the following software, tools, and frameworks.

  • docker
  • docker-compose
  • python (>3.10)

How to run

1. Launch Elasticsearch and upload model

Run ./Es/setup.sh to launch Elasticsearch, upload the model, and configure the ingest pipeline.
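For orientation, the kind of ingest pipeline this step configures can be sketched as a plain request body. This is a minimal illustration, not the contents of ./Es/setup.sh: the model ID `my-embedding-model`, the field names `text` and `text_embedding`, and the helper `build_embedding_pipeline` are all hypothetical, and `text_field` is assumed to be the input name the uploaded model expects.

```python
import json


def build_embedding_pipeline(model_id: str,
                             source_field: str = "text",
                             target_field: str = "text_embedding") -> dict:
    """Return an ingest pipeline body with a single inference processor."""
    return {
        "description": "Embed documents at ingest time (illustrative)",
        "processors": [
            {
                "inference": {
                    # Hypothetical ID of the model deployed on the ML node.
                    "model_id": model_id,
                    # Field under which the model output is stored.
                    "target_field": target_field,
                    # Map the document field to the model's input name
                    # ("text_field" is an assumption about the model).
                    "field_map": {source_field: "text_field"},
                }
            }
        ],
    }


pipeline = build_embedding_pipeline("my-embedding-model")
print(json.dumps(pipeline, indent=2))
```

A body like this would be sent with PUT to `_ingest/pipeline/<pipeline-name>`, so that documents indexed through the pipeline receive an embedding alongside their original text.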

2. Ingest data

Run ./indexer/setup.sh to download and index the data. If you see a 429 (Too Many Requests) error, reduce the batch size and retry.
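The "reduce the batch size and retry" advice can also be automated. The sketch below is one possible approach and is not taken from the indexer in this repository: `TooManyRequests`, `index_with_backoff`, and the `send_batch` callable are hypothetical names, and the fake sender stands in for a real bulk request.

```python
import time
from typing import Callable, Sequence


class TooManyRequests(Exception):
    """Hypothetical stand-in for an HTTP 429 response from the cluster."""


def index_with_backoff(docs: Sequence[dict],
                       send_batch: Callable[[Sequence[dict]], None],
                       batch_size: int = 500,
                       min_batch_size: int = 1,
                       pause_seconds: float = 1.0) -> int:
    """Send docs in batches, halving the batch size whenever a 429 is raised.

    Returns the batch size in use when indexing finished.
    """
    position = 0
    while position < len(docs):
        batch = docs[position:position + batch_size]
        try:
            send_batch(batch)
        except TooManyRequests:
            if batch_size <= min_batch_size:
                raise  # cannot shrink further; let the caller decide
            batch_size = max(min_batch_size, batch_size // 2)
            time.sleep(pause_seconds)
            continue  # retry the same position with a smaller batch
        position += len(batch)
    return batch_size


# Demonstration with a fake sender that rejects batches above 100 documents.
sent = []


def fake_send(batch: Sequence[dict]) -> None:
    if len(batch) > 100:
        raise TooManyRequests()
    sent.extend(batch)


docs = [{"id": i} for i in range(250)]
final_size = index_with_backoff(docs, fake_send, batch_size=400, pause_seconds=0.0)
print(final_size, len(sent))
```

Halving on each rejection converges quickly on a size the cluster accepts, and retrying from the same position means no document is skipped or sent twice.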

3. Launch comparison tool

A GUI comparison tool is provided in the ./eval directory. Change into ./eval and run streamlit run main.py to launch it.
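To make the three modes being compared concrete, here is a rough sketch of how BM25, kNN, and hybrid search bodies might look. It assumes an Elasticsearch version that supports the top-level `knn` search option with `query_vector_builder` (8.7 or later), and the field names, model ID, and helper functions are hypothetical. The queries actually issued by the tool in ./eval may differ.

```python
def bm25_body(text: str, field: str = "text", size: int = 10) -> dict:
    """Lexical search: a plain match query scored with BM25."""
    return {"size": size, "query": {"match": {field: text}}}


def knn_body(text: str, model_id: str,
             vector_field: str = "text_embedding.predicted_value",
             k: int = 10, num_candidates: int = 100) -> dict:
    """Semantic search: the deployed model embeds the query text server-side."""
    return {
        "size": k,
        "knn": {
            "field": vector_field,
            "k": k,
            "num_candidates": num_candidates,
            "query_vector_builder": {
                "text_embedding": {"model_id": model_id, "model_text": text}
            },
        },
    }


def hybrid_body(text: str, model_id: str, field: str = "text",
                vector_field: str = "text_embedding.predicted_value",
                k: int = 10, num_candidates: int = 100) -> dict:
    """Hybrid search: send the lexical query and the kNN clause together."""
    body = knn_body(text, model_id, vector_field, k, num_candidates)
    body["query"] = bm25_body(text, field, size=k)["query"]
    return body


hybrid = hybrid_body("history of the bicycle", "my-embedding-model")
print(sorted(hybrid.keys()))
```

When both `query` and `knn` are present in one request, Elasticsearch combines the scores of the two result sets, which is what distinguishes the hybrid mode from running either search alone.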

Note

This repository enables an Elasticsearch trial license in order to use an ML node, which runs the embedding transformer model in the ingest pipeline.
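For reference, a trial license is started through the license API. The snippet below only builds the request and is a sketch, not what the setup script necessarily does: the base URL `http://localhost:9200`, the absence of authentication, and the helper `start_trial_request` are assumptions.

```python
from urllib.request import Request


def start_trial_request(base_url: str = "http://localhost:9200") -> Request:
    """Build (but do not send) the request that starts a trial license."""
    url = f"{base_url.rstrip('/')}/_license/start_trial?acknowledge=true"
    return Request(url, method="POST")


req = start_trial_request()
print(req.get_method(), req.full_url)
```

Against a running cluster, the request could be sent with `urllib.request.urlopen(req)`. A secured cluster would also need credentials.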
