This repository has been archived by the owner on Aug 2, 2022. It is now read-only.

Is there some performance data? #173

Open
JamesIsHuang opened this issue Jul 22, 2020 · 17 comments

@JamesIsHuang

Is there any performance data? We want to use this plugin, but we don't know how it performs.

@vamshin
Member

vamshin commented Jul 22, 2020

Hi @JamesIsHuang,

At the moment we do not have the numbers documented. We are planning to publish a dedicated blog post on this.
Meanwhile, this blog post could give some insights: https://medium.com/@kumon/how-to-realize-similarity-search-with-elasticsearch-3dd5641b9adb.

@vamshin vamshin added the question Further information is requested label Jul 22, 2020
@JamesIsHuang
Author

JamesIsHuang commented Jul 23, 2020

Has it been tested with tens of millions or 100 million vectors yet? Performance at the million-vector level seems quite good.

@vamshin
Member

vamshin commented Jul 23, 2020

Hi @JamesIsHuang ,

We have done performance analysis for different vector dimensions and collection sizes. We need to formalize the results and present them in a consumable form. We are prioritizing the effort to bring this into the docs.

Here are some metrics for the scale you are looking at.

Data set: 150M vectors with 128 dimensions across different indices
Algo params: m=16, efSearch=1024, efConstruction=1024
Data nodes: 6, m5.12xlarge
Master nodes: 3, m5.xlarge

Latencies:-
tp50: 22ms
tp90: 40ms
tp99: 90ms
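
For readers who want to produce comparable tp50/tp90/tp99 figures from their own measurements, here is a minimal sketch. It assumes the nearest-rank percentile definition, which may differ from the method used for the numbers above; the function names `percentile` and `summarize` are hypothetical.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies_ms):
    """Return tp50/tp90/tp99 for a list of per-query latencies in ms."""
    return {f"tp{p}": percentile(latencies_ms, p) for p in (50, 90, 99)}
```

Feed it the per-query latencies collected after warmup, so the numbers are not dominated by the first cold queries.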

@JamesIsHuang
Author

Hi, @vamshin
This performance is really good. Did you return 1024 vectors when you searched?

@vamshin
Member

vamshin commented Jul 23, 2020

Hi @JamesIsHuang, we made k dynamic, and it's between 50 and 1500.

Please note that we also did a warmup to have the graphs loaded into memory, and our experiment does not account for warmup time. Without warmup, the initial queries will take a latency hit.

@JamesIsHuang
Author

Hi @vamshin, I found that segment merging is a very slow process. How long did it take you to merge 150 million vectors?

@vamshin
Member

vamshin commented Aug 4, 2020

Hi @JamesIsHuang, sorry, I don't have exact numbers, but we could do that effectively by

  1. Avoiding the creation of many small segments. Please refer to the doc on indexing performance tuning.

  2. Having more shards, so that the graphs are split across these shards and forcemerge works on smaller graphs.
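
As an illustration of the force merge step mentioned above, here is a minimal sketch that builds the request path for the standard Elasticsearch `_forcemerge` API. The helper name `forcemerge_path` and the index name in the usage note are hypothetical; sending the request (for example with `curl -XPOST`) is left to the reader.

```python
from urllib.parse import quote, urlencode


def forcemerge_path(index, max_num_segments=1):
    """Build the path for POST /<index>/_forcemerge?max_num_segments=N."""
    if max_num_segments < 1:
        raise ValueError("max_num_segments must be at least 1")
    query = urlencode({"max_num_segments": max_num_segments})
    return f"/{quote(index, safe='')}/_forcemerge?{query}"
```

For example, `forcemerge_path("my-knn-index")` gives a path that merges each shard of the hypothetical index `my-knn-index` down to a single segment.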

@juliusbachnick

juliusbachnick commented Apr 7, 2021

Hey @vamshin,

Are there any updates regarding the documentation of performance data? I'm currently doing some benchmarks for a project that utilizes the kNN Plugin for ES and keep running into degrading latencies over time. In order to facilitate debugging, it would be interesting to know what resources are required to calculate the neighbors within <=50ms.

I'm currently running 3 m5.xlarge.elasticsearch (without sharding), using 5 indices with a rather small count of documents:

docs.count
13867
11315
53216
1459242
1302

and the following index settings:

    "settings": {
      "number_of_shards": 1,
      "number_of_replicas": 1,
      "refresh_interval": "10s",
      "index.knn": true,
      "index.knn.algo_param.ef_search": 512,
      "index.knn.algo_param.ef_construction": 512,
      "index.knn.algo_param.m": 32
    }

and am storing vectors with a dimension of 100.

I see that you used 6 m5.12xlarge.elasticsearch nodes which would indicate that the kNN Plugin requires CPU as well as Memory to perform appropriately?

Any help with debugging or improving latencies would be greatly appreciated :).

@qingfengshiran

Hi @juliusbachnick,
How is your performance test going now? I use 4 m5.12xlarge.elasticsearch nodes with 130M vectors, but every query takes about 200 ms and QPS is only 50.

@qingfengshiran

Hi @vamshin, I have a question. The JVM heap size is 32 GB, so the memory k-NN can use is 32 GB * 50% * 60% = 9.6 GB. The memory required for graphs is estimated to be 1.1 * (4 * dimension + 8 * M) bytes/vector. With dimension = 128 and M = 16, that is 1.1 * (4 * 128 + 8 * 16) * 1,000,000 ~= 0.8 GB per million vectors. So for the best performance, I think one machine can only support about 10M vectors, since 10 * 0.8 GB = 8 GB, which is less than 9.6 GB. Is that right?

@vamshin
Member

vamshin commented Apr 28, 2021

Hi @qingfengshiran,

k-NN graphs are loaded outside the ES heap, so they are not part of the 32 GB heap size.

For example: consider a machine with 100 GB of memory where the JVM uses 32 GB. The k-NN plugin uses 50% of the remaining 68 GB (i.e. 100 GB - 32 GB), which is 34 GB. If memory usage exceeds this value, k-NN evicts the least recently used graphs, which can impact search performance.
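
The arithmetic above, combined with the per-vector estimate quoted in the question (1.1 * (4 * dimension + 8 * M) bytes/vector), can be sketched as follows. This is only a back-of-the-envelope calculation: the function names are hypothetical, 1 GB is taken as 10^9 bytes, and the 50% fraction is the one from the example above rather than a value read from a live cluster.

```python
def knn_memory_budget_gb(total_ram_gb, jvm_heap_gb, fraction=0.5):
    """Native memory available for k-NN graphs: a fraction of RAM minus heap."""
    return (total_ram_gb - jvm_heap_gb) * fraction


def graph_bytes_per_vector(dimension, m):
    """Estimated graph memory per vector: 1.1 * (4 * dimension + 8 * M)."""
    return 1.1 * (4 * dimension + 8 * m)


def estimated_graph_gb(num_vectors, dimension, m):
    """Estimated total graph memory in GB (1 GB = 1e9 bytes, an assumption)."""
    return graph_bytes_per_vector(dimension, m) * num_vectors / 1e9


def max_vectors(total_ram_gb, jvm_heap_gb, dimension, m, fraction=0.5):
    """Rough upper bound on vectors per node before graphs get evicted."""
    budget_bytes = knn_memory_budget_gb(total_ram_gb, jvm_heap_gb, fraction) * 1e9
    return int(budget_bytes / graph_bytes_per_vector(dimension, m))
```

With dimension 128 and M = 16 the estimate is 704 bytes per vector, or roughly 0.7 GB per million vectors, so the 34 GB budget in the example would hold on the order of 48 million such vectors.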

@qingfengshiran

Hi @vamshin, I am using 4 nodes, each with 32 cores and 192 GB of memory.
Data set: 130M vectors with 128 dimensions across different indices.
Algo params: m=16, efSearch=1024, efConstruction=1024, number_of_shards: 8
But with top-K search at K = 4, every search takes about 200 ms and QPS is only around 50. What can I do to check my performance?
Segment 0 looks as follows; the others are similar.
[screenshot: WeChatWorkScreenshot_14e40ea6-7008-4cb3-910c-8b0b36ebbba1]

@qingfengshiran

@vamshin, my requirement is about 2000 QPS when searching more than 400M vectors with 128 dimensions.
Can you suggest a cluster hardware configuration and es-knn parameters?

@vamshin
Member

vamshin commented Apr 29, 2021

Hi @qingfengshiran,

I have a couple of suggestions:

  1. Increase the number of shards to 16. This way we get more parallelism.
  2. See if you can bring efSearch down to 512 (note that your recall might drop, so you may want to double check).
  3. Forcemerge to fewer segments. Limit the number of segments to 5 per shard. Within a shard, segments are searched sequentially, so more segments result in higher latency. Also note that forcemerge is a costly operation and each node uses only one thread for it, so it can take a really long time at your scale. One way to speed it up is to scale horizontally by adding more instances and increasing the number of shards, so that the work is better distributed.

You can find more details here: https://opendistro.github.io/for-elasticsearch-docs/docs/knn/performance-tuning/#search-performance-tuning
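
A minimal sketch of index settings that follow the first two suggestions, reusing the setting names that appear earlier in this thread (`index.knn`, `index.knn.algo_param.*`). The helper name `knn_index_settings` is hypothetical, and the values for `m` and `ef_construction` are simply the ones reported in the question, not a recommendation.

```python
import json


def knn_index_settings(number_of_shards, ef_search, ef_construction, m,
                       number_of_replicas=1):
    """Build a create-index request body for a k-NN enabled index."""
    return {
        "settings": {
            "number_of_shards": number_of_shards,
            "number_of_replicas": number_of_replicas,
            "index.knn": True,
            "index.knn.algo_param.ef_search": ef_search,
            "index.knn.algo_param.ef_construction": ef_construction,
            "index.knn.algo_param.m": m,
        }
    }


# 16 shards and efSearch 512, as suggested; m and efConstruction as reported.
body = knn_index_settings(number_of_shards=16, ef_search=512,
                          ef_construction=1024, m=16)
print(json.dumps(body, indent=2))
```

Since the shard count is fixed when an index is created, going from 8 to 16 shards generally means creating a new index with a body like this and reindexing into it.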

@qingfengshiran

qingfengshiran commented Apr 29, 2021

Hi @vamshin
Thanks for your kind reply. I would also like to know a few other things:
1 How is the QPS on more than 100 million vectors?
2 I found this shard sizing suggestion: Number of Shards = Index Size / 30GB. Is the index size the store size?

3 Finally, I want to know how to set up a cluster with docker compose. You provide a sample cluster, but it runs on a single machine. I want to span different machines, with one node per machine. I tried but failed.
The docker compose sample runs 3 nodes on one machine: https://opendistro.github.io/for-elasticsearch-docs/docs/install/docker/

------data node
version: '3'
services:
  odfe-node1:
    image: amazon/opendistro-for-elasticsearch:1.13.2
    container_name: odfe-node1
    environment:
      - cluster.name=odfe-cluster
      - network.host=0.0.0.0
      - node.name=odfe-node1
      - node.master=true
      - node.ingest=true
      - node.data=true
      - cluster.initial_master_nodes=odfe-master1
      # - discovery.seed_hosts=10.0.217.122,10.0.222.36,10.0.215.78,10.0.246.15,10.0.246.36,10.0.245.233
      - discovery.seed_hosts=10.0.246.36,10.0.245.233
      # - discovery.seed_hosts=10.0.217.122,10.0.222.36,10.0.215.78,10.0.246.15,10.0.246.36,10.0.245.233,10.0.245.10
      - bootstrap.memory_lock=true # along with the memlock settings below, disables swapping
      - "ES_JAVA_OPTS=-Xms32768m -Xmx32768m" # minimum and maximum Java heap size, recommend setting both to 50% of system RAM
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536 # maximum number of open files for the Elasticsearch user, set to at least 65536 on modern systems
        hard: 65536
    volumes:
      - /data/es/data/data:/usr/share/elasticsearch/data
      - /data/es/data/log:/usr/share/elasticsearch/log
    ports:
      - 9200:9200
      - 9300:9300
      - 9600:9600 # required for Performance Analyzer
    networks:
      - odfe-net
networks:
  odfe-net:

Running docker-compose up fails with the following error:
[image: error screenshot]

-----master node
version: '3'
services:
  odfe-master1:
    image: amazon/opendistro-for-elasticsearch:1.13.2
    container_name: odfe-master1
    environment:
      - cluster.name=odfe-cluster
      - network.host=0.0.0.0
      - node.name=odfe-master1
      - node.master=true
      - node.data=false
      - node.ingest=true
      # - discovery.seed_hosts=10.0.217.122,10.0.222.36,10.0.215.78,10.0.246.15,10.0.246.36,10.0.245.233,10.0.245.10
      # - discovery.seed_hosts=10.0.217.122,10.0.222.36,10.0.215.78,10.0.246.15,10.0.246.36,10.0.245.233
      - discovery.seed_hosts=10.0.246.36,10.0.245.233
      - cluster.initial_master_nodes=odfe-master1
      - bootstrap.memory_lock=true # along with the memlock settings below, disables swapping
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m" # minimum and maximum Java heap size, recommend setting both to 50% of system RAM
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536 # maximum number of open files for the Elasticsearch user, set to at least 65536 on modern systems
        hard: 65536
    volumes:
      - /data/es/master/data:/usr/share/elasticsearch/data
      - /data/es/master/log:/usr/share/elasticsearch/log
    ports:
      - 9200:9200
      - 9300:9300
      - 9600:9600 # required for Performance Analyzer
    networks:
      - odfe-net
networks:
  odfe-net:

Running docker-compose up fails with the following error:
[image: error screenshot]

4 How do I warm up the index? I have read the website, but I did not find where to configure the warmup:
https://opendistro.github.io/for-elasticsearch-docs/docs/knn/performance-tuning/

@jmazanec15
Member

Hi @qingfengshiran

1 How is the QPS on more than 100 Million data?

We have seen QPS ranging from 100s of ms to 2000ms for around 100 million vectors, depending on memory, cpu type, number of nodes, dimensions, etc. In order to get good performance, all of the vectors need to fit in memory. You can check this from the knn stats after running warmup or some queries.

2 I have found shards config suggestion: Number of Shards = Index Size / 30GB, and index size is store size?

yes
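
The sizing rule quoted in the question (Number of Shards = Index Size / 30GB, with index size taken as the store size) can be sketched as below. The function name `suggested_shard_count` is hypothetical, and rounding up to at least one shard is an assumption, not part of the quoted rule.

```python
import math


def suggested_shard_count(store_size_gb, target_shard_size_gb=30):
    """Number of shards = index store size / 30 GB, rounded up, minimum 1."""
    if store_size_gb < 0 or target_shard_size_gb <= 0:
        raise ValueError("sizes must be positive")
    return max(1, math.ceil(store_size_gb / target_shard_size_gb))
```

The store size can be read from the `store.size` column of `_cat/indices`.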

3 Finally, I want to know how to set up a cluster with docker compose. You provide a sample cluster, but it runs on a single machine. I want to span different machines, with one node per machine. I tried but failed.

We will get back to you on this.

4 How do I warm up the index? I have read the website, but I did not find where to configure the warmup.

Warmup documentation can be found here.
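
As a rough illustration, the sketch below builds the request paths for the warmup and stats operations. The `/_opendistro/_knn/warmup/<indices>` and `/_opendistro/_knn/stats` paths are what I recall from the Open Distro k-NN documentation and should be checked against the docs for your plugin version; the helper names and index names are hypothetical.

```python
from urllib.parse import quote

# Assumed base path of the Open Distro k-NN REST API; verify for your version.
KNN_BASE = "/_opendistro/_knn"


def warmup_path(indices):
    """Path for GET <base>/warmup/<index1>,<index2>,... (loads graphs into memory)."""
    if not indices:
        raise ValueError("at least one index name is required")
    joined = ",".join(quote(name, safe="") for name in indices)
    return f"{KNN_BASE}/warmup/{joined}"


def stats_path():
    """Path for GET <base>/stats (graph memory usage, cache hits and misses)."""
    return f"{KNN_BASE}/stats"
```

Issue a GET against `warmup_path([...])` once after indexing or a restart, then check the stats output to confirm that the graphs fit in memory before measuring latency.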

@VijayanB
Member

VijayanB commented May 10, 2021

Hi @qingfengshiran

For Docker containers, I believe it doesn't matter whether your service (in this case the data node) is defined in the same docker-compose file or in a different one.

I ran the following experiment and it worked for me.

  1. Created a docker compose file (data.yaml) with two nodes and Kibana, as below
version: '3'
services:
  odfe-node1:
    image: amazon/opendistro-for-elasticsearch:1.13.2
    container_name: odfe-node1
    environment:
      - cluster.name=odfe-cluster
      - node.name=odfe-node1
      - discovery.seed_hosts=odfe-node1,odfe-node2,odfe-node3
      - cluster.initial_master_nodes=odfe-node1,odfe-node2,odfe-node3
      - bootstrap.memory_lock=true # along with the memlock settings below, disables swapping
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m" # minimum and maximum Java heap size, recommend setting both to 50% of system RAM
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536 # maximum number of open files for the Elasticsearch user, set to at least 65536 on modern systems
        hard: 65536
    volumes:
      - odfe-data1:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
      - 9600:9600 # required for Performance Analyzer
    networks:
      - odfe-net
  odfe-node2:
    image: amazon/opendistro-for-elasticsearch:1.13.2
    container_name: odfe-node2
    environment:
      - cluster.name=odfe-cluster
      - node.name=odfe-node2
      - discovery.seed_hosts=odfe-node1,odfe-node2
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    volumes:
      - odfe-data2:/usr/share/elasticsearch/data
    networks:
      - odfe-net
  kibana:
    image: amazon/opendistro-for-elasticsearch-kibana:1.13.2
    container_name: odfe-kibana
    ports:
      - 5601:5601
    expose:
      - "5601"
    environment:
      ELASTICSEARCH_URL: https://odfe-node1:9200
      ELASTICSEARCH_HOSTS: https://odfe-node1:9200
    networks:
      - odfe-net

volumes:
  odfe-data1:
  odfe-data2:

networks:
  odfe-net:
  2. Executed docker-compose up
    Note: You will see an exception saying that odfe-node3 is not reachable. That is fine, since we haven't started that node yet.
docker-compose -f data.yaml up


docker ps

3c22fb3bf9b7:~ balasvij$ docker ps
CONTAINER ID   IMAGE                                               COMMAND                  CREATED          STATUS          PORTS                                                                NAMES
be5925a3eacc   amazon/opendistro-for-elasticsearch:1.13.2          "/usr/local/bin/dock…"   2 minutes ago    Up 2 minutes    0.0.0.0:9200->9200/tcp, 9300/tcp, 0.0.0.0:9600->9600/tcp, 9650/tcp   odfe-node1
ac3a3e8aff1d   amazon/opendistro-for-elasticsearch:1.13.2          "/usr/local/bin/dock…"   17 minutes ago   Up 2 minutes    9200/tcp, 9300/tcp, 9600/tcp, 9650/tcp                               odfe-node2
7f96f019bda0   amazon/opendistro-for-elasticsearch-kibana:1.13.2   "/usr/local/bin/kiba…"   17 minutes ago   Up 2 minutes    0.0.0.0:5601->5601/tcp                                               odfe-kibana


curl https://localhost:9200/_cat/nodes -u admin:admin --insecure
172.17.0.2 38 81 14 1.41 0.67 0.38 dimr - odfe-node2
172.17.0.4 42 81  9 1.41 0.67 0.38 dimr * odfe-node1

  3. Created a docker-compose file node3.yaml as below
version: '3'
services:
  odfe-node3:
    image: amazon/opendistro-for-elasticsearch:1.13.2
    container_name: odfe-node3
    environment:
      - cluster.name=odfe-cluster
      - node.name=odfe-node3
      - discovery.seed_hosts=odfe-node1,odfe-node2,odfe-node3
      - cluster.initial_master_nodes=odfe-node1,odfe-node2,odfe-node3
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    volumes:
      - odfe-data3:/usr/share/elasticsearch/data
    networks:
      - odfe-net
volumes:
  odfe-data3:

networks:
  odfe-net:
  4. Executed node3.yaml
docker-compose -f node3.yaml up

3c22fb3bf9b7:~ balasvij$ docker ps
CONTAINER ID   IMAGE                                               COMMAND                  CREATED          STATUS          PORTS                                                                NAMES
6ad96f65fcbe   amazon/opendistro-for-elasticsearch:1.13.2          "/usr/local/bin/dock…"   35 seconds ago   Up 32 seconds   9200/tcp, 9300/tcp, 9600/tcp, 9650/tcp                               odfe-node3
be5925a3eacc   amazon/opendistro-for-elasticsearch:1.13.2          "/usr/local/bin/dock…"   2 minutes ago    Up 2 minutes    0.0.0.0:9200->9200/tcp, 9300/tcp, 0.0.0.0:9600->9600/tcp, 9650/tcp   odfe-node1
ac3a3e8aff1d   amazon/opendistro-for-elasticsearch:1.13.2          "/usr/local/bin/dock…"   17 minutes ago   Up 2 minutes    9200/tcp, 9300/tcp, 9600/tcp, 9650/tcp                               odfe-node2
7f96f019bda0   amazon/opendistro-for-elasticsearch-kibana:1.13.2   "/usr/local/bin/kiba…"   17 minutes ago   Up 2 minutes    0.0.0.0:5601->5601/tcp                                               odfe-kibana

curl https://localhost:9200/_cat/nodes -u admin:admin --insecure
172.17.0.2 20 97 10 1.02 0.68 0.40 dimr - odfe-node2
172.17.0.5 44 97 26 1.02 0.68 0.40 dimr - odfe-node3
172.17.0.4 47 97 10 1.02 0.68 0.40 dimr * odfe-node1
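
The same check can be scripted. Below is a minimal sketch that parses `_cat/nodes` text like the output above, assuming the default column layout in which the last two columns are the master marker (`*` for the elected master) and the node name. The function name `parse_cat_nodes` is hypothetical.

```python
def parse_cat_nodes(text):
    """Parse default _cat/nodes output into a list of node dicts."""
    nodes = []
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        nodes.append({
            "ip": fields[0],
            "name": fields[-1],
            "is_master": fields[-2] == "*",
        })
    return nodes


sample = """
172.17.0.2 20 97 10 1.02 0.68 0.40 dimr - odfe-node2
172.17.0.5 44 97 26 1.02 0.68 0.40 dimr - odfe-node3
172.17.0.4 47 97 10 1.02 0.68 0.40 dimr * odfe-node1
"""
nodes = parse_cat_nodes(sample)
masters = [n["name"] for n in nodes if n["is_master"]]
```

A cluster that formed correctly shows every expected node name and exactly one elected master.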


Let me know if you still have issues. I ran the above experiment on macOS with Docker.
