elasticsearch - Choosing the number of shards and replicas

Programming/Search 2016. 8. 5. 21:11

Notes saving the Elasticsearch recommendations for later reference.

https://www.elastic.co/guide/en/elasticsearch/guide/current/scale.html

https://www.elastic.co/guide/en/elasticsearch/guide/current/shard-scale.html

https://www.elastic.co/guide/en/elasticsearch/guide/current/overallocation.html

PUT /my_index
{
  "settings": {
    "number_of_shards":   2, 
    "number_of_replicas": 0
  }
}

This time, when we add a second node, Elasticsearch will automatically move one shard from the first node to the second node, as depicted in Figure 50, “An index with two shards can take advantage of a second node”. Once the relocation has finished, each shard will have access to twice the computing power that it had before.

A new index in Elasticsearch is allotted five primary shards by default. That means that we can spread that index out over a maximum of five nodes, with one shard on each node. That’s a lot of capacity, and it happens without you having to think about it at all.
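To make the "one shard per node" limit concrete, here is a rough sketch in Python. The function `spread_shards` is a hypothetical helper written for this note, not part of Elasticsearch, and it assumes a perfectly even spread of primary shards. The real allocator weighs more factors than this.

```python
def spread_shards(num_shards: int, num_nodes: int) -> list[int]:
    """Return the number of shards on each node under an even spread."""
    if num_shards < 0 or num_nodes < 1:
        raise ValueError("need num_shards >= 0 and num_nodes >= 1")
    base, extra = divmod(num_shards, num_nodes)
    # The first `extra` nodes each take one additional shard.
    return [base + 1 if i < extra else base for i in range(num_nodes)]


if __name__ == "__main__":
    # (shards, nodes) pairs: the two-shard example and the five-shard default.
    for shards, nodes in [(2, 1), (2, 2), (5, 5), (5, 6)]:
        print(f"{shards} shards on {nodes} nodes -> {spread_shards(shards, nodes)}")
```

In the last case the sixth node holds zero primary shards, which is the point of the quoted passage: with five primaries and no replicas, a sixth node adds nothing for this index.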

https://www.elastic.co/guide/en/elasticsearch/guide/current/kagillion-shards.html

A shard is not free. Remember:

A shard is a Lucene index under the covers, which uses file handles, memory, and CPU cycles.
Every search request needs to hit a copy of every shard in the index. That’s fine if every shard is sitting on a different node, but not if many shards have to compete for the same resources.
Term statistics, used to calculate relevance, are per shard. Having a small amount of data in many shards leads to poor relevance.
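The point about per-shard term statistics can be illustrated with a small calculation. The sketch below assumes a simplified inverse document frequency of the form 1 + ln(numDocs / (docFreq + 1)), in the style of classic TF/IDF scoring. Both the formula and the document counts are assumptions made for illustration. Actual scoring depends on the Elasticsearch version and the configured similarity.

```python
import math


def idf(num_docs: int, doc_freq: int) -> float:
    """Simplified inverse document frequency: 1 + ln(num_docs / (doc_freq + 1))."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# Hypothetical counts: (documents in the shard, documents containing the term).
shard_a = (10, 5)
shard_b = (10, 1)

idf_a = idf(*shard_a)
idf_b = idf(*shard_b)
# What a single shard holding all 20 documents would compute.
idf_single = idf(shard_a[0] + shard_b[0], shard_a[1] + shard_b[1])

print(f"shard A: {idf_a:.3f}  shard B: {idf_b:.3f}  single shard: {idf_single:.3f}")
```

The same term gets a different weight on each shard, and neither matches the weight computed over all documents. With little data per shard, these local differences are large enough to skew the ranking.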

https://www.elastic.co/guide/en/elasticsearch/reference/current/_basic_concepts.html

Replication is important for two primary reasons:

It provides high availability in case a shard/node fails. For this reason, it is important to note that a replica shard is never allocated on the same node as the original/primary shard that it was copied from.
It allows you to scale out your search volume/throughput since searches can be executed on all replicas in parallel.
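The rule that a replica never lives on the same node as its primary can be sketched with a toy placement routine. `place_copies` below is a hypothetical greedy helper written for this note. It only models the "no two copies of one shard on a node" constraint and is not how Elasticsearch allocates shards.

```python
def place_copies(primaries: int, replicas: int, num_nodes: int):
    """Greedy placement in which no node holds two copies of the same shard.

    Returns (nodes, unassigned). nodes[i] lists labels such as "P0" (primary
    of shard 0) or "R1.0" (replica 0 of shard 1). unassigned lists the copies
    for which no valid node was left.
    """
    nodes = [[] for _ in range(num_nodes)]
    holds = [set() for _ in range(num_nodes)]  # shard ids present on each node
    unassigned = []

    # Place all primaries first, then the replicas.
    copies = [(s, f"P{s}") for s in range(primaries)]
    copies += [(s, f"R{s}.{r}") for r in range(replicas) for s in range(primaries)]

    for shard, label in copies:
        # Only nodes without any copy of this shard are candidates.
        candidates = [n for n in range(num_nodes) if shard not in holds[n]]
        if not candidates:
            unassigned.append(label)
            continue
        target = min(candidates, key=lambda n: len(nodes[n]))  # least loaded
        nodes[target].append(label)
        holds[target].add(shard)
    return nodes, unassigned


if __name__ == "__main__":
    print(place_copies(2, 1, 1))  # one node: replicas have nowhere to go
    print(place_copies(2, 1, 2))  # two nodes: every copy finds a home
```

With a single node the replicas stay unassigned, which is why replicas only add availability once there is a second node to hold them.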

To summarize, each index can be split into multiple shards. An index can also be replicated zero (meaning no replicas) or more times. Once replicated, each index will have primary shards (the original shards that were replicated from) and replica shards (the copies of the primary shards). The number of shards and replicas can be defined per index at the time the index is created. After the index is created, you may change the number of replicas dynamically at any time, but you cannot change the number of shards after the fact.
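The difference between the two settings can be sketched as request bodies. The helpers `create_index_body` and `update_settings_body` are hypothetical names used only in this note. They build plain dictionaries and send nothing to a cluster. The only rule encoded is the one stated above: `number_of_replicas` may change later, `number_of_shards` may not.

```python
import json

# Per the text: replica count is dynamic, shard count is fixed at creation.
FIXED_AT_CREATION = {"number_of_shards"}


def create_index_body(primaries: int, replicas: int) -> dict:
    """Body for creating an index, e.g. PUT /my_index."""
    return {
        "settings": {
            "number_of_shards": primaries,
            "number_of_replicas": replicas,
        }
    }


def update_settings_body(**changes) -> dict:
    """Body for a later settings update; rejects settings fixed at creation."""
    blocked = FIXED_AT_CREATION.intersection(changes)
    if blocked:
        raise ValueError(f"cannot change after creation: {sorted(blocked)}")
    return dict(changes)


if __name__ == "__main__":
    print("PUT /my_index")
    print(json.dumps(create_index_body(2, 0), indent=2))
    print("PUT /my_index/_settings")
    print(json.dumps(update_settings_body(number_of_replicas=1), indent=2))
```

Calling `update_settings_body(number_of_shards=3)` raises `ValueError`, mirroring the restriction described in the summary.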

https://www.elastic.co/guide/en/elasticsearch/guide/current/replica-shards.html

However, replica shards can serve read requests. If, as is often the case, your index is search heavy, you can increase search performance by increasing the number of replicas, but only if you also add extra hardware.

Let’s return to our example of an index with two primary shards. We increased the capacity of the index by adding a second node. Adding more nodes would not help us to add indexing capacity, but we could take advantage of the extra hardware at search time by increasing the number of replicas.
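The arithmetic behind this is simple enough to write down. The sketch below uses a hypothetical helper, `total_shard_copies`, and assumes every shard copy (primary or replica) sits on its own node, which is the best case for search throughput.

```python
def total_shard_copies(primaries: int, replicas: int) -> int:
    """Primaries plus all of their replicas."""
    return primaries * (1 + replicas)


if __name__ == "__main__":
    # The two-primary example, with a growing number_of_replicas.
    for replicas in range(3):
        copies = total_shard_copies(2, replicas)
        print(
            f"number_of_replicas={replicas}: {copies} shard copies, "
            f"so up to {copies} nodes can serve searches"
        )
```

Raising `number_of_replicas` on the two-shard index from 0 to 1 doubles the number of shard copies, so the extra copies only help if there are extra nodes to put them on.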
