  • Elasticsearch - deciding the number of shards and replicas
    Programming/Search 2016. 8. 5. 21:11

    Notes saving the Elasticsearch recommendations for later reference.



    https://www.elastic.co/guide/en/elasticsearch/guide/current/scale.html


    https://www.elastic.co/guide/en/elasticsearch/guide/current/shard-scale.html


    https://www.elastic.co/guide/en/elasticsearch/guide/current/overallocation.html


    PUT /my_index
    {
      "settings": {
        "number_of_shards":   2, 
        "number_of_replicas": 0
      }
    }


    This time, when we add a second node, Elasticsearch will automatically move one shard from the first node to the second node, as depicted in Figure 50, “An index with two shards can take advantage of a second node”. Once the relocation has finished, each shard will have access to twice the computing power that it had before.
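
    A quick way to check where each shard sits after the relocation is the cat shards API. This is only a sketch: it assumes the index is still called my_index, as in the request above, and the node names in the output depend entirely on the cluster.

    ```
    GET /_cat/shards/my_index?v
    ```

    Each row is one shard copy; the prirep column shows p for a primary or r for a replica, and the node column shows which node currently holds it.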


    A new index in Elasticsearch is allotted five primary shards by default. That means that we can spread that index out over a maximum of five nodes, with one shard on each node. That’s a lot of capacity, and it happens without you having to think about it at all.
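
    Since the default can differ between Elasticsearch versions, it may be safer to read the value back than to assume it. A minimal sketch, again assuming an index named my_index:

    ```
    GET /my_index/_settings
    ```

    The response should list index.number_of_shards and index.number_of_replicas for the index.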


    https://www.elastic.co/guide/en/elasticsearch/guide/current/kagillion-shards.html

    A shard is not free. Remember:

    • A shard is a Lucene index under the covers, which uses file handles, memory, and CPU cycles.
    • Every search request needs to hit a copy of every shard in the index. That’s fine if every shard is sitting on a different node, but not if many shards have to compete for the same resources.
    • Term statistics, used to calculate relevance, are per shard. Having a small amount of data in many shards leads to poor relevance.
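
    To get a rough idea of how many shards are competing for each node's resources, the cat allocation API reports the shard count and disk usage per node. A sketch that needs no particular index name:

    ```
    GET /_cat/allocation?v
    ```

    A node holding far more shards than the others is a likely place to look for the resource contention described above.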



    https://www.elastic.co/guide/en/elasticsearch/reference/current/_basic_concepts.html



    Replication is important for two primary reasons:

    • It provides high availability in case a shard/node fails. For this reason, it is important to note that a replica shard is never allocated on the same node as the original/primary shard that it was copied from.
    • It allows you to scale out your search volume/throughput since searches can be executed on all replicas in parallel.

    To summarize, each index can be split into multiple shards. An index can also be replicated zero (meaning no replicas) or more times. Once replicated, each index will have primary shards (the original shards that were replicated from) and replica shards (the copies of the primary shards). The number of shards and replicas can be defined per index at the time the index is created. After the index is created, you may change the number of replicas dynamically at any time, but you cannot change the number of shards after the fact.
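
    The rule that a replica never shares a node with its primary can be observed through the cluster health API. A sketch, assuming an index named my_index that has at least one replica configured:

    ```
    GET /_cluster/health/my_index
    ```

    On a single-node cluster the replicas have nowhere to go, so unassigned_shards stays above zero and the status is reported as yellow until another node joins.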



    https://www.elastic.co/guide/en/elasticsearch/guide/current/replica-shards.html


    However, replica shards can serve read requests. If, as is often the case, your index is search heavy, you can increase search performance by increasing the number of replicas, but only if you also add extra hardware.

    Let’s return to our example of an index with two primary shards. We increased capacity of the index by adding a second node. Adding more nodes would not help us to add indexing capacity, but we could take advantage of the extra hardware at search time by increasing the number of replicas.
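
    One way to raise the replica count on an existing index is the update index settings API. A sketch, assuming the index is my_index as above; the value 1 is chosen only for illustration.

    ```
    PUT /my_index/_settings
    {
      "number_of_replicas": 1
    }
    ```

    With two primary shards and one replica each, the index would then consist of four shard copies in total, and the replicas only add search capacity if there are nodes available to host them.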


