Multiple indices - Elasticsearch: The Definitive Guide (Chinese edition)



Published 2019-07-04

=== Multiple Indices

Finally, remember that there is no rule that limits your application to using
only a single index. When we issue a search request, it is forwarded to a
copy (a primary or a replica) of each shard in the index. If we issue the
same search request on multiple indices, exactly the same thing happens; there
are just more shards involved.

TIP: Searching 1 index of 50 shards is exactly equivalent to searching
50 indices with 1 shard each: both search requests hit 50 shards.

This can be a useful fact to remember when you need to add capacity on the
fly. Instead of having to reindex your data into a bigger index, you can
just do the following:

  • Create a new index to hold new data.
  • Search across both indices to retrieve new and old data.
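
As a minimal sketch of the second step, a single search request can name both indices in a comma-separated list. The names `old_index` and `new_index` are placeholders, not indices defined in this chapter, and `match_all` stands in for whatever query the application really runs:

```json
GET /old_index,new_index/_search
{
  "query": { "match_all": {} }
}
```

The request fans out to the shards of both indices, and the results come back merged into one response, just as they would for a single index.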

In fact, with a little forethought, adding a new index can be done in a
completely transparent way, without your application ever knowing that
anything has changed.

Earlier, we spoke about using an index alias to point to the
current version of your index. For instance, instead of naming your index
`tweets`, name it `tweets_v1`. Your application would still talk to `tweets`,
but in reality that would be an alias that points to `tweets_v1`. This allows
you to switch the alias to point to a newer version of the index on the fly.

A similar technique can be used to expand capacity by adding a new index. It
requires a bit of planning because you will need two aliases: one for
searching and one for indexing:


PUT /tweets_1/_alias/tweets_search
PUT /tweets_1/_alias/tweets_index

Both the `tweets_search` and the `tweets_index` alias point to
index `tweets_1`.

New documents should be indexed into `tweets_index`, and searches should be
performed against `tweets_search`. For the moment, these two aliases point to
the same index.
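
From the application's side, this might look like the following sketch. The type name `tweet`, the document ID `1`, and the `text` field are illustrative assumptions rather than names taken from this chapter; on recent Elasticsearch versions, which no longer have mapping types, the indexing path would use `_doc` in place of the type name:

```json
PUT /tweets_index/tweet/1
{
  "text": "Indexing through the tweets_index alias"
}

GET /tweets_search/_search
{
  "query": { "match": { "text": "alias" } }
}
```

Neither request mentions `tweets_1` directly, so the application needs no changes when the aliases are later repointed.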

When we need extra capacity, we can create a new index called `tweets_2` and
update the aliases as follows:


POST /_aliases
{
  "actions": [
    { "add":    { "index": "tweets_2", "alias": "tweets_search" }},
    { "remove": { "index": "tweets_1", "alias": "tweets_index"  }},
    { "add":    { "index": "tweets_2", "alias": "tweets_index"  }}
  ]
}

Add index `tweets_2` to the `tweets_search` alias.
Switch `tweets_index` from `tweets_1` to `tweets_2`.
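
To check that the switch took effect, one option is to ask Elasticsearch which indices each alias currently points to. This is only a quick sanity check, assuming the alias names used above:

```json
GET /*/_alias/tweets_search
GET /*/_alias/tweets_index
```

After the update, the first request should list both `tweets_1` and `tweets_2`, while the second should list only `tweets_2`.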

A search request can target multiple indices, so having the search alias point
to both `tweets_1` and `tweets_2` is perfectly valid. However, indexing requests
can target only a single index. For this reason, we have to switch the
`tweets_index` alias to point to only the new index.

A document `GET` request, like an indexing request, can target only one index.
This makes retrieving a document by ID a bit more complicated in this
scenario. Instead, run a search request with the `ids` query, or do a
multi-get request on `tweets_1` and `tweets_2`.
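
Both alternatives can be sketched as follows. The document ID `1` is a hypothetical value chosen for illustration:

```json
GET /tweets_search/_search
{
  "query": {
    "ids": { "values": ["1"] }
  }
}

GET /_mget
{
  "docs": [
    { "_index": "tweets_1", "_id": "1" },
    { "_index": "tweets_2", "_id": "1" }
  ]
}
```

The `ids` query goes through the search alias and so covers every index behind it. The multi-get request names each index explicitly and returns one entry per requested document; the entry for an index that does not hold the document is reported as not found.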


Using multiple indices to expand index capacity on the fly is of particular
benefit when dealing with time-based data such as logs or social-event
streams, which we discuss in the next section.
