Some ways to optimize Elasticsearch queries

Tram Ho

Search with as few fields as possible

The more fields a query_string or multi_match query targets, the slower it runs. A common technique to improve search speed across multiple fields is to copy their values into a single field at index time and then search on that field. This can be automated with the copy_to mapping parameter. Here is an example:
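A minimal sketch of this technique, written as Python dictionaries that mirror the JSON request bodies. The index name movies and the fields name, plot, and name_and_plot are assumptions chosen for illustration, as is the search text.

```python
import json

# Hypothetical index "movies": the values of name and plot are copied
# into name_and_plot when each document is indexed.
mapping = {
    "mappings": {
        "properties": {
            "name_and_plot": {"type": "text"},
            "name": {"type": "text", "copy_to": "name_and_plot"},
            "plot": {"type": "text", "copy_to": "name_and_plot"},
        }
    }
}

# The search targets the single combined field with a match query
# instead of running a multi_match over name and plot.
search = {"query": {"match": {"name_and_plot": "space adventure"}}}

print("PUT /movies")
print(json.dumps(mapping, indent=2))
print("GET /movies/_search")
print(json.dumps(search, indent=2))
```

The combined field costs extra disk space, so this trade-off makes most sense for fields that are almost always searched together.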

Pre-index data

Take advantage of patterns in your queries to optimize how the data is indexed. For example, if all your documents have a price field and most queries run range aggregations over a fixed list of ranges (for example, 0-10, 10-100, 100-1000, …), you can make these aggregations faster by pre-indexing the range into each document and using a terms aggregation instead.

Without pre-indexing, we would search with a range aggregation on the price field like this:
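As a sketch, assuming a sample document with a designation and a price (both hypothetical values), the range aggregation over the fixed list of ranges could look like the following. In a range aggregation, from is inclusive and to is exclusive.

```python
import json

# Hypothetical sample document.
document = {"designation": "spoon", "price": 13}

# Range aggregation computed at search time on the numeric price field.
range_search = {
    "aggs": {
        "price_ranges": {
            "range": {
                "field": "price",
                "ranges": [
                    {"from": 0, "to": 10},
                    {"from": 10, "to": 100},
                    {"from": 100, "to": 1000},
                ],
            }
        }
    }
}

print("GET /index/_search")
print(json.dumps(range_search, indent=2))
```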

Or we can add a keyword field, price_range, that stores the range each document falls into at index time:
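One way to do this is to compute the range label in the application before sending the document. The helper price_range and the label format "10-100" below are assumptions for illustration, not part of Elasticsearch.

```python
import json

# Fixed list of ranges; lower bound inclusive, upper bound exclusive.
BUCKETS = [(0, 10), (10, 100), (100, 1000)]


def price_range(price):
    """Return the label of the bucket that contains price, or None if no bucket matches."""
    for low, high in BUCKETS:
        if low <= price < high:
            return f"{low}-{high}"
    return None


# The new field is mapped as keyword so a terms aggregation can use it.
mapping = {"mappings": {"properties": {"price_range": {"type": "keyword"}}}}

# Enrich the (hypothetical) document before indexing it.
document = {"designation": "spoon", "price": 13}
document["price_range"] = price_range(document["price"])

print("PUT /index")
print(json.dumps(mapping, indent=2))
print("PUT /index/_doc/1")
print(json.dumps(document, indent=2))
```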

Then we aggregate on the price_range field instead of price:
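A sketch of that search, assuming the aggregation is named price_ranges: the terms aggregation simply counts documents per pre-indexed label.

```python
import json

# Terms aggregation on the pre-indexed keyword field.
terms_search = {
    "aggs": {
        "price_ranges": {
            "terms": {"field": "price_range"}
        }
    }
}

print("GET /index/_search")
print(json.dumps(terms_search, indent=2))
```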

Consider using the keyword type when mapping

Not all numeric data needs to be mapped as a numeric field type. Elasticsearch optimizes numeric fields, such as integer or long, for range queries. However, keyword fields are better for term and other term-level queries.

Identifiers, such as a product code or ID, are rarely used in range queries; they are usually retrieved with term-level queries.

Consider mapping numeric fields with the type keyword if:

  • You do not plan to search the field with range queries
  • Fast retrieval is important. Term queries on keyword fields are often faster than term queries on numeric fields

If you are unsure which type to use, you can use a multi-field to map the data as both keyword and a numeric type:
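A minimal sketch of such a mapping, assuming a hypothetical products index with a product_id field. Here the main field is long and the keyword sub-field is reachable as product_id.keyword; the reverse arrangement is equally possible.

```python
import json

# Hypothetical field product_id, indexed both as long and as keyword.
mapping = {
    "mappings": {
        "properties": {
            "product_id": {
                "type": "long",
                "fields": {"keyword": {"type": "keyword"}},
            }
        }
    }
}

# Term lookups use the keyword sub-field; range queries use the numeric field.
term_search = {"query": {"term": {"product_id.keyword": "1042"}}}
range_search = {"query": {"range": {"product_id": {"gte": 1000, "lt": 2000}}}}

print("PUT /products")
print(json.dumps(mapping, indent=2))
print(json.dumps(term_search))
print(json.dumps(range_search))
```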

Avoid script usage

If possible, avoid using scripts in searches. Because scripts cannot make use of index structures, using them in search queries can result in slower search speeds.

If you often use scripts to transform indexed data, you can speed up searches by transforming the data before indexing instead. However, this means you will spend more time indexing.

For example, an index named my_test_scores contains two long fields:

  • math_score
  • verbal_score

When running a search, users often use scripts to sort results by the sum of these two field values:
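Such a search might look like the sketch below, which uses a script-based sort. The descending order is an assumption; the script is evaluated for every matching document at search time, which is what makes it slow.

```python
import json

# Script sort: the sum is recomputed per document on every search.
script_sort_search = {
    "sort": [
        {
            "_script": {
                "type": "number",
                "script": {
                    "source": "doc['math_score'].value + doc['verbal_score'].value"
                },
                "order": "desc",
            }
        }
    ]
}

print("GET /my_test_scores/_search")
print(json.dumps(script_sort_search, indent=2))
```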

To speed up the search, you can perform this calculation while indexing and store the sum in a separate field to sort on.

First, add a new field, Total_score, to the index. The Total_score field will contain the sum of the math_score and verbal_score field values.
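A sketch of the mapping update, assuming Total_score is mapped as long like the two source fields:

```python
import json

# Add the new field to the existing index mapping.
mapping_update = {"properties": {"Total_score": {"type": "long"}}}

print("PUT /my_test_scores/_mapping")
print(json.dumps(mapping_update, indent=2))
```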

Next, use an ingest pipeline containing a script processor to calculate the sum of math_score and verbal_score and index it in the Total_score field.
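A possible pipeline definition is sketched below. The pipeline ID my_test_scores_pipeline and the description text are assumptions; the script processor runs a Painless script against each incoming document, exposed as ctx.

```python
import json

# Hypothetical pipeline ID, used only for illustration.
pipeline_id = "my_test_scores_pipeline"

pipeline = {
    "description": "Calculates the total test score",
    "processors": [
        {
            "script": {
                # Painless script: write the sum into the new field.
                "source": "ctx.Total_score = (ctx.math_score + ctx.verbal_score)"
            }
        }
    ],
}

print(f"PUT /_ingest/pipeline/{pipeline_id}")
print(json.dumps(pipeline, indent=2))
```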

To update existing data, use this pipeline to reindex the documents from my_test_scores into a new index, for example my_test_scores_2.
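The reindex request could be sketched as follows, again assuming the hypothetical pipeline ID my_test_scores_pipeline. The pipeline is set on the destination so every copied document passes through it.

```python
import json

reindex_request = {
    "source": {"index": "my_test_scores"},
    "dest": {
        "index": "my_test_scores_2",
        # Hypothetical pipeline ID; applied to each reindexed document.
        "pipeline": "my_test_scores_pipeline",
    },
}

print("POST /_reindex")
print(json.dumps(reindex_request, indent=2))
```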

Continue to use the pipeline to index any new documents into my_test_scores_2.
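For new documents, the pipeline can be named in the pipeline query parameter of the index request. The sketch below assumes the hypothetical pipeline ID my_test_scores_pipeline and made-up score values.

```python
import json
from urllib.parse import urlencode

# Hypothetical pipeline ID passed as a query parameter.
params = urlencode({"pipeline": "my_test_scores_pipeline"})
path = "/my_test_scores_2/_doc?" + params

# Sample values; Total_score is filled in by the pipeline, not by the client.
new_document = {"math_score": 800, "verbal_score": 750}

print("POST", path)
print(json.dumps(new_document, indent=2))
```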

Finally, searches can sort on the Total_score field instead of using a script:
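A sketch of the resulting search, assuming descending order: sorting on an indexed field uses the stored values directly, so no script runs at search time.

```python
import json

# Plain field sort on the pre-computed sum.
field_sort_search = {"sort": [{"Total_score": {"order": "desc"}}]}

print("GET /my_test_scores_2/_search")
print(json.dumps(field_sort_search, indent=2))
```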


Source : Viblo