What is Elasticsearch?
- Full-text search engine.
- NoSQL database.
- Analytics engine.
- Written in Java.
- Lucene-based.
- Inverted indices.
- Easy to scale.
- RESTful interface (HTTP/JSON)
- “Schemaless“.
- Real-time.
- ELK stack.
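To make the "inverted indices" item in the list above more concrete, here is a minimal sketch of the idea in Python. It is only a toy illustration: the sample documents and the function name `build_inverted_index` are made up for this example, and Lucene's real data structures are far more elaborate.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase term to the sorted ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


# Hypothetical sample documents, not taken from the article's data set.
docs = {
    1: "Learn Elasticsearch",
    2: "Learn Lucene",
    3: "Elasticsearch is built on Lucene",
}
inverted = build_inverted_index(docs)
print(inverted["learn"])          # ids of documents containing "learn"
print(inverted["elasticsearch"])  # ids of documents containing "elasticsearch"
```

Looking up a term is then a single dictionary access instead of a scan over every document, which is the property that makes full-text search fast.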
Download Elasticsearch.
This article uses Elasticsearch 7.5.
After downloading and installing it, start Elasticsearch. You can point your browser at http://localhost:9200 (or use curl, which I prefer) to check whether Elasticsearch started successfully. This is the response:
```json
{
  "name" : "DESKTOP-IH6ABIE",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "m2jnECTRSkyYi6qFD0rNMA",
  "version" : {
    "number" : "7.5.2",
    "build_flavor" : "default",
    "build_type" : "tar",
    "build_hash" : "8bec50e1e0ad29dad5653712cf3bb580cd1afcdf",
    "build_date" : "2020-01-15T12:11:52.313576Z",
    "build_snapshot" : false,
    "lucene_version" : "8.3.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}
```
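The same check can also be scripted. The sketch below uses only Python's standard library; the helper names `fetch_cluster_info` and `summarize` are assumptions made for this example, and the network call is left commented out so the snippet also runs without a node.

```python
import json
import urllib.request


def fetch_cluster_info(base_url="http://localhost:9200"):
    """GET the root endpoint of a node and return the decoded JSON body."""
    with urllib.request.urlopen(base_url, timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))


def summarize(info):
    """Pick out the fields that show which node and version answered."""
    return f'{info["name"]} runs Elasticsearch {info["version"]["number"]}'


# Offline demonstration with values copied from the response shown above.
sample = {"name": "DESKTOP-IH6ABIE", "version": {"number": "7.5.2"}}
print(summarize(sample))

# With a node running locally:
# print(summarize(fetch_cluster_info()))
```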
Some concepts.
Compared with a relational database (RDBMS), the following terms can be read as rough equivalents.
RDBMS | Elasticsearch |
---|---|
Database | Index |
Table | Type |
Row | Document |
Index.
To create a database (called an index in Elasticsearch), send a PUT request with the index name. For example, to create the index post:
```
# REQUEST
PUT /post

# RESPONSE
{
  "acknowledged": true,
  "shards_acknowledged": true,
  "index": "post"
}
```
Document.
To create a document, just send a piece of JSON and assign it an id:
```
# REQUEST
PUT /post/_doc/1
{
  "language": "en-US",
  "title": "Learn Elasticsearch",
  "date": "2020-02-04",
  "author": "Me!"
}

# RESPONSE
{
  "_index": "post",
  "_type": "_doc",
  "_id": "1",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 0,
  "_primary_term": 1
}
```
In the request above, post is the index name, _doc is the type, and 1 is the id.
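As a rough illustration of how the path is assembled from index, type and id, the sketch below builds (but does not send) the same PUT request with Python's standard library. The helper name `build_index_request` and the default base URL are assumptions for this example.

```python
import json
import urllib.request


def build_index_request(index, doc_id, doc, base_url="http://localhost:9200"):
    """Build (without sending) the PUT request that stores `doc` under `doc_id`."""
    url = f"{base_url}/{index}/_doc/{doc_id}"
    return urllib.request.Request(
        url,
        data=json.dumps(doc).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


req = build_index_request(
    "post",
    1,
    {"language": "en-US", "title": "Learn Elasticsearch", "date": "2020-02-04", "author": "Me!"},
)
print(req.get_method(), req.full_url)

# Sending it requires a running node:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```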
A bit more about type: in older Elasticsearch versions, where one index could hold several types, every stored document belonged to one index and one mapping type. For example, the index twitter has the types user and tweet, and each type can have its own fields: user has user_name and email, while tweet has content, tweeted_at and also user_name. (Documents are created in the same way: PUT /twitter/user/1, PUT /twitter/tweet/1.)
In Elasticsearch, people often think of an index as a database in an SQL database and of a type as a table. This is a poor analogy and it leads to bad assumptions. In an SQL database, tables are independent of each other: two columns with the same name in two different tables have nothing to do with each other. In Elasticsearch this is not the case: fields with the same name in different types of the same index are backed by the same Lucene field internally, so for example user_name must have the same mapping in both user and tweet. This has several unpleasant consequences. There are two alternatives:
- Give each type its own index.
- Or use a custom type field.
For this reason, from Elasticsearch 7.x it is no longer necessary to specify a type in the index creation API.
From Elasticsearch 8, specifying a type in the APIs is no longer supported.
For details, see:
https://www.elastic.co/guide/en/elasticsearch/reference/current/removal-of-types.html
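The second alternative listed above, a custom type field, could look roughly like the following sketch. The helper `to_single_index_doc`, the id prefix scheme and the sample values are assumptions for illustration; the field names come from the twitter example.

```python
def to_single_index_doc(doc_type, doc_id, source):
    """Return (_id, _source) for a single index that uses a `type` field instead of mapping types."""
    new_id = f"{doc_type}-{doc_id}"            # prefix keeps ids unique across the former types
    new_source = {"type": doc_type, **source}  # the former type becomes an ordinary field
    return new_id, new_source


# Hypothetical sample values.
user = to_single_index_doc("user", 1, {"user_name": "alice", "email": "alice@example.com"})
tweet = to_single_index_doc(
    "tweet", 1, {"user_name": "alice", "content": "hello", "tweeted_at": "2020-02-04"}
)
print(user)
print(tweet)
```

Both documents can then live in one twitter index, and a query can filter on the type field when only users or only tweets are wanted.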
Back to the example: after creating the post document, we can retrieve it with the GET method:
```
# REQUEST
GET /post/_doc/1

# RESPONSE
{
  "_index": "post",
  "_type": "_doc",
  "_id": "1",
  "_version": 1,
  "_seq_no": 0,
  "_primary_term": 1,
  "found": true,
  "_source": {
    "language": "en-US",
    "title": "Learn Elasticsearch",
    "date": "Fri, 09 Dec 2019 09:30:27 +0000",
    "author": "Me!"
  }
}
```
Mapping.
At the beginning I said that Elasticsearch is schemaless, but that is not entirely true. Let's check the mapping of the index post from the previous example:
```
# REQUEST
GET post/_mapping

# RESPONSE
{
  "post": {
    "mappings": {
      "properties": {
        "author": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "date": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "language": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "title": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        }
      }
    }
  }
}
```
(text: analyzed, keyword: not analyzed)
As you can see, every field is text: when no mapping is declared, Elasticsearch guesses the data type for us (dynamic mapping).
That is not ideal. For example, a date in a format such as "Fri, 09 Dec 2019 09:30:27 +0000", or a number sent as a string, ends up as plain text.
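To give a feel for this guessing, here is a deliberately simplified Python sketch. It is an approximation written for this article, not Elasticsearch's actual implementation: the function `guess_field_type`, the regular expression and the sample field names are all assumptions, and only ISO-8601-like strings are treated as dates here.

```python
import re

# Rough stand-in for date detection: ISO-8601-like strings only (an assumption of this sketch).
ISO_DATE = re.compile(
    r"\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:?\d{2})?)?"
)


def guess_field_type(value):
    """Guess a field type from a JSON value, loosely imitating dynamic mapping."""
    if isinstance(value, bool):  # test bool first: bool is a subclass of int in Python
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "date" if ISO_DATE.fullmatch(value) else "text"
    if isinstance(value, dict):
        return "object"
    return None  # null and arrays are not handled in this sketch


# Hypothetical document; only title and date mirror fields used in the article.
doc = {
    "title": "Learn Elasticsearch",
    "date": "Fri, 09 Dec 2016 09:30:27 +0000",
    "published_at": "2016-12-09T09:30:27Z",
    "views": 42,
    "views_as_text": "42",
}
for field, value in doc.items():
    print(field, "->", guess_field_type(value))
```

The point of the sketch is that the guess depends on how the value is written in JSON, which is why declaring the mapping explicitly is usually the safer choice.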
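We can define the mapping ourselves when creating the index, simply by sending it as JSON (if the post index already exists, it has to be deleted first, for example with DELETE /post):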
```
# REQUEST
PUT /post
{
  "mappings": {
    "properties": {
      "author": {
        "type": "keyword"
      },
      "date": {
        "type": "date",
        "format": "E, dd MMM yyyy HH:mm:ss Z"
      },
      "language": {
        "type": "keyword"
      },
      "title": {
        "type": "text"
      }
    }
  }
}
```
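The date format in this mapping has to match the strings stored in the documents; a value that does not fit the declared format is rejected when the document is indexed. As a rough offline check, the sketch below parses a date with Python's `strptime`, using a pattern that I assume to be a close equivalent of "E, dd MMM yyyy HH:mm:ss Z". The names `PY_FORMAT` and `parse_post_date` are made up for this example, and `%a`/`%b` expect English names under the default locale.

```python
from datetime import datetime

# Assumed Python counterpart of the mapping format "E, dd MMM yyyy HH:mm:ss Z".
PY_FORMAT = "%a, %d %b %Y %H:%M:%S %z"


def parse_post_date(value):
    """Parse a date string written in the format declared in the mapping."""
    return datetime.strptime(value, PY_FORMAT)


parsed = parse_post_date("Fri, 09 Dec 2016 09:30:27 +0000")
print(parsed.isoformat())

# A value in another format does not parse:
try:
    parse_post_date("2020-02-04")
except ValueError as exc:
    print("rejected:", exc)
```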
Analyzers.
First, look at the following mapping request (similar to the one above), which declares an analyzer for the title field.
```
# REQUEST
PUT /post
{
  "mappings": {
    "properties": {
      "author": {
        "type": "keyword"
      },
      "date": {
        "type": "date",
        "format": "yyyy-MM-dd"
      },
      "language": {
        "type": "keyword"
      },
      "title": {
        "type": "text",
        "fields": {
          "english": {
            "type": "text",
            "analyzer": "english"
          },
          "raw": {
            "type": "keyword"
          }
        }
      }
    }
  }
}
```
What is an analyzer?
Analyzed vs. not analyzed <=> full-text vs. exact value.
An analyzer usually consists of these steps:
- Character filters (replace characters).
- Tokenizer (breaks the text into terms).
- Token filters (add, remove or modify tokens).
See the built-in analyzers of Elasticsearch here: https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-analyzers.html
Analyzer example:
Hey man, how are you doing?
- Whitespace analyzer: Hey | man, | how | are | you | doing? |
- English analyzer: hei | man | how | you | do |
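The three steps can be imitated in a few lines of Python. This is a toy pipeline under stated assumptions: the stop-word list is a tiny made-up subset, there is no stemming, and the function names are invented for this sketch, so its second output is close to but not the same as the English analyzer line above (it keeps "hey" and "doing" unstemmed).

```python
import re

STOP_WORDS = {"a", "an", "and", "are", "is", "the", "to"}  # tiny illustrative subset


def char_filter(text):
    """Character filter: replace punctuation with spaces."""
    return re.sub(r"[^\w\s]", " ", text)


def whitespace_tokenizer(text):
    """Tokenizer: split the text on whitespace."""
    return text.split()


def token_filters(tokens):
    """Token filters: lowercase every token, then drop stop words."""
    lowered = [token.lower() for token in tokens]
    return [token for token in lowered if token not in STOP_WORDS]


def analyze(text):
    """Run the three steps in order: character filter, tokenizer, token filters."""
    return token_filters(whitespace_tokenizer(char_filter(text)))


text = "Hey man, how are you doing?"
print(whitespace_tokenizer(text))  # tokenizer only, like the whitespace analyzer
print(analyze(text))               # full toy pipeline, without stemming
```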
Test the analyzers we just set up as follows:
```
GET post/_analyze
{
  "field": "title.english",
  "text": "Hey man, how are you doing?"
}
# returns: hei man how you do

GET post/_analyze
{
  "field": "title.raw",
  "text": "Hey man, how are you doing?"
}
# returns the text unchanged, as a single token
```
Now run a search. Suppose there are many documents and you search for the word working:
```
POST /post/_search
{
  "query": {
    "multi_match": {
      "query": "working",
      "fields": ["title.raw"]
    }
  }
}
# title.raw is a keyword field, so this only matches titles that are exactly "working"

POST /post/_search
{
  "query": {
    "multi_match": {
      "query": "working",
      "fields": ["title.english"]
    }
  }
}
# also returns titles containing work, working, working?, ...
```
Search
First, import this data set: https://gist.githubusercontent.com/lumosnysm/664e4b76c81eacefaa515c7c1133823c/raw/ebbd60808a868bc3626497d77e3f984747dfd9bb/post.json
```
curl -H "Content-Type: application/json" -XPOST "localhost:9200/post/_bulk?pretty&refresh" --data-binary "@post.json"
```
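The _bulk endpoint expects newline-delimited JSON: one action line followed by one source line per document, with a final newline at the end. The sketch below shows how such a body could be produced in Python. The helper `to_bulk_body` and the two sample documents are assumptions for illustration; I have not checked how the linked post.json is laid out internally.

```python
import json


def to_bulk_body(index, docs):
    """Serialize (doc_id, source) pairs into the newline-delimited _bulk format."""
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))  # action line
        lines.append(json.dumps(source, ensure_ascii=False))                   # source line
    return "\n".join(lines) + "\n"  # the body has to end with a newline


# Hypothetical documents shaped like the ones in the article.
docs = [
    ("1", {"title": "Learn Elasticsearch", "author": "Me!"}),
    ("2", {"title": "Learn Lucene", "author": "Me!"}),
]
body = to_bulk_body("post", docs)
print(body, end="")
```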
To fetch all documents, use the GET method.
The results come back paginated (10 hits per page by default), like this:
```
GET /post/_search?pretty

# RESPONSE
{
  "took": 13,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 963,
      "relation": "eq"
    },
    "max_score": 1.0,
    "hits": [
      {
        "_index": "post",
        "_type": "_doc",
        "_id": "6581",
        "_score": 1.0,
        "_source": {
          "language": "en",
          "title": "Combell won the Twinkle Award in the “Hosting & Domain” category!",
          "date": "Fri, 09 Dec 2016 09:30:27 +0000",
          "author": "Combell",
          "category": [
            "Combell news",
            "award",
            "awards",
            "Combell",
            "twinkle"
          ],
          "guid": "6581"
        }
      },
      ......
      ......
```
Alternatively, you can use POST and send a JSON body; the result is exactly the same:
```
POST /post/_search?pretty
{
  "query": {
    "match_all": {}
  }
}
```
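Since search results are paginated, a request body can also say which page it wants through the `from` and `size` parameters. The sketch below computes them from a 1-based page number; the helper name `search_page_body` and its defaults are assumptions for this example.

```python
import json


def search_page_body(page, per_page=10, query=None):
    """Build a _search body for a 1-based page number using from/size."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be >= 1")
    return {
        "from": (page - 1) * per_page,  # number of hits to skip
        "size": per_page,               # number of hits to return
        "query": query or {"match_all": {}},
    }


# Body for the third page of ten hits.
print(json.dumps(search_page_body(3), indent=2))
```

Keep in mind that `from` + `size` is limited by the index setting index.max_result_window (10000 by default), so very deep pagination needs a different approach.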
Similarly, the two requests below are equivalent:
```
# get all posts whose author is "Combell"
GET post/_search?pretty&q=author:Combell

POST post/_search?pretty
{
  "query": {
    "match": {
      "author": "Combell"
    }
  }
}
```
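For a simple single-term search like this one, the two forms can be generated side by side. The sketch below is only an illustration with invented helper names (`uri_search_path`, `match_body`); it also shows that the value of `q` gets URL-encoded when it is built programmatically.

```python
from urllib.parse import urlencode


def uri_search_path(index, field, value):
    """Build the URI-search form: /<index>/_search?q=<field>:<value> (URL-encoded)."""
    return f"/{index}/_search?" + urlencode({"q": f"{field}:{value}"})


def match_body(field, value):
    """Build the request-body form using a match query."""
    return {"query": {"match": {field: value}}}


print(uri_search_path("post", "author", "Combell"))
print(match_body("author", "Combell"))
```

The request-body form is usually easier to extend once a search needs more than one condition.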
To count documents, use the `_count` API:
```
POST /post/_count
{
  "query": {
    "bool": {
      "filter": {
        "term": {
          "title.raw": "Combell won the Twinkle Award in the “Hosting & Domain” category!"
        }
      }
    }
  }
}
# returns 1, because searching the raw field matches the exact whole sentence

POST /post/_count
{
  "query": {
    "match": {
      "title": "Combell won the Twinkle Award in the “Hosting & Domain” category!"
    }
  }
}
# returns 546, because full-text search matches on individual words
```
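The gap between the two counts can be illustrated with a toy Python model. This is only a sketch: the `analyze` function is a crude stand-in for a real analyzer (lowercase, then split into word tokens), and the three sample titles are made up for illustration.

```python
import re


def analyze(text):
    """Crude stand-in for an analyzer: lowercase and split into word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def term_count(titles, value):
    """Like a term filter on a non-analyzed field: whole-string equality."""
    return sum(1 for title in titles if title == value)


def match_count(titles, query):
    """Like a match query on an analyzed field: any shared token is a hit."""
    query_tokens = analyze(query)
    return sum(1 for title in titles if analyze(title) & query_tokens)


# Hypothetical sample data, not taken from the real index.
titles = [
    "Combell won the Twinkle Award",
    "The best hosting award of the year",
    "Joomla security update",
]

exact = term_count(titles, "Combell won the Twinkle Award")
loose = match_count(titles, "Combell won the Twinkle Award")
print(exact, loose)
```

The second title is counted by `match_count` only because it shares common words such as "the" and "award" with the query, which is the same effect that inflates the full-text count above.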
Filter and Query
Filter:
- Does the document match? (yes or no).
- Does not take relevance into account.
- Fast and cacheable.
- Used for non-analyzed fields (like the `raw` field I used above).
Query:
- How well does the document match?
- Full-text search.
- Used for analyzed fields.
Example using `filter`:
```
# search by multiple ids
POST /post/_search
{
  "query": {
    "bool": {
      "filter": {
        "ids": {
          "values": [6515, 6581, 6690]
        }
      }
    }
  }
}
```
```
POST /post/_search
{
  "query": {
    "bool": {
      "filter": {
        "bool": {
          "must": [
            {
              "term": {
                "language": "en"
              }
            },
            {
              "range": {
                "date": {
                  "gte": "2016-01-01",
                  "format": "yyyy-MM-dd"
                }
              }
            }
          ],
          "must_not": [
            {
              "term": {
                "category": "joomla"
              }
            }
          ],
          "should": [
            {
              "term": {
                "category": "Hosting"
              }
            },
            {
              "term": {
                "category": "evangelist"
              }
            }
          ]
        }
      }
    }
  }
}
```
The request above uses `must`, `must_not`, and `should`.
Put simply: `must` is AND, `must_not` is NOT, and `should` is OR.
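As a rough mental model, this boolean logic can be sketched in Python. It is an illustrative simulation, not how Elasticsearch evaluates queries internally; the names `bool_filter` and `term` and the sample document are assumptions. One detail worth knowing: in Elasticsearch 7.x, when a `bool` also has `must` or `filter` clauses, its `should` clauses are optional unless `minimum_should_match` is set, which the sketch models with a parameter.

```python
def bool_filter(doc, must=(), must_not=(), should=(), minimum_should_match=0):
    """Return True if doc passes a toy version of a bool filter.

    Each clause is a predicate that takes the document and returns a bool.
    """
    if not all(clause(doc) for clause in must):  # AND
        return False
    if any(clause(doc) for clause in must_not):  # NOT
        return False
    matched = sum(1 for clause in should if clause(doc))  # OR-style clauses
    return matched >= minimum_should_match


def term(field, value):
    """Toy term clause: exact match against a scalar or a list field."""
    def clause(doc):
        actual = doc.get(field)
        return value in actual if isinstance(actual, list) else actual == value
    return clause


# Hypothetical document, for illustration only.
doc = {"language": "en", "category": ["Hosting", "News"]}

passes = bool_filter(
    doc,
    must=[term("language", "en")],
    must_not=[term("category", "joomla")],
    should=[term("category", "Hosting"), term("category", "evangelist")],
    minimum_should_match=1,
)
print(passes)
```

With `minimum_should_match=0` the `should` clauses no longer restrict the result set, so a strict OR requirement needs the value set to 1 explicitly.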
Relevance
Consider the following example, which uses a `query`:
```
# find posts whose title matches 'good news' and whose language is English
POST /post/_search?pretty
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "title": "good news"
          }
        },
        {
          "bool": {
            "filter": {
              "term": {
                "language": "en"
              }
            }
          }
        }
      ]
    }
  }
}

# RESPONSE
{
  "took": 9,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 8,
      "relation": "eq"
    },
    "max_score": 9.71229,
    "hits": [
      {
        "_index": "post",
        "_type": "_doc",
        "_id": "3707",
        "_score": 9.71229,
        "_source": {
          "language": "en",
          "title": "Good news for you and your Exchange mailbox with Combell",
          "date": "Mon, 16 Dec 2013 13:30:55 +0000",
          "author": "Romy",
          "category": [
            "News"
          ],
          "guid": "3707"
        }
      },
      {
        "_index": "post",
        "_type": "_doc",
        "_id": "5895",
        "_score": 4.979878,
        "_source": {
          "language": "en",
          "title": "Apple.news: where iOS 9’s News app is to be found",
          "date": "Fri, 25 Sep 2015 09:56:41 +0000",
          "author": "Romy",
          "category": [
            "Combell news",
            "Domain names",
            "News",
            "Sector news",
            ".movie",
            ".news",
            ".xyz",
            "Apple",
            "apps",
            "new domain names",
            "new tld"
          ],
          "guid": "5895"
        }
      },
      ......
      ......
```
Note the `max_score` and `_score` fields. The first document contains both 'good' and 'news', so its score (about 9.71) is higher than that of the second document (about 4.98), which contains only 'news'. The results are returned sorted by score, from highest to lowest.
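If you consume this response from code, the scores can be read straight out of `hits.hits`. A minimal sketch in Python, using a trimmed copy of the response above as a string literal (in practice the JSON would come from the HTTP response body):

```python
import json

# Trimmed copy of the response shown above; only the fields used here are kept.
raw = """
{
  "hits": {
    "max_score": 9.71229,
    "hits": [
      {"_id": "3707", "_score": 9.71229},
      {"_id": "5895", "_score": 4.979878}
    ]
  }
}
"""

response = json.loads(raw)
scores = [(hit["_id"], hit["_score"]) for hit in response["hits"]["hits"]]

# Hits arrive sorted by _score in descending order, so the first hit carries max_score.
for doc_id, score in scores:
    print(doc_id, score)
```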
We can also use `constant_score`, as below, so that every matching document gets the same score of 1.0; we are then free to sort the results by any field we like. Used this way, Elasticsearch behaves more like a NoSQL database than a full-text search engine.
```
POST /post/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": { "category": "tools" }
      }
    }
  }
}
```
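Since every hit then scores 1.0, the ordering has to come from an explicit `sort`. Below is a hedged Python sketch that builds such a request body. Sorting on `date` in descending order is an assumption for illustration (it presumes `date` is mapped as a sortable field), and the helper name `constant_score_search` is hypothetical.

```python
import json


def constant_score_search(field, value, sort_field, order="desc"):
    """Build a search body: a constant_score filter plus an explicit sort."""
    return {
        "query": {
            "constant_score": {
                "filter": {"term": {field: value}}
            }
        },
        # With identical scores, this sort decides the order of the hits.
        "sort": [{sort_field: {"order": order}}],
    }


body = constant_score_search("category", "tools", sort_field="date")
print(json.dumps(body, indent=2))
```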
Here is another example:
```
POST /post/_search?pretty
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "title": "good news"
          }
        }
      ],
      "should": [
        {
          "match": {
            "category": "apps"
          }
        }
      ]
    }
  }
}
```
The request above uses `should`. Notably, `should` behaves differently in a `query` than in a `filter`. In a `filter`, `should` works as a plain OR with no effect on scoring, and when it sits next to `must` clauses the results are returned whether or not a `should` clause matches. In a `query`, a matching `should` clause boosts the relevance score of that document.
As before, the request above searches for posts whose title matches 'good news', but the score is additionally boosted when the document's category is 'apps'. Running it shows that the document with id 5895, whose categories include 'apps', now scores 7.7, up from about 4.98 without `should`.
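The higher score follows from how `bool` combines scores: the scores of all matching `must` and `should` clauses are added together. A toy Python sketch of that arithmetic follows. The per-clause score of 2.72 for the `should` clause does not come from any response; it is an assumption, simply the value implied by the two totals quoted above.

```python
def bool_score(must_scores, should_scores):
    """Toy model: a bool query's score is the sum of its matching clause scores.

    should_scores holds only the should clauses that actually matched.
    """
    return sum(must_scores) + sum(should_scores)


title_score = 4.979878  # score of the title match for doc 5895 (from the response)
category_score = 2.72   # assumption: implied by 7.7 - 4.98, not a measured value

without_should = bool_score([title_score], [])
with_should = bool_score([title_score], [category_score])

print(f"{without_should:.2f} {with_should:.2f}")
```

A document that does not match the `should` clause keeps its original score, which is why `should` in a query changes the ranking rather than the set of results.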
We can also set the boost of each query clause explicitly, as follows:
```
# query time boosting
POST /post/_search?pretty
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "title": "good news"
          }
        }
      ],
      "should": [
        {
          "match": {
            "category": {
              "query": "apps",
              "boost": 3 # if category matches apps, multiply this clause's score by 3
            }
          }
        },
        {
          "match": {
            "category": {
              "query": "Tools",
              "boost": 2 # if category matches tools, multiply this clause's score by 2
            }
          }
        }
      ]
    }
  }
}
```
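When there are many boosted terms, it can be convenient to generate the `should` clauses from a mapping. A small Python sketch, assuming a hypothetical helper `boosted_should` (not part of any client library). Keep in mind that `boost` acts as a multiplier on the clause's score rather than a fixed number of points added.

```python
import json


def boosted_should(field, boosts):
    """Turn {term: boost} into a list of match clauses with a per-clause boost."""
    return [
        {"match": {field: {"query": term, "boost": boost}}}
        for term, boost in boosts.items()
    ]


body = {
    "query": {
        "bool": {
            "must": [{"match": {"title": "good news"}}],
            "should": boosted_should("category", {"apps": 3, "Tools": 2}),
        }
    }
}
print(json.dumps(body, indent=2))
```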
Aggregation
Aggregations are essentially GROUP BY in a SQL database, but more powerful.
```
SELECT author, COUNT(*) FROM post GROUP BY author

# is similar to
POST /post/_search?pretty
{
  "aggs": {
    "popular_blogers": {
      "terms": {
        "field": "author"
      }
    }
  }
}

# RESPONSE
"aggregations": {
  "popular_blogers": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 1,
    "buckets": [
      {
        "key": "Romy",
        "doc_count": 458
      },
      {
        "key": "Jimmy Cappaert",
        "doc_count": 160
      },
      {
        "key": "Tom",
        "doc_count": 145
      },
      .......
      .......
```
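For intuition, a `terms` aggregation behaves much like counting values per key and keeping the most frequent ones. Here is a toy equivalent in Python over made-up in-memory documents. The sample data is hypothetical, and `size` mirrors the aggregation's `size` parameter, which defaults to 10 in Elasticsearch.

```python
from collections import Counter


def terms_agg(docs, field, size=10):
    """Toy terms aggregation: top `size` values of `field` with their doc counts."""
    counts = Counter(doc[field] for doc in docs if field in doc)
    return [{"key": key, "doc_count": n} for key, n in counts.most_common(size)]


# Hypothetical documents, for illustration only.
docs = [
    {"author": "Romy", "language": "en"},
    {"author": "Romy", "language": "nl"},
    {"author": "Tom", "language": "en"},
]

buckets = terms_agg(docs, "author")
print(buckets)
```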
We can even nest aggregations to analyze the data in more depth:
```
# group by author, then count how many posts each author has in each language.
POST /post/_search?pretty
{
  "aggs": {
    "popular_blogers": {
      "terms": {
        "field": "author"
      },
      "aggs": {
        "used_languages": {
          "terms": {
            "field": "language",
            "size": 10
          }
        }
      }
    }
  }
}

# RESPONSE
"aggregations": {
  "popular_blogers": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 1,
    "buckets": [
      {
        "key": "Romy",
        "doc_count": 458,
        "used_languages": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            {
              "key": "en",
              "doc_count": 284
            },
            {
              "key": "nl",
              "doc_count": 174
            }
          ]
        }
      },
      ......
      ......
```
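Nested buckets are easy to flatten on the client side. The Python sketch below walks a trimmed copy of the response above and produces one `(author, language, count)` row per inner bucket; the function name `flatten` is only illustrative.

```python
def flatten(aggregations, outer="popular_blogers", inner="used_languages"):
    """Yield (outer_key, inner_key, doc_count) for every inner bucket."""
    for outer_bucket in aggregations[outer]["buckets"]:
        for inner_bucket in outer_bucket[inner]["buckets"]:
            yield outer_bucket["key"], inner_bucket["key"], inner_bucket["doc_count"]


# Trimmed copy of the response above (first author bucket only).
aggregations = {
    "popular_blogers": {
        "buckets": [
            {
                "key": "Romy",
                "doc_count": 458,
                "used_languages": {
                    "buckets": [
                        {"key": "en", "doc_count": 284},
                        {"key": "nl", "doc_count": 174},
                    ]
                },
            }
        ]
    }
}

rows = list(flatten(aggregations))
for row in rows:
    print(row)
```

Here the inner counts add up to the outer bucket's `doc_count`, since each of Romy's posts has exactly one language.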
That wraps up this quick tour of some Elasticsearch basics. I hope you find it useful.
References
https://www.elastic.co/
https://github.com/ThijsFeryn/elasticsearch_tutorial