A brief introduction to Elasticsearch, from basic to advanced

Monday, 25/11/2019

Tram Ho

I. Introduction

Search is one of the most important functions in software today, especially as Big Data becomes more and more popular. To meet this growing demand, many different search solutions have emerged, among them Elasticsearch and Kibana. I have spent a lot of time studying this platform and found a lot of good things in it. This article is the first in a series in which I will share, in a simple and open way, what I have learned about Elasticsearch, from basic to advanced.

II. What is Elasticsearch?

Elasticsearch is an open source search engine that is highly scalable.
It allows us to store and analyze large amounts of information in real time.
Elasticsearch works with JSON documents. Using a specific internal structure, it can parse your data in real time and search for any information you want.

In the next articles I will analyze in more depth how things work and the powerful things Elasticsearch can do, which are extremely useful when working with big data. This article is an Elasticsearch 101 introduction for those who have never worked with it and do not yet have a picture of its processing flow.

ElasticSearch is currently available here

III. Definition and some terms

Some useful, frequently mentioned technical facts about Elasticsearch:

Real-time distributed search and analytics engine. (I left this in English so that it can serve as a keyword for everyone to search, rather than translating it into Vietnamese.)
It is open source and built on Java.
Its structure is based on documents instead of tables and schemas.

The biggest benefits I can feel are speed and scalability. It is designed so that queries return as quickly as possible. In terms of scalability, it can run on anything from a single laptop to hundreds of servers holding petabytes of data. Besides speed and scalability, it is highly resilient, meaning it copes flexibly with failures that affect data. Therefore, as mentioned above, Elasticsearch is very effective when working with large data sets, with many records changing in real time. That is what Elasticsearch does wonderfully well.

Regarding analytics, Elasticsearch provides users with the full log data so that we can find and analyze trends based on patterns in the data. For example:

Show data with a specific value, for example all 23-year-old users in the database

Search data by geographic location

Aggregate information by day
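
As a rough sketch of what those three examples might look like as search request bodies (the Query DSL itself is a topic for a later article), here they are built as plain Python dictionaries. The field names "age", "location" and "created_at" are hypothetical and depend on your own data; "calendar_interval" assumes Elasticsearch 7.2 or later.

```python
import json

# Hypothetical field names: "age", "location" (a geo_point) and "created_at" (a date).


def users_aged(age):
    """Body matching documents whose "age" field equals the given value."""
    return {"query": {"term": {"age": age}}}


def within_distance(lat, lon, distance="10km"):
    """Body matching documents whose "location" lies within `distance` of a point."""
    return {
        "query": {
            "bool": {
                "filter": {
                    "geo_distance": {
                        "distance": distance,
                        "location": {"lat": lat, "lon": lon},
                    }
                }
            }
        }
    }


def count_per_day():
    """Body that returns no hits, only a per-day document count."""
    return {
        "size": 0,
        "aggs": {
            "per_day": {
                "date_histogram": {
                    "field": "created_at",
                    "calendar_interval": "day",  # assumption: Elasticsearch 7.2+
                }
            }
        },
    }


# Print the first body as it would be pasted after "GET /<index>/_search".
print(json.dumps(users_aged(23), indent=2))
```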

IV. Clients

Quite a few large companies use Elasticsearch, and cloud service providers have also integrated many services that support it. Some major users are Mozilla, GitHub, Stack Exchange and Netflix.

V. How to install

Installation is basically quite simple; you can follow the guide on the homepage. But as I mentioned above, the Elasticsearch server uses Java, so you need to install Java (version 8 or later) before installing it.

VI. Interface

Elasticsearch is just a server (backend), so we need an interface to be able to visualize information. Elasticsearch provides Kibana for that. How to install it can also be found on the homepage, and it is quite simple. To interact with Elasticsearch from code, there are client libraries for the following languages:

Java
C#
Python
JavaScript
PHP
Perl
Ruby

VII. Basic Concepts

OK, to continue, I will assume you have finished setting up Kibana and Elasticsearch. To get started, we need to know the basic concepts of Elasticsearch.

What is Elasticsearch made of?

1. Cluster

A cluster is a collection of one or more nodes that together store the data. It provides indexing and search across all of its nodes and is identified by a unique name (the default name is 'elasticsearch').

2. Node

A node is a single server that is part of a cluster, stores data, and participates in the cluster's indexing and search.

3. Index

Explaining exactly what an index is in Elasticsearch would be quite lengthy and confusing, so I will cover it in the next article. Put most simply, an Elasticsearch index plays a role similar to an index in a database: it helps speed up queries, searches, updates, deletes and so on. It is also the named collection that documents are stored in.

4. Documents

A document is the basic unit of information that can be indexed. It is expressed in JSON.
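
As a small illustration, a document is nothing more than a JSON object. The sketch below uses the song fields from the Commands section of this article and shows the document going to JSON text and back in Python.

```python
import json

# One document: a set of fields and values, here describing a song.
song = {
    "title": "1000 years",
    "artist": "Christina Perri",
    "album": "Breaking Dawn",
    "year": 2011,
}

# Serialized form, which is what would be sent as a request body.
body = json.dumps(song)

# Parsing the text gives back the same fields and values.
restored = json.loads(body)
print(body)
```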

5. Other components

In addition, Elasticsearch has two other very important components: shards and replicas. They are a bit abstract and easily cause confusion, so I will come back to them in the next article.

VIII. Executing

OK, now we will start up and test Elasticsearch. Once it is installed, run it from the terminal with $ ./elasticsearch

If you use Homebrew, just type elasticsearch in the terminal and it will start immediately.

After starting Elasticsearch, start the Kibana interface in the same way; with Homebrew, just run kibana.

If everything is correct, we can open http://localhost:9200

To access Kibana, go to http://localhost:5601
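
If you prefer to check from code rather than the browser, here is a minimal sketch using only the Python standard library. The helper names base_url and probe are my own, and it assumes a local node without authentication on the default port. It only works for Elasticsearch, since Kibana answers with a web page rather than JSON.

```python
import json
import urllib.error
import urllib.request


def base_url(host="localhost", port=9200):
    """Build the address of a local service (9200 is Elasticsearch, 5601 is Kibana)."""
    return f"http://{host}:{port}"


def probe(url, timeout=3.0):
    """Return the JSON served at `url` as a dict, or None if it cannot be fetched.

    None covers a refused connection, an HTTP error status, and a reply
    that is not valid JSON.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        return None
```

Calling probe(base_url()) against a running node should return a dictionary that includes the cluster name and version; if nothing is listening, it returns None.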

IX. Commands

In Kibana, we can choose Dev Tools in the menu. There we can send requests and retrieve the corresponding information.

1. PUT

The PUT command allows us to add a new document to Elasticsearch.

PUT /my_playlist/song/6
{
  "title": "1000 years",
  "artist": "Christina Perri",
  "album": "Breaking Dawn",
  "year": 2011
}

We click the Play button to execute the command.

This means you have created a document in Elasticsearch. In this example we have /my_playlist/song/6:

my_playlist: the name of the index the data is inserted into
song: the type of the document being created (mapping types are deprecated as of Elasticsearch 7, where the equivalent path is /my_playlist/_doc/6)
6: the id of the element instance, in this case the song id

If the index my_playlist does not exist, it is created automatically, and the same goes for song and 6.

To update the document, we use the PUT command again with the same index, type and id; the new body replaces the stored document. For example, to add a location field, we do the following:

PUT /my_playlist/song/6
{
  "title": "1000 years",
  "artist": "Christina Perri",
  "album": "Breaking Dawn",
  "year": 2011,
  "location": "London"
}
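
Outside Kibana, the same PUT can be sent from any HTTP client. The following is a minimal sketch with the Python standard library, not the official client; build_put and ES_URL are names I made up, and it assumes a local node on the default port. The request is only built here, not sent.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local node on the default port


def build_put(index, doc_type, doc_id, document, base=ES_URL):
    """Build (but do not send) a PUT request that stores `document` under the given id."""
    url = f"{base}/{index}/{doc_type}/{doc_id}"
    data = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


song = {
    "title": "1000 years",
    "artist": "Christina Perri",
    "album": "Breaking Dawn",
    "year": 2011,
    "location": "London",
}

request = build_put("my_playlist", "song", 6, song)
print(request.get_method(), request.full_url)
```

To actually send it, pass the request to urllib.request.urlopen(request) while the server is running.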

2. GET

Allows users to retrieve a document:

GET /my_playlist/song/6

3. DELETE

Allows users to delete a document:

DELETE /my_playlist/song/6
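
GET and DELETE address the same path as PUT and only differ in the HTTP method. Here is a small sketch under the same assumptions as before (standard library only, local node on the default port, build_request being a name of my own):

```python
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local node on the default port


def build_request(method, index, doc_type, doc_id, base=ES_URL):
    """Build (but do not send) a body-less GET or DELETE request for one document."""
    if method not in ("GET", "DELETE"):
        raise ValueError(f"unsupported method: {method}")
    return urllib.request.Request(f"{base}/{index}/{doc_type}/{doc_id}", method=method)


get_song = build_request("GET", "my_playlist", "song", 6)
delete_song = build_request("DELETE", "my_playlist", "song", 6)

for req in (get_song, delete_song):
    print(req.get_method(), req.full_url)
```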

4. Search Data

For simple searches there are two basic approaches: URI Search and Query DSL. With URI search, we add the parameters directly to the GET request, for example:

Return all accounts from state UT: GET /bank/_search?q=state:UT
Return all accounts from UT or CA: GET /bank/_search?q=state:(UT OR CA)
Return all accounts from state TN that belong to female clients: GET /bank/_search?q=state:TN AND gender:F
Return all accounts of people older than 20: GET /bank/_search?q=age:>20
Return all accounts of people between 20 and 25 years old: GET /bank/_search?q=age:(>=20 AND <=25)
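
Kibana lets us type these queries with spaces and symbols as they are, but when calling the server from code the q parameter has to be URL-encoded. The following sketch shows one way to do that with the Python standard library; search_url is a hypothetical helper, and the base address assumes a local node on the default port.

```python
from urllib.parse import urlencode

ES_URL = "http://localhost:9200"  # assumption: local node on the default port


def search_url(index, query, base=ES_URL):
    """Build a URI-search URL, escaping spaces, ':' and comparison signs in the query."""
    return f"{base}/{index}/_search?{urlencode({'q': query})}"


queries = [
    "state:UT",
    "state:(UT OR CA)",  # parentheses keep both values tied to the state field
    "state:TN AND gender:F",
    "age:>20",
    "age:(>=20 AND <=25)",
]

for q in queries:
    print(search_url("bank", q))
```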

In addition, for more advanced searches with more complex conditions there is the Query DSL, which is more powerful than URI search. To do justice to everything the DSL can do, I will introduce it in an upcoming article, in more depth and with practical examples.

X. Summary

With this we have had a glimpse of Elasticsearch and its basics. In the next article, I will introduce more about the Query DSL, performance, and how to work with an Elasticsearch server.

Source : Viblo