Artificial intelligence (AI) and machine learning (ML) are among the hottest fields in technology today.

The term “AI” is all around us. You often hear about developers who want to learn AI, or executives who say they want to bring AI into their services.

But really, many of us still don’t understand what AI is. After reading this article, you will have a clearer view of AI, ML and, more specifically, Deep Learning, a subfield of Machine Learning. This article is a general overview, so it does not involve much advanced math.

# Background

Before we learn how Deep Learning works, we first have to understand the differences between these terms.

## AI vs ML

Artificial intelligence is the replication of human intelligence in computers.

When AI research first began, researchers tried to recreate human intelligence for specific tasks, like playing a game.

They wrote a large number of rules for the computer to follow. The computer had a range of possible actions, and its job was to choose the best course of action based on those rules.

Machine learning refers to the ability of a computer to learn from large data sets instead of hard-coded rules.

ML allows computers to learn on their own. This type of learning takes advantage of the processing power of modern computers, which can easily process large data sets.
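To make the contrast concrete, here is a minimal sketch in Python. The fare rule, the distances and the fares are all made up for illustration; the point is only that in the first function a human writes the rule, while in the second the rule (a slope and an intercept) is estimated from data using a simple least-squares fit.

```python
# Hard-coded approach: a human writes the rule by hand.
def fare_by_rule(distance_km: float) -> float:
    # Hypothetical rule: a base fee of 50 plus 0.10 per km.
    return 50.0 + 0.10 * distance_km


# Machine learning approach: the rule is estimated from data.
def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up observations: (distance in km, fare paid).
distances = [100.0, 200.0, 300.0, 400.0]
fares = [70.0, 90.0, 110.0, 130.0]

slope, intercept = fit_line(distances, fares)
print(f"learned rule: fare = {intercept:.2f} + {slope:.2f} * distance")
```

If the real fares change, the hard-coded rule has to be rewritten by hand, while the learned rule only needs to be fitted again on new data.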

## Supervised learning vs unsupervised learning

Supervised learning uses labeled data sets that contain inputs together with their expected outputs.

When you train an AI model with supervised learning, you give it an input and tell it the expected output. Example: your younger brother is only 3 years old and does not know what a dog is. If you show him a different-looking dog once a day and tell him each time that it is a dog, then after about 10 times he can recognize a dog he has never seen before. Showing him a dog and telling him it is a dog is like supervised learning.

If the AI's output is wrong, it adjusts its computation. This repeats over the data set until the AI stops making mistakes (or, in practice, makes as few as possible).
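The adjust-and-repeat loop described above can be sketched with a perceptron, one of the simplest learning algorithms. This is an illustrative choice, not something the rest of the article depends on. The data set (the logical OR of two inputs), the learning rate and the bias term are assumptions made for the example.

```python
def predict(weights, bias, x):
    """Return 1 if the weighted sum of the inputs is positive, else 0."""
    total = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if total > 0 else 0


def train_perceptron(samples, labels, learning_rate=0.1, max_epochs=100):
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, target in zip(samples, labels):
            error = target - predict(weights, bias, x)
            if error != 0:
                # Wrong answer: nudge the weights toward the expected output
                mistakes += 1
                weights = [w + learning_rate * error * xi
                           for w, xi in zip(weights, x)]
                bias += learning_rate * error
        if mistakes == 0:
            # A full pass over the data set without a single mistake
            break
    return weights, bias


# Labeled data: each input comes with its expected output (logical OR)
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 1, 1, 1]

weights, bias = train_perceptron(samples, labels)
predictions = [predict(weights, bias, x) for x in samples]
print(predictions)
```

This loop only reaches zero mistakes when the data can be separated by a straight line, which is the case for this tiny data set.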

Unsupervised learning is machine learning on data sets that have no specified structure, that is, no labels.

When you train an AI with unsupervised learning, you let the AI make logical classifications of the data on its own.

Example of unsupervised learning: when you first sign up for YouTube, the videos recommended to you are based on your account information (such as age, region, …), which is used to place you in a group of similar users.
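As a rough sketch of how grouping without labels can work, the code below clusters users by age with a very small version of k-means. The ages, the number of groups and the starting centers are made up, and k-means is just one possible algorithm. Real recommendation systems are far more complex than this.

```python
def k_means_1d(values, centers, iterations=10):
    """Assign each value to its nearest center, then move each center
    to the mean of its group. Repeat a fixed number of times."""
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old center if a group ended up empty
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


# Made-up user ages; note that no labels are given
ages = [15, 16, 17, 18, 41, 43, 45, 47]

centers, groups = k_means_1d(ages, centers=[10.0, 50.0])
print(centers)
print(groups)
```

Nobody told the algorithm which users belong together. It found a younger and an older group only by looking at how close the values are to each other.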

# How does Deep Learning work?

Deep Learning is a subfield of ML that lets us train an AI to predict outputs from a set of inputs. Both supervised and unsupervised learning can be used to train the AI.

Let’s learn how it works through an airplane ticket price prediction example, which we train using supervised learning:

The fare is predicted from the following inputs:

- Departure airport
- Arrival airport
- Flight date
- Airline
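A neural network works with numbers, so inputs like these have to be turned into numbers first. The sketch below shows one common way to do that, one-hot encoding. The airport codes, the airline names and the way the date is scaled are assumptions made for illustration, not part of any real data set.

```python
from datetime import date

# Hypothetical vocabularies; a real data set would define its own.
AIRPORTS = ["HAN", "SGN", "DAD"]
AIRLINES = ["Airline A", "Airline B"]


def one_hot(value, vocabulary):
    """Return a list with 1.0 at the position of value and 0.0 elsewhere."""
    return [1.0 if item == value else 0.0 for item in vocabulary]


def encode_flight(origin, destination, flight_date, airline):
    """Turn the 4 inputs into one flat list of numbers."""
    # Day of the year, scaled into the range (0, 1]
    day_of_year = flight_date.timetuple().tm_yday / 366.0
    return (
        one_hot(origin, AIRPORTS)
        + one_hot(destination, AIRPORTS)
        + [day_of_year]
        + one_hot(airline, AIRLINES)
    )


features = encode_flight("HAN", "SGN", date(2024, 1, 1), "Airline B")
print(features)
```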

## Neural network

Let’s look at the “brain” inside the AI.

Like animals, the AI “brain” has nerve cells, known as neurons. They are the circles in the figure below, and they are connected to each other.

The neurons are grouped into 3 types of layers:

- Input layers
- Hidden layers
- Output layers

The input layer receives the input data. In the image above, it is the red region with 3 neurons. In our example we have 4 inputs: departure airport, arrival airport, flight date and airline. The input layer passes the data on to the first hidden layer.

The hidden layers perform computations on the data passed on by the input layer. The challenge is to choose the number of hidden layers and the number of neurons in each hidden layer.

The word “Deep” in “Deep Learning” means that your network has more than one hidden layer. The more hidden layers a network has, the deeper it is. The more neurons a hidden layer has, the wider it is.

The output layer returns the output, in this case the predicted fare.

That is the overview. So how does the network actually compute the predicted price? This is where the magic of Deep Learning begins, so let’s take a look.
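The layer structure described above can be written down as a plain list of layer sizes. The sizes below are a hypothetical example (4 input neurons, two hidden layers of 5 neurons each, 1 output neuron), and the connection count assumes every neuron is connected to every neuron in the next layer.

```python
# Hypothetical architecture: input layer, two hidden layers, output layer
layer_sizes = [4, 5, 5, 1]

hidden_sizes = layer_sizes[1:-1]
depth = len(hidden_sizes)  # number of hidden layers: 2
width = max(hidden_sizes)  # neurons in the widest hidden layer: 5

# One connection per pair of neurons in neighbouring layers:
# 4 * 5 + 5 * 5 + 5 * 1 = 50
connections = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))

print(depth, width, connections)
```

Adding a hidden layer makes the list longer (deeper), while raising one of the hidden sizes makes the network wider. Both increase the number of connections.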

Each connection between 2 neurons in the image above is assigned a weight. The weight represents the importance of the input value. At first, the weights are generated randomly. Example: when predicting the fare, the “Flight date” factor is the most important, so the connections coming from “Flight date” will have larger weights.

Each neuron has an activation function. Understanding exactly what this function does requires going a little deeper into the math. Put simply, it normalizes the output value of each neuron.
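Here is a minimal sketch of what a single neuron and a single layer compute. The sigmoid is used as the activation function, which is only one of several common choices. The input values and weights are made up, and real networks usually also add a bias term to each neuron, which is left out here to keep things simple.

```python
import math


def sigmoid(z):
    """Activation function: squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights):
    """Weighted sum of the inputs, passed through the activation function."""
    z = sum(w * x for w, x in zip(weights, inputs))
    return sigmoid(z)


def layer_output(inputs, weight_rows):
    """Compute the output of every neuron in one layer.
    Each row of weight_rows holds the weights of one neuron."""
    return [neuron_output(inputs, weights) for weights in weight_rows]


# Made-up numeric inputs (4 values) and weights for a layer of 2 neurons
inputs = [0.5, 0.1, 0.9, 0.3]
hidden = layer_output(
    inputs,
    weight_rows=[
        [0.2, -0.4, 0.7, 0.1],
        [-0.3, 0.8, 0.5, -0.6],
    ],
)
print(hidden)
```

A larger weight means the matching input has more influence on the neuron's output. Whatever the weighted sum is, the sigmoid keeps the result between 0 and 1.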

## Train neural networks

Training an AI with Deep Learning is quite difficult. You need 2 things:

- A large data set
- A large amount of computational power

Here, to predict fares well, we need historical fare data covering many different airlines, so the data set is quite large.

To train the AI, we give it the inputs from the data set and compare its outputs with the desired outputs. Since the AI is not yet trained, its outputs will be wrong.

To measure how wrong the AI's outputs are compared to the desired outputs, we use a function called the Cost Function.

Ideally, the Cost Function returns 0, which means the AI's outputs are identical to the desired outputs.
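One widely used Cost Function is the mean squared error. It is shown here as an example and is not the only option. The fares below are made up for illustration.

```python
def mean_squared_error(predicted, desired):
    """Average of the squared differences between predictions and targets."""
    return sum((p - d) ** 2 for p, d in zip(predicted, desired)) / len(desired)


desired_fares = [100.0, 150.0, 200.0]

# An untrained model: predictions are off by -10, +10 and +30
untrained_cost = mean_squared_error([90.0, 160.0, 230.0], desired_fares)

# A perfect model: predictions equal the desired fares, so the cost is 0.0
perfect_cost = mean_squared_error([100.0, 150.0, 200.0], desired_fares)

print(untrained_cost, perfect_cost)
```

Squaring the differences keeps positive and negative errors from cancelling each other out, and it punishes large errors more than small ones.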

## How to reduce the cost function

We could change the weights between neurons at random until the Cost Function is low, but this is not very efficient. Instead, we use a technique called Gradient Descent.

Gradient Descent is a technique for finding the minimum of a function. Here, we want the minimum of the Cost Function.

It works by changing the weights (initially randomized) bit by bit after each iteration through the data set. By computing the derivative (gradient) of the Cost Function with respect to the weights, we can see in which direction to change them so that the Cost Function decreases.

To find the weights that minimize the Cost Function, we have to iterate over a large data set many times, which is why it takes a great deal of computational power.

Updating the weights automatically with Gradient Descent is the ‘magic’ of Deep Learning.

Once the weights are trained, you can feed in new inputs and the AI will predict the price reasonably well.
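The idea can be sketched on the smallest possible model, `y = w * x`, which has a single weight. The data, the learning rate and the number of iterations are assumptions chosen for the example. The weight also starts at 0.0 rather than at a random value, so that the run is repeatable.

```python
def cost(w, xs, ys):
    """Mean squared error of the model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def cost_gradient(w, xs, ys):
    """Derivative of the cost with respect to the weight w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def gradient_descent(xs, ys, w=0.0, learning_rate=0.05, iterations=200):
    for _ in range(iterations):
        # Step against the gradient, the direction that lowers the cost
        w -= learning_rate * cost_gradient(w, xs, ys)
    return w


# Made-up data generated from y = 3 * x
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]

w = gradient_descent(xs, ys)
print(w, cost(w, xs, ys))
```

Each iteration goes through the whole data set once to compute the gradient, then moves the weight a small step. A real network repeats the same idea for every one of its weights, which is where the computational cost comes from.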

## Where to learn more?

You can learn more about different types of neural networks to see how they work: Convolutional Neural Network (CNN) in Computer Vision, Recurrent Neural Network (RNN) in Natural Language Processing.

If you want to learn about Deep Learning in a methodical way, you can refer to some online courses.

One of the best-known courses is the Deep Learning Specialization by Professor Andrew Ng, a world-famous professor of Machine Learning.

# Summary

- Deep Learning uses neural networks to train AI.
- There are 3 types of layers in neural networks: input layer, hidden layer, output layer.
- The more hidden layers a network has, the deeper it is, the harder it is to train and the more computational resources it takes. In return, the results can be more accurate.
- The weights in the network represent the influence of the input value on the output value.
- Each neuron uses an activation function to normalize its output value.
- To train a model you need a large data set.
- Using Gradient Descent repeatedly over the data set to update the weights automatically is the ‘magic’ of Deep Learning.