Artificial intelligence, machine learning, and deep learning

Tram Ho

This article introduces the historical context of artificial intelligence, machine learning, and deep learning; explains at a general level how deep learning works; and surveys what deep learning has achieved so far.

First, we need to define clearly what we are talking about when we mention artificial intelligence (AI), machine learning (ML), and deep learning (DL). How are they related to one another?

Artificial intelligence (AI)

Artificial intelligence was born in the 1950s, when a handful of pioneers in the nascent field of computer science began to ask whether it was possible to build computers that could “think” on their own.

While many of the basic ideas had been formulated in the years and even decades before, “artificial intelligence” finally crystallized into a field of study in 1956, when John McCarthy, at the time a young assistant professor of mathematics at Dartmouth College, organized a summer workshop based on the following proposal:

“The study is to proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it. An attempt will be made to find how to make machines use language, form abstractions and concepts, solve kinds of problems now reserved for humans, and improve themselves. We think that a significant advance can be made in one or more of these problems if a carefully selected group of scientists work on it together for a summer.”

At the end of the summer, the workshop ended without solving all of the problems it had set out to investigate. Nevertheless, with the participation of many pioneers of the field, that summer workshop kicked off an intellectual revolution that is still ongoing today.

Strictly speaking, AI can be described as the effort to automate intellectual tasks normally performed by humans. Artificial intelligence is a field that encompasses both machine learning and deep learning, but it also includes many other approaches that may not involve any “learning” at all. Until the 1980s, most AI books didn’t mention learning.

For example, early chess programs relied only on hard-coded rules crafted by programmers, and did not qualify as machine learning. In fact, for quite a long time, most experts believed that human-level artificial intelligence could be achieved by having programmers manually craft a sufficiently large set of explicit rules for manipulating knowledge stored in databases. This approach is known as symbolic AI.

Although symbolic AI is well suited to solving well-defined, logic-based problems, it struggles with fuzzier, more complex problems such as image classification, speech recognition, and natural language translation. A new approach arose to take symbolic AI’s place: machine learning.

Machine learning (ML)

In 1843, Ada Lovelace commented on the invention of the Analytical Engine:

“The Analytical Engine has no pretensions whatever to originate anything. It can do whatever we know how to order it to perform… Its province is to assist us in making available what we’re already acquainted with.”

Until recently, Lady Lovelace’s remark was still striking. Can a general-purpose computer “originate” anything on its own, or will it always be bound to execute processes that we humans fully understand? Could it ever learn and create by itself?

Her remark was later quoted by AI pioneer Alan Turing as “Lady Lovelace’s objection” in his landmark 1950 paper “Computing Machinery and Intelligence”, which introduced the Turing test as well as key concepts that would come to shape AI. At the time, Turing was of the opinion that computers could, in principle, be made to emulate every aspect of human intelligence.

The usual way to get a computer to do useful work is to have a programmer write down rules (a computer program) that turn input data into appropriate answers, just as Lady Lovelace wrote down step-by-step instructions for the Analytical Engine to carry out.

Machine learning turns this around: the machine looks at the input data and the corresponding answers, and figures out from them what the rules should be.

A machine learning system is trained rather than explicitly programmed. For example, if you wanted to automate the task of tagging photos (deciding which tags a photo should get), you could present a machine learning system with many examples of photos already tagged by humans, and from that data the system would learn statistical rules for associating specific photos with specific tags. Unlike statistics, machine learning tends to deal with large, complex datasets (such as datasets of millions of images, each consisting of tens of thousands of pixels) for which classical statistical analysis, such as Bayesian analysis, would be impractical.
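To make the contrast between the two paradigms concrete, here is a toy sketch of my own (not from the article): a classical program hard-codes a rule for deciding whether a photo is “bright”, while a machine-learning approach infers the rule (here, just a threshold) from examples paired with human-provided answers.

```python
# Toy illustration: classical programming vs. machine learning.
# Task: decide whether a photo is "bright" from its average pixel value (0-255).

# Classical approach: a programmer hard-codes the rule.
def classify_by_rule(avg_pixel):
    return "bright" if avg_pixel > 128 else "dark"

# Machine-learning approach: infer the rule from examples of
# inputs paired with human-provided answers.
def learn_threshold(examples):
    """examples: list of (avg_pixel, label) pairs; returns a learned threshold."""
    brights = [x for x, label in examples if label == "bright"]
    darks = [x for x, label in examples if label == "dark"]
    # Put the decision boundary halfway between the two groups.
    return (min(brights) + max(darks)) / 2

data = [(200, "bright"), (180, "bright"), (40, "dark"), (90, "dark")]
threshold = learn_threshold(data)

def classify_learned(avg_pixel):
    return "bright" if avg_pixel > threshold else "dark"

print(threshold)              # 135.0 -- a rule the machine found, not one we wrote
print(classify_learned(150))  # bright
```

Real systems learn far richer statistical rules than a single threshold, of course, but the division of labor is the same: the programmer supplies data and answers, and the machine supplies the rule.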

To define deep learning and understand the difference between deep learning and other machine learning approaches, let’s first look at what machine learning needs. To do machine learning, we need three things:

  • Input data points – for example, if the task is speech recognition, these data points are sound files of people speaking; if the task is object recognition, they are pictures.
  • Examples of the expected output – in speech recognition, these can be transcripts of the words spoken in each sound file; in object recognition, they might be tags such as “car”, “human”, and “cat”.
  • A way to measure whether the algorithm is doing a good job – we measure the distance between the algorithm’s current output and the output we expect. The measurement is used as a feedback signal to adjust the way the algorithm works; this adjustment step is what we call learning.
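The three ingredients above can be sketched in a few lines. This toy example is mine, not the article’s: the inputs are numbers, the expected outputs follow the rule y = 2x, and mean squared error serves as the measure of performance.

```python
# Sketch of the three machine-learning ingredients (illustrative toy example).
inputs  = [1.0, 2.0, 3.0, 4.0]   # 1. input data points
targets = [2.0, 4.0, 6.0, 8.0]   # 2. outputs we expect (here, y = 2x)

def model(x, w):
    """A one-parameter model: multiply the input by a weight w."""
    return w * x

def mean_squared_error(predictions, targets):
    """3. A way to measure how far the outputs are from what we expect."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

# A poorly tuned model gives a large error; the correct weight gives zero.
print(mean_squared_error([model(x, 0.5) for x in inputs], targets))  # 16.875
print(mean_squared_error([model(x, 2.0) for x in inputs], targets))  # 0.0
```

Learning, in this picture, is the process of nudging the weight until the error measure is as small as possible.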

A machine learning model transforms input data into meaningful output, a process that “learns” from exposure to known inputs and outputs.

“Deep” in “deep learning” (DL)

Deep learning is a subfield of machine learning. The word “deep” in “deep learning” does not refer to any kind of deeper understanding; rather, it stands for the idea of successive layers of representations. The number of layers contributing to a model of the data is called the depth of the model. Modern deep learning often involves dozens or even hundreds of successive representation layers, and all of them are learned automatically from exposure to training data.

In deep learning, these representation layers are learned via models called neural networks, which are structured in layers literally stacked on top of each other. You can think of a deep network as a multistage information-distillation process, in which the information passes through successive filters and comes out increasingly purified (that is, increasingly useful for some task).

Understanding how deep learning works

Now, we know that machine learning is about mapping inputs (such as images) to targets (such as the label “human”) by observing many examples of inputs and targets. You also know that deep neural networks perform this input-to-target mapping via a sequence of layers stacked on top of each other. Let’s now look at how this learning happens in detail.

The specification of what a layer does to its input is stored in the layer’s weights, which are essentially a bunch of numbers (weights are also sometimes called the parameters of a layer). In this context, learning means finding a set of values for the weights of all the layers in the network, such that the network correctly maps input data to the associated targets.
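As a sketch of what “a layer’s behavior is stored in its weights” means, here is a tiny two-layer forward pass in plain Python (my own illustration, with hand-picked weights; training would be what *finds* such values):

```python
# Illustrative sketch: a tiny two-layer network's forward pass.
# Each layer's behavior is fully determined by its weights and biases.

def dense_layer(inputs, weights, biases):
    """One fully connected layer: out_j = relu(sum_i inputs[i]*weights[i][j] + biases[j])."""
    outputs = []
    for j in range(len(biases)):
        total = biases[j] + sum(inputs[i] * weights[i][j] for i in range(len(inputs)))
        outputs.append(max(0.0, total))  # ReLU activation
    return outputs

# Weights chosen by hand for the demo; learning means finding values like these.
w1 = [[1.0, -1.0], [0.5, 2.0]]   # 2 inputs -> 2 hidden units
b1 = [0.0, 0.0]
w2 = [[1.0], [1.0]]              # 2 hidden units -> 1 output
b2 = [0.0]

hidden = dense_layer([1.0, 2.0], w1, b1)   # first representation layer
output = dense_layer(hidden, w2, b2)       # second representation layer
print(hidden)  # [2.0, 3.0]
print(output)  # [5.0]
```

Change any number in `w1` or `w2` and the network computes something different; that is all “the layer’s specification lives in its weights” means.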

But here’s the problem: a deep neural network can contain tens of millions of parameters. Finding the correct values for all of them may seem like a daunting task, especially given that modifying the value of one parameter affects the behavior of all the others!

To control something, you first need to be able to observe it. To control the output of a neural network, you need to be able to measure how far this output is from what you expected. This is the job of the network’s loss function (sometimes also called the objective function or cost function). The loss function takes the network’s predictions and the true targets (what you wanted the network to output for the given inputs) and computes a distance score, capturing how well the network is currently doing.

We use the value of the loss function as a feedback signal to adjust the values of the weights, in a direction that lowers the loss. This adjustment is the job of the optimizer, which implements what is known as the backpropagation algorithm, the central algorithm in deep learning. The next article explains in more detail how backpropagation works. In short, from the computed loss score, the optimizer updates the weights of the layers, and this cycle repeats until the model reaches high accuracy; this is called a training loop (typically dozens of iterations over thousands of examples).
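The training loop described above can be sketched end to end on a one-weight model. This is an illustrative toy of mine, not the article’s code: the gradient of the loss (which backpropagation computes automatically in real networks) is written out by hand, and the optimizer is plain gradient descent.

```python
# Minimal sketch of a training loop: a one-weight model y = w * x,
# trained by gradient descent to recover the rule y = 3x.

xs = [1.0, 2.0, 3.0, 4.0]          # input data
ys = [3.0, 6.0, 9.0, 12.0]         # expected outputs (the "true" rule is y = 3x)

w = 0.0  # start from a wrong weight

def loss(w):
    """Mean squared error between the model's predictions and the targets."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient(w):
    """Derivative of the loss with respect to w (what backprop computes in general)."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

learning_rate = 0.01
for step in range(200):                 # the training loop
    w -= learning_rate * gradient(w)    # the optimizer's weight update

print(round(w, 3))        # 3.0 -- the loop recovered the rule
print(round(loss(w), 6))  # 0.0 -- the loss has been driven to (nearly) zero
```

A real network does exactly this, just with millions of weights at once and with the gradients supplied by backpropagation rather than a hand-written formula.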

But that doesn’t mean the model will keep getting better just because we train it longer. The outcome also depends on how we build the model, the parameters we pass in, and the data we use for training. These topics will be discussed in later articles.

What has deep learning achieved so far?

Although deep learning is a fairly old subfield of machine learning, it only rose to prominence in the early 2010s. In the few years since, it has revolutionized the field, producing remarkable results on perceptual tasks and even natural language processing tasks: problems involving skills that seem natural and intuitive to humans but have long remained elusive for machines. In particular, deep learning has achieved the following breakthroughs, all in historically difficult areas of machine learning:

  • Near-human-level image classification
  • Significantly improved text-to-speech conversion
  • Digital assistants such as Google Assistant, Amazon Alexa, and Apple Siri
  • Near-human-level autonomous driving
  • The ability to answer questions posed in natural language (most recently, ChatGPT)

We are still exploring the full extent of what deep learning can do. With each milestone, we move closer to an age in which deep learning assists us in every activity and every field: medicine, manufacturing, transportation, software development, agriculture, and even artistic creation.

End

So far, we’ve covered the history of AI, ML, and DL and familiarized ourselves with key concepts in deep learning. In the next article, we will take a brief look at machine learning.

In the near future I will be writing a series of blog articles on deep learning, based on the books of François Chollet (creator of the deep learning library Keras) and on university and Coursera courses. If you’re interested, feel free to bookmark this page; the expected pace is one post a week.


Source: Viblo