Machine Learning approach for beginners

Tram Ho

Note: English technical terms are left untranslated because they have no literal Vietnamese equivalent. Keeping them in English also helps readers get familiar with the terminology, which makes English documents easier to read later.

TL;DR

This article introduces an easy way to approach Machine Learning (ML) and gives a brief overview of the basic problems ML solves.

Who is this article for?

  • People who know nothing about Machine Learning
  • Beginners who are just starting to learn Machine Learning
  • People who have studied Machine Learning for a long time but still do not understand it
  • People who already know Machine Learning but are not confident
  • People who want to learn Machine Learning seriously

Obstacles when learning ML

To find the right approach, we first look at the obstacles.

As taught in universities for research purposes, Machine Learning is very dry and uses a lot of math. The mathematics involved includes linear algebra, probability theory and statistics, calculus, algorithms and optimization, and more. ML is built on deep and complex mathematical foundations, so by default you need a strong math background to get past this obstacle.

However, mathematics by nature does not make an ordinary thing more sublime. Rather, it is a tool that lets us express ideas more compactly and dig deeper more easily. Therefore everything complicated in ML, including the math, can be explained simply. Once simplified, anyone can understand it and get past this obstacle.

“If you can’t explain it simply, you don’t understand it well enough.” (attributed to Einstein)

Many people around the world re-express mathematical concepts in an extremely easy-to-understand way, even when those concepts are very complex. Examples include 3Blue1Brown and BetterExplained, who excel at simplifying concepts, using images to build intuition, and using familiar analogies to explain unfamiliar things.

Machine Learning can certainly be expressed in a way that is easy to understand and anyone can learn, so the complexity of ML is not a big obstacle.

The remaining problem is how learners approach ML: an unsystematic approach can leave them struggling and discouraged.

Approach

From talking with friends and many other ML learners, I have found two main ways to approach ML:

  • Method 1: Master each part before moving on. The learner wants to master one part completely and only then move to the next. To study ML this way, I would study linear algebra, then calculus, then probability and statistics, and only then move on to models, regression, and so on. This approach helps learners master the basics and build a good foundation, and it is how universities train formally. However, it requires a large investment of time and effort, as well as perseverance.
  • Method 2: From general to specific, also known as top-down. With this approach, learners do not need a solid mathematical background to get started. From the beginning, the learner applies ML to solve problems and only later goes into the specific theory behind it. We are happy to think of ML, from the start, as something that solves a problem. That keeps us motivated and drives our curiosity to learn the details to whatever depth we want.

The general-to-specific method accepts treating an unknown as a black box. The black box has an input, an output, and a function. We only need to know how to apply the black box to solve the problem. Compared with method 1, method 2 is more proactive, faster, and easier to get started with. It is especially suitable for people outside the field who want to start with ML without any background.

[Illustration: ML approach]

As the illustration above shows, approaching ML from general to specific involves several levels. After each level the picture of ML becomes clearer, and at every level we keep an overview of ML as a whole. The levels are as follows:

  • The first level is an overview of the whole ML picture from the top. There are 5 levels of awareness: (1) not knowing what we do not know; (2) knowing what we do not yet know; (3) knowing what we want to know; (4) knowing what we already know; and (5) not knowing what we already know, the level of mastery. Before the first level we are at (1): we know nothing about ML, not even that it exists. After the first level, our awareness of ML is at (2) and (3): we know what we do not yet know, and we know what we want to know. This is extremely important!
  • The next level is to use available tools to try out some ML models, to learn which models solve which problems and where to apply them. At this level we can dig a little deeper into customizing models, for example changing parameters to see how the model depends on them. The most popular tool for using models today is the Python library scikit-learn (sklearn); for more complex models there are frameworks such as PyTorch, TensorFlow 2.0, Keras, Caffe, and even MATLAB. At this level, learners can join hackathons and Kaggle competitions to interact with others and gain experience. Learners can also keep up with articles such as “AI noodles”.
  • The next level goes deeper into the theory behind the models. Note that in the first two levels the learner is only a user of the tools. However, those two levels give us intuitions that make it easier to grasp the theory and to explain the meaning of each component in it. At this level we can keep up with AI developments around the world and read about new state-of-the-art (SOTA) models from other researchers. https://paperswithcode.com/ is a pretty good site to follow. You can also subscribe to AI newsletters such as https://jack-clark.net/ for weekly AI updates.
  • The final level is to thoroughly understand the properties of each component of a model: how it affects the model, why people choose it, how to change components to fit the data, how to build similar models to solve other problems, and so on. This level is equivalent to that of a scientist.
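The second level above, trying a model and changing a parameter to see how the model responds, can be sketched in a few lines. This is only an illustration under stated assumptions: it uses a tiny hand-written k-nearest-neighbours classifier in plain Python instead of a real library, the data is made up, and the names `knn_predict` and `accuracy` are hypothetical helpers, not part of any framework. The parameter being varied is `k`, the number of neighbours that vote.

```python
from collections import Counter

def knn_predict(train, x, k):
    """Label x by majority vote among its k nearest training points."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def accuracy(train, test, k):
    """Fraction of test points whose predicted label matches the true label."""
    hits = sum(knn_predict(train, x, k) == label for x, label in test)
    return hits / len(test)

# Made-up data: (feature, label). The point (6.5, "small") is deliberate label noise.
train = [(1.0, "small"), (2.0, "small"), (3.0, "small"), (6.5, "small"),
         (6.0, "large"), (7.0, "large"), (8.0, "large"), (9.0, "large")]
test = [(2.5, "small"), (6.6, "large"), (8.5, "large"), (1.5, "small")]

# Treat the model as a black box and only turn one knob: k.
for k in (1, 3):
    print(f"k={k}: accuracy={accuracy(train, test, k)}")
```

On this toy data, `k=1` is fooled by the single noisy point while `k=3` outvotes it. That is the kind of observation this level is about: we learn how a parameter changes behaviour before studying the theory that explains why.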

If someone had shown me this method at the start, I probably would not have wasted more than a year. I hope it will be useful to you and save you the time I lost.

Overview of ML

Besides presenting an easy approach to ML, this article also gives a brief overview of ML.

ML is an algorithm that produces algorithms.

Our world has a wealth of data D, as well as problems P. To solve a problem P, we use our brains to analyze the data D and derive rules R, then put these rules into an algorithm A. This is the classic way.

ML shortens this process: from data D we create a special algorithm called a model M. M is trained on (observes) the data D and works out the rules R by itself, with no human needed. In a nutshell, ML is the creation of a model M that can learn from data D to solve problem P. Whenever we talk about ML, we have to talk about data.
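The difference between the classic way and ML can be shown with a deliberately tiny sketch. Everything here is an assumption made for illustration: the problem P is labeling a temperature as "hot" or "cold", the data D is invented, and the "model" is just a single threshold chosen to make the fewest mistakes on D. Real models are far richer, but the division of labour is the same.

```python
def classic_algorithm(temperature):
    """Classic way: a human studied the data and hard-coded rule R (>= 30 is hot)."""
    return "hot" if temperature >= 30 else "cold"

def train(data):
    """ML way: derive rule R from data D by picking the threshold with the fewest errors."""
    def errors(threshold):
        return sum(("hot" if t >= threshold else "cold") != label
                   for t, label in data)
    candidates = sorted(t for t, _ in data)
    return min(candidates, key=errors)

def make_model(threshold):
    """Wrap the learned rule R into a model M that can be applied to new inputs."""
    return lambda temperature: "hot" if temperature >= threshold else "cold"

# Invented data D: (temperature, label) pairs.
data = [(12, "cold"), (18, "cold"), (24, "cold"),
        (29, "hot"), (33, "hot"), (38, "hot")]

threshold = train(data)        # the rule R, found without a human
model = make_model(threshold)  # the model M

print(threshold)
print(model(35), model(20))
```

Note that the hand-written rule and the learned rule disagree on 29: the human guessed 30, while the data says 29 is already "hot". With different data, `train` would return a different rule without anyone editing the code.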

People often divide ML into categories by how the model learns: Supervised Learning, Unsupervised Learning, Semi-Supervised Learning, and Reinforcement Learning. However, dividing by function is easier to remember. A car repair toolbox may include pliers, a wrench, a screwdriver, and a pump. We remember them not by what they are made of or how they are made, but by what they are used for. Dividing by use is the easiest to remember.

[Image: ML cheat sheet]

In practice, a cheat sheet like this is like coffee: ordinary, but highly effective!

A note on data: a dataset contains many samples, each corresponding to one data point. Each data point contains several distinct pieces of information, each of which counts as one data dimension. For example, MNIST is a dataset of handwritten digits consisting of 70 thousand images of size 28×28, so each image in MNIST is one data point and each pixel is one dimension. Each data point therefore has 28 × 28 = 784 dimensions.
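The arithmetic above can be checked with a short sketch. It does not load MNIST; it assumes a blank 28×28 grid as a stand-in for one image, and `flatten` is a hypothetical helper written for this example.

```python
# A blank 28x28 grayscale image standing in for one MNIST sample (all pixels 0).
HEIGHT, WIDTH = 28, 28
image = [[0 for _ in range(WIDTH)] for _ in range(HEIGHT)]

def flatten(img):
    """Turn a 2-D grid of pixels into one flat list: one entry per dimension."""
    return [pixel for row in img for pixel in row]

data_point = flatten(image)
print(len(data_point))  # 784 dimensions for one data point

# The whole dataset is then a table: one row per sample, one column per dimension.
num_samples = 70_000
dataset_shape = (num_samples, len(data_point))
print(dataset_shape)
```

Flattening is why people say an MNIST image "lives in 784 dimensions": each pixel becomes one coordinate of the data point.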

Divided by function, machine learning includes:

  • Classification: a model that assigns categories. For example, the data is a set of images already labeled as dog images or cat images; given a new photo, the model decides whether it is a dog image or a cat image.
  • Regression: a model that finds the relationship between variables in the data. For example, the data consists of house prices from recent years along with the attributes of each house; the problem is, given information about a new house, to predict its price. The attributes of a house and its price have some relationship, for example a linear one, so we can apply a linear regression model to find the specific parameters.
  • Recommendation: a model that solves the problem of making suggestions. A common example is shopping websites that suggest similar products to you.
  • Clustering: a model that groups data. For example, given a set of people with known heights, divide them into K groups so that people in the same group differ least in height; the task is to find the K groups and their members.
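Two of the problems in the list, regression on house prices and clustering people by height, are small enough to sketch in plain Python. This is a rough illustration, not a reference implementation: the numbers are invented, the regression uses the textbook least-squares formula for one input variable, and the clustering is a simplified 1-D k-means with a naive evenly spaced initialization. The function names `fit_line` and `kmeans_1d` are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for a single input variable."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

def kmeans_1d(values, k, iterations=10):
    """Simplified 1-D k-means: returns k groups of values (assumes k <= len(values))."""
    ordered = sorted(values)
    step = len(ordered) // k
    centers = [ordered[i * step] for i in range(k)]  # naive initial centers
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in ordered:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return groups

# Regression: invented house areas (m^2) and prices that follow price = 2 * area + 10.
areas = [50, 60, 80, 100]
prices = [110, 130, 170, 210]
slope, intercept = fit_line(areas, prices)
predicted_price = slope * 90 + intercept  # price estimate for a new 90 m^2 house

# Clustering: invented heights (cm) split into K = 2 groups.
groups = kmeans_1d([150, 172, 155, 170, 152, 175], k=2)

print(slope, intercept, predicted_price)
print(groups)
```

The regression model needs labeled examples (areas with known prices), while the clustering model receives only heights and finds the groups on its own. That is the practical difference between a supervised and an unsupervised problem.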

These are the 4 most basic problems of ML, and each of them has anywhere from several to a dozen different models, each with its own advantages and disadvantages. Do not be overwhelmed by the number of models. Remember that our approach goes from general to specific: first we remember each model's name and the problem it solves, accepting it as a black box. We will gradually open those black boxes in later sections.

Conclusion

We now know how to approach ML effectively, and we have an overview of the problems ML can solve as well as the names of the models that solve them. If you have a faster, more effective approach, please share it.

Source: hocmachinelearning.com

Source : Viblo