Learn the basics of Regular Expressions in JavaScript

Tram Ho

What are Regular expressions?

Before diving in, let's first look at what regular expressions actually are.

According to Wikipedia:

A regex is essentially a sequence of characters, mainly used to find patterns in text or to validate user input.

Tools

To make this easier to understand, here is an example. Say I have an input field and want the user to enter a date in the format YYYY/MM/DD: first 4 digits, followed by a / sign, then 2 digits, another / sign, and 2 final digits.

When writing regex patterns, there are some great tools out there that can help. Here I mention two:

  • RegExr displays a cheat sheet and lets you test patterns instantly, in real time. This is how I learned regex.
  • Regexper is a great tool that helps you visualize your regex pattern as a diagram.

Going back to the example above, the resulting pattern is simple: /\d{4}\/\d{2}\/\d{2}/g (the / signs inside the pattern are escaped as \/ because / also marks the start and end of the pattern).
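
To try the date pattern from this example in code, a minimal sketch could look like the following. The variable name and the sample strings are my own assumptions, and the g flag is left out because only a yes/no answer is needed here.

```javascript
// Hypothetical name; the pattern follows the YYYY/MM/DD format described above.
const datePattern = /\d{4}\/\d{2}\/\d{2}/;

console.log(datePattern.test('2020/01/02')); // true
console.log(datePattern.test('2020-01-02')); // false: wrong separator
console.log(datePattern.test('20/01/02'));   // false: the year needs 4 digits
```

Note that test only checks whether the pattern occurs somewhere in the string; it does not check that the month or day are valid calendar values.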

Before you start, I recommend copy-pasting the regex examples into one of these tools and visualizing them first (try them on simple text to begin with).

Now let's break it down, starting with the basics. Every regex pattern is written between two forward slashes (/). We can also add flags after the closing /. The two most common flags are g and i, or the combination of both, gi. They stand for global and case insensitive, respectively.

Let's say you have a piece of text in which numbers appear multiple times. To select all occurrences, you have to set the g flag. Otherwise, the search stops at the first occurrence.

If you want to match both javascript and JavaScript, you must use the i flag. If you want to match every such occurrence in a piece of text, combine both flags: /javascript/gi.
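
The effect of the two flags can be seen with String.prototype.match. This is only an illustrative sketch; the sample strings and variable names are made up for the demonstration.

```javascript
// Hypothetical sample text containing several numbers.
const numbersText = 'Room 4, floor 2, building 7';

// Without g, only the first occurrence is found.
console.log(numbersText.match(/\d/)[0]); // "4"
// With g, every occurrence is returned.
console.log(numbersText.match(/\d/g));   // ["4", "2", "7"]

// Hypothetical sample text with different spellings of the same word.
const nameText = 'javascript, JavaScript and JAVASCRIPT';

// g alone is still case sensitive.
console.log(nameText.match(/javascript/g));  // ["javascript"]
// gi finds every spelling.
console.log(nameText.match(/javascript/gi)); // ["javascript", "JavaScript", "JAVASCRIPT"]
```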

Character Class

The first element of the regex in our example is \d. This is called a character class: it tells the regex to match a single character out of a set of characters. \d selects any digit. You can also define your own set with square brackets; for digits, that is [0-9].

This also works with letters: [a-z] (note that this only matches the lower-case letters a to z). To include upper case as well, use [a-zA-Z]. You can also combine letters and numbers: [a-z0-9].
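
A short sketch of these character classes in action, using a made-up three-character string so each result is easy to check by eye:

```javascript
// Hypothetical sample: one lower-case letter, one upper-case letter, one digit.
const classSample = 'aB3';

console.log(classSample.match(/\d/g));       // ["3"]
console.log(classSample.match(/[0-9]/g));    // ["3"], same as \d here
console.log(classSample.match(/[a-z]/g));    // ["a"]
console.log(classSample.match(/[a-zA-Z]/g)); // ["a", "B"]
console.log(classSample.match(/[a-z0-9]/g)); // ["a", "3"]
```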

Quantifiers and alternation

Continuing, we have {4} after \d. This is called a quantifier, and it tells the regex to look for exactly 4 digits. Therefore, /\d{4}/g matches 2019, but not 20 19, 20, 201, or anything else with fewer than four digits in a row. (In a longer run of digits, it matches the first four.)

You can also define a range with two numbers, starting with the smaller one: \d{2,4}. This matches runs of digits that are at least 2 long but no longer than 4. You can also omit the maximum: \d{2,} matches any run of 2 or more digits.
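
The quantifiers can be checked with a few made-up inputs. The variable names below are assumptions for the sake of the sketch; match returns null when nothing is found.

```javascript
// Exactly 4 digits in a row.
const exactHit = '2019'.match(/\d{4}/g);
const exactMissSpace = '20 19'.match(/\d{4}/g);
const exactMissShort = '201'.match(/\d{4}/g);
console.log(exactHit);       // ["2019"]
console.log(exactMissSpace); // null
console.log(exactMissShort); // null

// Between 2 and 4 digits in a row.
const ranged = '1 22 333 4444'.match(/\d{2,4}/g);
console.log(ranged); // ["22", "333", "4444"]

// 2 or more digits in a row, with no upper limit.
const openEnded = '1 22 333 55555'.match(/\d{2,}/g);
console.log(openEnded); // ["22", "333", "55555"]
```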

There are four other operators I want to mention, as they are used often. The | (or) operator lets you specify multiple alternatives. Let's say you have to write a regex for URLs and you need to match both “http” and “www”. Joining them with | lets you match either one: /http|www/g.
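
As a quick illustration of the or operator, here is a sketch with a made-up sentence that contains both prefixes:

```javascript
// Hypothetical sample text; example.com and example.org are placeholder hosts.
const linksText = 'Visit http://example.com or www.example.org';

const prefixes = linksText.match(/http|www/g);
console.log(prefixes); // ["http", "www"]
```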

The remaining three are all quantifiers: \d*, \d+, \d?.

  • \d* : The asterisk matches 0 or more of the preceding character.
  • \d+ : The plus sign matches 1 or more of the preceding character.
  • \d? : The question mark matches 0 or 1 of the preceding character.
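
The difference between the three quantifiers is easiest to see when the digits sit between two fixed letters. The patterns and inputs below are made up purely for illustration.

```javascript
// \d* : zero or more digits between "a" and "b".
const star = /a\d*b/;
console.log(star.test('ab'));    // true (zero digits)
console.log(star.test('a123b')); // true

// \d+ : at least one digit between "a" and "b".
const plus = /a\d+b/;
console.log(plus.test('ab'));    // false (no digit)
console.log(plus.test('a1b'));   // true

// \d? : zero or one digit between "a" and "b".
const optional = /a\d?b/;
console.log(optional.test('ab'));   // true
console.log(optional.test('a1b'));  // true
console.log(optional.test('a12b')); // false (two digits)
```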

Groups

Now, let's say you use this regex in your JavaScript code, and whenever you find a match, you want to extract a portion of it. For example, we may want to retrieve the year, month, and day separately so we can do different things with them later. This is where groups come in.

In the original example, when you call the exec method on the regex and pass in a date, you get back an array that contains only the full match. To get the individual parts, you still need to call '2020/01/02'.split('/').
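
A minimal sketch of this situation, assuming the date pattern from the first example and hypothetical variable names:

```javascript
// The date pattern without any groups.
const noGroups = /\d{4}\/\d{2}\/\d{2}/;
const plainResult = noGroups.exec('2020/01/02');

console.log(plainResult[0]);     // "2020/01/02" (the full match)
console.log(plainResult.length); // 1, there is nothing else in the array

// The parts still have to be separated by hand.
const dateParts = plainResult[0].split('/');
console.log(dateParts); // ["2020", "01", "02"]
```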

In the second example, you solve this by wrapping each part of the pattern in parentheses: /(\d{4})\/(\d{2})\/(\d{2})/. Now the output contains the year, month, and day separately, and you can access them starting at index 1 of the array: arr[1].
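
With parentheses around each part, the same exec call could look like this. The variable names are assumptions; index 0 still holds the full match.

```javascript
// One capturing group each for year, month and day.
const withGroups = /(\d{4})\/(\d{2})\/(\d{2})/;
const arr = withGroups.exec('2020/01/02');

console.log(arr[0]); // "2020/01/02"
console.log(arr[1]); // "2020"
console.log(arr[2]); // "01"
console.log(arr[3]); // "02"

// Destructuring skips the full match and names the captured parts.
const [, year, month, day] = arr;
console.log(year, month, day); // 2020 01 02
```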

I also include a third example using named groups (year, month, day). This gives you a groups object on the output array, which contains your named groups with their values. Named groups were standardized in ES2018 and work in all current browsers and in Node.js, but not in older engines such as Internet Explorer, so check your target environments before relying on them in production.
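
A sketch of the named-group variant, using the (?<name>...) syntax. The variable names are hypothetical, and the code assumes an engine with ES2018 support.

```javascript
// Each group gets a name instead of only a position.
const namedPattern = /(?<year>\d{4})\/(?<month>\d{2})\/(?<day>\d{2})/;
const namedResult = namedPattern.exec('2020/01/02');

console.log(namedResult.groups.year);  // "2020"
console.log(namedResult.groups.month); // "01"
console.log(namedResult.groups.day);   // "02"

// The numbered indexes are still available alongside the names.
console.log(namedResult[1]); // "2020"
```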

Practice

Let's say I want to create a regex that matches a web URL, and I want it to match “http”, “https”, “www”, or no protocol at all. That means I need to cover four different scenarios.

A first step is /https?:\/\//. This matches “http” and “https”, followed by a colon and two forward slashes.

Next, we add a part for the hostname itself after the protocol.

This definitely works for the first two cases, but we may also get “www” or no protocol at all.

The only thing left to do is make the prefix optional, so the regex still matches when the user doesn't provide any protocol at all.
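
A sketch of this protocol step, assuming the pattern https? (an "s" made optional with the question mark) followed by an escaped colon and two escaped slashes. The sample URLs are placeholders.

```javascript
// "s" is optional, so both http:// and https:// are accepted.
const protocolPattern = /https?:\/\//;

console.log(protocolPattern.test('http://example.com'));  // true
console.log(protocolPattern.test('https://example.com')); // true
console.log(protocolPattern.test('ftp://example.com'));   // false
console.log(protocolPattern.test('www.example.com'));     // false, no protocol yet
```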
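
Putting the steps together, one possible version of the final pattern is sketched below. It is not necessarily the exact regex from the original example: the loose hostname class [a-z0-9.-]+ and all names are my own assumptions, and the pattern is far too permissive to serve as a real URL validator.

```javascript
// Optional prefix (http://, https:// or www.), then a very loose "hostname"
// made of letters, digits, dots and hyphens. Hypothetical sketch only.
const urlPattern = /(https?:\/\/|www\.)?[a-z0-9.-]+/i;

const samples = [
  'http://example.com',
  'https://example.com',
  'www.example.com',
  'example.com',
];

// Group 1 holds the prefix, or undefined when none was given.
const prefixesFound = samples.map((url) => urlPattern.exec(url)[1]);
console.log(prefixesFound); // ["http://", "https://", "www.", undefined]

// In every case the full match covers the whole sample string.
const fullMatches = samples.map((url) => urlPattern.exec(url)[0]);
console.log(fullMatches);
```

Making the whole prefix group optional with ? is what covers the fourth scenario, where the user types only the hostname.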

Conclusion

Above is a general introduction to regular expressions in JavaScript. The article surely has shortcomings; feel free to comment below so I can improve it. Thank you for reading.


Source: Viblo