Regular expressions, usually shortened to Regex, are like a girl who does not look very attractive at first glance, but once you make friends and get to know her well, you realize she is extremely powerful at processing strings, even difficult or complicated ones, and she can replace all the if/else guys queuing out there. Let's find out about this girl called Regex.
1. Uses
Regex works by searching text and matching it against a pattern built from a set of basic rules.
Regex is not a programming language; it is a tool that most languages now support. Learn it once and use it anywhere, so why not learn it!
From simple to complex, Regex can handle searching for and replacing patterns, validating form fields such as dates, emails, and passwords to check whether the user's input is valid, or harder tasks such as refactoring complex code …
That alone is enough to see that Regex is an extremely useful and powerful tool, worth keeping as an indispensable part of your toolkit.
Despite being so useful and powerful, Regex is not always necessary. Why do I say that? The basics of Regex are fairly easy to learn, but going deeper is really not that simple. The purpose of Regex is to keep things simple and save time and effort. If writing or refactoring a regex takes much longer than the usual approach, the conventional way will save time for you and your colleagues.
2. Declaration
To declare a Regex, we only need to start it with the / character and end it with the / character.
For example:
    const str = "The lazy dog jumped over the quick brown fox"
    const regex = /dog/
    regex.test(str) // returns true
In JavaScript, you can also declare it with the RegExp constructor:
    const str = "The lazy dog jumped over the quick brown fox"
    const regex = new RegExp("dog");
    regex.test(str) // returns true
3. How to use
Character classes
[abc]: matches any one of the characters inside the brackets; the order inside the brackets does not affect the result
For example, to find every "the" or "The", I can write [tT]he
[a-n]: matches any character from a to n
\d: any digit from 0 to 9
\w: a word character (a letter, a digit, or an underscore)
\s: whitespace (a space, a tab, a line break …)
When the letter is capitalized, its effect is the opposite of the above
\D: a non-digit character
\W: a non-word character
\S: a non-whitespace character
For example:
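A minimal sketch of the character classes above, written in JavaScript. The sample strings and variable names are my own assumptions for illustration, not taken from the original screenshots.

```javascript
// Sample sentence (made up for this sketch).
const sample = "The lazy dog jumped over the 2 quick brown foxes";

// [tT]he matches "The" or "the"; the order inside the brackets does not matter.
const theMatches = sample.match(/[tT]he/g);

// [a-n] matches any single character from a to n ("o" is outside the range).
const rangeMatches = "dog".match(/[a-n]/g);

// \d matches one digit, \s one whitespace character.
const digits = sample.match(/\d/g);
const spaces = sample.match(/\s/g);

// \w matches letters, digits, and the underscore; \W is the opposite.
const wordChars = "a_1 !".match(/\w/g);
const nonWordChars = "a_1 !".match(/\W/g);

console.log(theMatches);    // ["The", "the"]
console.log(rangeMatches);  // ["d", "g"]
console.log(digits);        // ["2"]
console.log(spaces.length); // 9
console.log(wordChars);     // ["a", "_", "1"]
console.log(nonWordChars);  // [" ", "!"]
```

The g flag used here asks for every match instead of only the first one.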
Quantifiers & Alternation
.: any character except line breaks
+: the character preceding it appears 1 or more times
If you compare the results with and without +, the text found does not change, but pay close attention to the number of matches
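To make the point about the number of matches concrete, here is a small sketch. The input strings are assumptions chosen only for illustration.

```javascript
const text = "aaa bb";

// Without +, every "a" is its own match.
const single = text.match(/a/g);

// With +, consecutive "a" characters are grouped into one match.
const grouped = text.match(/a+/g);

// The dot matches any character except a line break.
const anyChar = "a\nb".match(/./g);

console.log(single);  // ["a", "a", "a"]
console.log(grouped); // ["aaa"]
console.log(anyChar); // ["a", "b"]
```

The same three characters are found in both cases, but as 3 matches without + and as 1 match with it.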
\: the escape character, used together with certain special characters. Several characters have a special meaning in regex, so when you want to match them literally you must put the escape character in front of them (for example \. \* \t \n)
^ (inside brackets, as in [^abc]): negates the character set; use it when you want to check that a string does not contain any character from the given set
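A short sketch of escaping and of the negated character set. The strings below are made up for the example.

```javascript
// An unescaped dot matches any character, so "3x50" also passes.
const unescaped = /3.50/.test("3x50");

// An escaped dot only matches a literal ".".
const escapedWrong = /3\.50/.test("3x50");
const escapedRight = /3\.50/.test("Total: 3.50 USD");

// [^0-9] matches any character that is NOT a digit.
const hasNonDigit = /[^0-9]/.test("12a4");
const onlyDigits = !/[^0-9]/.test("1234");

console.log(unescaped);    // true
console.log(escapedWrong); // false
console.log(escapedRight); // true
console.log(hasNonDigit);  // true
console.log(onlyDigits);   // true
```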
*: the character preceding it appears 0 or more times (it may not appear at all)
The definition above is quite confusing, so here is an example. When entering a phone number, many people type a prefix starting with 0, but others type +84. A regex like /[0-9]+/ will easily miss the + sign. The same applies if you simply want to capture the - sign in front of negative numbers.
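Here is one possible way to capture the optional sign with *. The phone numbers and the [+-]* pattern are my assumptions, since the original screenshot is not available.

```javascript
// Made-up input containing a local number, an international one, and a negative number.
const input = "call 0912345678 or +84912345678, balance -42";

// Digits only: the + and - signs are lost.
const unsigned = input.match(/[0-9]+/g);

// [+-]* allows zero or more sign characters in front of the digits.
const signed = input.match(/[+-]*[0-9]+/g);

console.log(unsigned); // ["0912345678", "84912345678", "42"]
console.log(signed);   // ["0912345678", "+84912345678", "-42"]
```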
?: the character before it may or may not appear
Continuing the example above, I now get both positive and negative numbers, but if I also want decimals, I can make the dot optional with ?
? is essentially similar to *: both say that the preceding character may be absent. The difference is that ? allows at most 1 occurrence, while * allows many. In my example, I receive donations by bank transfer to account 0451,000.123.456. With ? only one dot can be matched; if you want the complete account number, you must use *.
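The difference between ? and * can be sketched as follows. The patterns and the dot-separated account number are assumptions modeled on the description above, and (?:...) is a non-capturing group used only to repeat ".digits" as a unit.

```javascript
// Optional sign and optional decimal part.
const numbers = "3 -7 2.5 -0.25".match(/-?\d+\.?\d*/g);

// Hypothetical dot-separated account number, for illustration only.
const account = "0451.000.123.456";

// ? allows the ".digits" part at most once, so only one dot is matched.
const withQuestionMark = account.match(/\d+(?:\.\d+)?/)[0];

// * allows the ".digits" part any number of times.
const withStar = account.match(/\d+(?:\.\d+)*/)[0];

console.log(numbers);          // ["3", "-7", "2.5", "-0.25"]
console.log(withQuestionMark); // "0451.000"
console.log(withStar);         // "0451.000.123.456"
```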
a{n}: element a appears exactly n times
a{n,m}: element a appears from n to m times
a{n,}: element a appears n or more times
ab|cd: matches the string ab or cd; use it when you have several patterns and want to check that the given string contains one of them.
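A quick sketch of the counted quantifiers and of alternation. The inputs are invented for the example.

```javascript
const fiveAs = "aaaaa";

// a{3}: exactly 3 times.
const exactlyThree = fiveAs.match(/a{3}/)[0];

// a{2,4}: from 2 to 4 times (the match takes as many as it can, here 4).
const twoToFour = fiveAs.match(/a{2,4}/)[0];

// a{2,}: 2 or more times.
const twoOrMore = fiveAs.match(/a{2,}/)[0];
const tooShort = "a".match(/a{2,}/);

// cat|dog: matches either "cat" or "dog".
const hasPet = /cat|dog/.test("The lazy dog");
const noPet = /cat|dog/.test("The quick brown fox");

console.log(exactlyThree); // "aaa"
console.log(twoToFour);    // "aaaa"
console.log(twoOrMore);    // "aaaaa"
console.log(tooShort);     // null
console.log(hasPet);       // true
console.log(noPet);        // false
```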
For example, a phone number in our country has the form 0xx.xxx.xxxx or +84xx.xxx.xxxx. I searched the Internet and found the prefixes of some popular carriers today:
Viettel: 032-039, 086, 096-098
VinaPhone: 081-085, 088, 091, 094
MobiFone: 076-079, 089, 090, 093
For now, let's ignore the requirement that the prefix must belong to a current carrier. When writing a regex, I usually split the problem into parts or do the simple parts first. First I handle the beginning: the number starts with 0 or +84, followed by the 2 digits of the carrier's prefix, and then the remaining part.
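The step-by-step approach above could be sketched like this. The variable names and the test numbers are my own assumptions, and the pattern only checks the shape 0xx.xxx.xxxx or +84xx.xxx.xxxx, not the carrier.

```javascript
// Step 1: the number starts with 0 or +84 (the + must be escaped).
const start = /(0|\+84)/;

// Step 2: add the 2 digits of the carrier's prefix.
const withCarrier = /(0|\+84)\d{2}/;

// Step 3: add the rest: a dot, 3 digits, a dot, 4 digits.
const phone = /(0|\+84)\d{2}\.\d{3}\.\d{4}/;

console.log(start.test("+84"));             // true
console.log(withCarrier.test("098"));       // true
console.log(phone.test("098.765.4321"));    // true
console.log(phone.test("+8498.765.4321"));  // true
console.log(phone.test("98.765.4321"));     // false (no 0 or +84 in front)
console.log(phone.test("098-765-4321"));    // false (dashes instead of dots)
```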
Anchors
Following the example in the previous section, if I type extra characters before or after the number, the string is still accepted, which is not the result I want. Regex provides us with two ways to solve this problem.
Use the ^ character to mark the start and $ to mark the end, or use the boundary character \b (word boundary)
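A small sketch of both approaches. The phone pattern is the assumed one for the 0xx.xxx.xxxx form, and the inputs are made up.

```javascript
// Without anchors, extra text around the number still passes.
const loose = /(0|\+84)\d{2}\.\d{3}\.\d{4}/;

// ^ and $ force the pattern to cover the whole string.
const strict = /^(0|\+84)\d{2}\.\d{3}\.\d{4}$/;

console.log(loose.test("abc098.765.4321xyz"));  // true
console.log(strict.test("abc098.765.4321xyz")); // false
console.log(strict.test("098.765.4321"));       // true
console.log(strict.test("+8498.765.4321"));     // true

// \b matches at a word boundary, so "dog" must stand alone as a word.
const wholeWord = /\bdog\b/;
console.log(wholeWord.test("The lazy dog jumped")); // true
console.log(wholeWord.test("hotdogs"));             // false
```

I use a plain word for the \b example because + is not a word character, so \b placed directly before +84 would not behave the way you might expect.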
Flags
JavaScript regular expressions support several flags, including i, g, m, s, u, and y. Below I list only the 3 most used:
g: (global) searches for all results; without this flag, the result contains only the first match.
i: (case insensitive) ignores the difference between uppercase and lowercase letters
m: (multiline) makes ^ and $ match at the start and end of each line instead of only at the start and end of the whole string
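The three flags can be tried out with a sketch like the one below. The sentence and variable names are assumptions for illustration.

```javascript
const sentence = "The dog saw the big dog";

// Without g, match() returns only the first match (with its index).
const first = sentence.match(/dog/);

// With g, match() returns every match.
const all = sentence.match(/dog/g);

// With i, "The" and "the" are both found.
const caseSensitive = sentence.match(/the/g);
const caseInsensitive = sentence.match(/the/gi);

// With m, ^ and $ also match at the start and end of each line.
const lines = "cat\ndog";
const singleLine = /^dog$/.test(lines);
const multiLine = /^dog$/m.test(lines);

console.log(first.length, first.index); // 1 4
console.log(all);                       // ["dog", "dog"]
console.log(caseSensitive);             // ["the"]
console.log(caseInsensitive);           // ["The", "the"]
console.log(singleLine);                // false
console.log(multiLine);                 // true
```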
4. Some methods
Test
This is the most basic method, used to check whether a string contains the given pattern: if it does, it returns true, otherwise false.
    const regex = /abc/
    regex.test('abcd') // returns true
    regex.test('abd') // returns false
    regex.test('aabcc') // returns true
Exec
While the test method above only checks whether the string contains the pattern, the exec method returns an object describing the match; if nothing matches, it returns null
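A minimal sketch of exec, reusing the /abc/ pattern from the test example. The variable names are my own.

```javascript
const pattern = /abc/;

// On success, exec returns an array-like object: the matched text is at
// position 0, and the index property says where the match starts.
const found = pattern.exec("aabcc");

// When nothing matches, exec returns null.
const missing = pattern.exec("abd");

console.log(found[0]);    // "abc"
console.log(found.index); // 1
console.log(missing);     // null
```

Because the result can be null, it is safer to check it before reading found[0] or found.index.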
5. Some tools
1. Regexr
This is the tool I used to produce the examples above. It is quite easy to use and has a nice interface, with explanations of characters, flags …, and it also lets you save patterns to use later.
2. Regex101
Regex101 is an equally useful tool, with a Code Generator and a Quiz that helps you practice more regex.
6. Some common Regex patterns
[UPDATING …]