JavaScript Regular Expression Basics
If you sometimes glance at regular expressions but still do not dare to learn them, thinking that it is all incredibly difficult, you are not alone. To anyone who does not understand what regular expressions are or how they work, they look like complete nonsense.
But, in fact, regular expressions are a powerful tool that can help you save a ton of time. In this article, we will cover the basics of regular expressions in JavaScript.
Creating Regular Expressions in JS
In JavaScript, a regular expression is a type of object used to search for combinations of characters in strings.
There are two ways to create regular expressions.
The first is to use regular expression literals. With this approach, the regular expression pattern is enclosed in slashes. It looks like this:
var regexLiteral = /cat/;
The second uses the RegExp constructor, which is passed a string from which it creates a regular expression:
var regexConstructor = new RegExp("cat");
Both of the examples above create the same pattern: the character c, followed by the character a, followed by the character t.
Which way of creating regular expressions should you choose? A good rule of thumb: if the regular expression is going to stay unchanged, use a literal. If it is dynamic and may change while the program runs, use the RegExp constructor.
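As a minimal sketch of the "dynamic" case, the pattern below is built from a value that is only known at run time. The helper name containsWord is hypothetical and is used only for illustration; the sketch also assumes that the word contains no characters with a special meaning in regular expressions.

```javascript
// Hypothetical helper: the pattern depends on a runtime value,
// so a literal cannot be used and the RegExp constructor is needed.
// Assumption: `word` contains no regex special characters.
function containsWord(text, word) {
  const pattern = new RegExp(word);
  return pattern.test(text);
}

console.log(containsWord("the cat says meow", "cat")); // true
console.log(containsWord("the cat says meow", "dog")); // false

// Inside a string passed to the constructor, backslashes must be doubled.
const digitsLiteral = /\d+/;
const digitsConstructor = new RegExp("\\d+");
console.log(digitsLiteral.source === digitsConstructor.source); // true
```

The last lines show a common pitfall: a pattern written as a string needs "\\d" where a literal needs only \d.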
Regular Expression Methods
You may have noticed above that regular expressions in JS are objects. Objects are known to have methods, and regular expressions are no exception.
One of the main regex methods is .test(), which returns a boolean:
RegExp.prototype.test()
This method returns true if the string contains a match for the regular expression pattern, and false if no match is found.
Consider the following example. We have two strings and one regular expression. We can use the regular expression to check whether a given text pattern occurs in each string:
const str1 = "the cat says meow";
const str2 = "the dog says bark";
const hasCat = /cat/;
hasCat.test(str1);
// true
hasCat.test(str2);
// false
As expected, when we check the first string, str1, for the character sequence cat, we get true. When we check the second string, str2, cat is not found, so .test() returns false.
Basic Regular Expression Constructs
Fortunately (or unfortunately, depending on your point of view), the main approach to learning regular expressions is to memorize the basic constructs that denote characters and groups of characters.
Here is a short list of basic regex constructs. If you are serious about learning them, set aside about 20 minutes to memorize them.
▍ Symbols
. (period) - matches any single character except a line break.
* - matches the preceding expression repeated 0 or more times.
+ - matches the preceding expression repeated 1 or more times.
? - matches the preceding expression repeated 0 or 1 times.
^ - matches the beginning of the string.
$ - matches the end of the string.
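The constructs in this list can be tried out one at a time with .test(). The strings below are made up for illustration only; each line exercises a single construct.

```javascript
// . matches exactly one character of any kind (except a line break)
console.log(/c.t/.test("cut"));        // true
console.log(/c.t/.test("ct"));         // false

// * allows zero repetitions, + requires at least one
console.log(/ca*t/.test("ct"));        // true
console.log(/ca+t/.test("ct"));        // false
console.log(/ca+t/.test("caaat"));     // true

// ? makes the preceding character optional
console.log(/colou?r/.test("color"));  // true
console.log(/colou?r/.test("colour")); // true

// ^ and $ tie the pattern to the start and the end of the string
console.log(/^cat/.test("cat food"));  // true
console.log(/cat$/.test("cat food"));  // false
```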
▍ Character groups
\d - matches any single digit.
\w - matches any single word character: a letter, a digit, or an underscore.
[XYZ] - a character set. Matches any single character from the set given in brackets. Character ranges can be specified in the same way, for example [A-Z].
[XYZ]+ - matches a character from the brackets repeated one or more times.
[^A-Z] - inside a character set, a leading ^ is used as a negation sign. In this example, the pattern matches any single character that is not an uppercase letter.
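A short sketch of the character groups above in action. The sample strings are arbitrary and serve only as illustrations.

```javascript
console.log(/\d/.test("room 7"));       // true: the string contains a digit
console.log(/\d/.test("room seven"));   // false
console.log(/\w/.test("_"));            // true: underscore is a word character
console.log(/\w/.test("!?"));           // false
console.log(/[XYZ]/.test("taXi"));      // true: X is in the set
console.log(/[A-Z]/.test("camelCase")); // true: C falls within the range
console.log(/^[XYZ]+$/.test("XYZZY"));  // true: only X, Y and Z from start to end
console.log(/[^A-Z]/.test("ABC"));      // false: every character is uppercase
console.log(/[^A-Z]/.test("ABc"));      // true: c is not an uppercase letter
```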
▍ Flags
Regular expressions support several optional flags. They can be used together or separately, and they are placed after the closing slash, like this:
/[A-Z]/g
We will consider only two flags here:
g - global search: all matches in the string are found, not just the first one.
i - case-insensitive search.
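To see what the two flags change, the sketch below runs the same pattern with different flag combinations. It relies on the standard string methods .match() and .replace(); the sample string is made up for illustration.

```javascript
const sample = "Cat, cat, CAT";

// g alone: every match is found, but letter case still matters.
const caseSensitive = sample.match(/cat/g);
console.log(caseSensitive); // ["cat"]

// g and i together: every match, regardless of letter case.
const ignoreCase = sample.match(/cat/gi);
console.log(ignoreCase); // ["Cat", "cat", "CAT"]

// i without g: only the first match is replaced.
const firstOnly = sample.replace(/cat/i, "dog");
console.log(firstOnly); // "dog, cat, CAT"

// Both flags: all matches are replaced.
const allReplaced = sample.replace(/cat/gi, "dog");
console.log(allReplaced); // "dog, dog, dog"
```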
▍ Additional constructs
(x) - capturing parentheses. Matches x and remembers the match so that it can be used later.
(?:x) - non-capturing parentheses. Matches x but does not remember the match.
x(?=y) - lookahead. Matches x only if x is followed by y.
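The difference between these three constructs is easiest to see in the array returned by the standard .match() method, where element 0 is the whole match and the following elements are the captured groups. The sample strings are assumptions made for illustration.

```javascript
// Capturing parentheses: the digits are remembered as group 1.
const captured = "price: 42".match(/price: (\d+)/);
console.log(captured[0]); // "price: 42"
console.log(captured[1]); // "42"

// Non-capturing parentheses: used for grouping only, nothing is remembered.
const grouped = "hahaha!".match(/(?:ha)+/);
console.log(grouped[0]);     // "hahaha"
console.log(grouped.length); // 1 (the whole match and no groups)

// Lookahead: digits match only when "px" follows, and "px" is not part of the match.
const lookahead = "10em 20px".match(/\d+(?=px)/);
console.log(lookahead[0]); // "20"
```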
More Complex Examples of Regular Expressions
Before we start the practice projects, let's look more closely at how to apply what we have just covered.
First, let's check a string for the presence of any digits. To do this, we can use the pattern \d. Take a look at the code below. It returns true when there is at least one digit in the string being tested.
console.log(/\d/.test('12-34'));
// true
As you can see, the code returns true, which is not surprising, since the string contains four digits.
But what if we need to check the string for a particular sequence of digits? In that case, you can repeat the \d pattern several times. For example, to match the string 11, you can use \d\d, which describes any two consecutive digits. Take a look at this code:
console.log(/\d-\d-\d-\d/.test('1-2-3-4'));
// true
console.log(/\d-\d-\d-\d/.test('1-23-4'));
// false
Here we check whether the string contains four single digits separated by dashes. The first string matches this pattern, and the second does not.
What if it does not matter exactly how many digits come before or after the dash, as long as there is at least one? In that situation, you can use the + sign to indicate that the \d pattern can occur one or more times. Here is what it looks like:
console.log(/\d+-\d+/.test('12-34'));
// true
console.log(/\d+-\d+/.test('1-234'));
// true
console.log(/\d+-\d+/.test('-34'));
// false
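One detail worth keeping in mind with these checks: .test() reports a match found anywhere in the string. If the whole string has to fit the pattern, the ^ and $ constructs from the list of symbols can be added. The sketch below compares the two variants; the variable names loose and strict are chosen here for illustration.

```javascript
const loose = /\d-\d-\d-\d/;    // may match in the middle of a longer string
const strict = /^\d-\d-\d-\d$/; // the whole string must fit the pattern

console.log(loose.test("id 1-2-3-4 ok"));  // true
console.log(strict.test("id 1-2-3-4 ok")); // false
console.log(strict.test("1-2-3-4"));       // true

console.log(/\d+-\d+/.test("12-34-56"));   // true: "12-34" is found inside
console.log(/^\d+-\d+$/.test("12-34-56")); // false: the extra "-56" does not fit
console.log(/^\d+-\d+$/.test("12-34"));    // true
```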
To make life easier, we can use parentheses to group expressions. Say we want to check whether a string contains something that resembles a cat's meow. To do this, you can use the following construct:
console.log(/me+(ow)+w/.test('meeeeowowoww'));
// true
It worked. Now let's look at this expression in more detail, because quite a lot is going on here.
So, here is the regular expression:
/me+(ow)+w/
m - matches a single letter m.
e+ - matches the letter e repeated one or more times.
(ow)+ - matches the combination ow repeated one or more times.
w - matches a single letter w.
As a result, this expression interprets the string as follows:
'm' + 'eeee' + 'owowow' + 'w'
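This breakdown can be checked in code. The sketch below is a variation of the expression, not the original one: each part is wrapped in its own capturing parentheses (with a non-capturing group around ow), so that the standard .match() method returns the pieces separately.

```javascript
// Same pattern as /me+(ow)+w/, with every part captured separately.
const parts = "meeeeowowoww".match(/(m)(e+)((?:ow)+)(w)/);

// Element 0 is the whole match; the captured parts follow it.
const pieces = parts.slice(1);
console.log(pieces); // ["m", "eeee", "owowow", "w"]
```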
As you can see, when an operator such as + is used immediately after an expression enclosed in parentheses, it applies to everything inside the parentheses.
Here is another example, this time using the ? operator. A question mark indicates that the preceding character is optional. Take a look at this:
console.log(/cats? says?/i.test('the Cat says meow'));
// true
console.log(/cats? says?/i.test('the Cats say meow'));
// true
As you can see, each of the expressions returns true. This is because we made the s at the end of cat and say optional. You may also notice the i flag at the end of the regular expression. Thanks to it, letter case is ignored when the strings are tested. That is why the regular expression matches both cat and Cat.
Escaping Special Characters
Regular expressions are enclosed in slashes. In addition, some characters, such as +, ?, and others, have a special meaning. If you need to search for these special characters in strings, they must be escaped with a backslash. Here is what it looks like:
var slash = /\//;
var qmark = /\?/;
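A small sketch of what goes wrong without escaping. The test strings are made up for illustration; in each pair the first pattern treats the character as an operator and the second treats it as ordinary text.

```javascript
// Unescaped + means "one or more of the preceding character".
console.log(/1+1/.test("1+1=2"));  // false: the pattern looks for "11", "111", ...
console.log(/1\+1/.test("1+1=2")); // true: \+ is a literal plus sign

// Unescaped . matches any character.
console.log(/\./.test("abc"));     // false: there is no literal period
console.log(/\./.test("a.c"));     // true

// Unescaped ? makes the preceding character optional.
console.log(/what?/.test("wha"));  // true: the t is optional
console.log(/what\?/.test("what"));  // false: a literal ? is required
console.log(/what\?/.test("what?")); // true
```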
It is also worth noting that different regular expressions can be used to search for the same string constructs. Here are a couple of examples:
\d - the same as [0-9]. Each of these expressions matches any single digit.
\w - the same as [A-Za-z0-9_]. Both match any single alphanumeric character or underscore.
Project #1: Adding Spaces to Camel-Case Strings
Now it is time to put this knowledge into practice. In our first project, we are going to write a function that takes a string such as camelCase and adds spaces between the individual words it consists of. Using the finished function, which we will call removeCc, looks like this:
removeCc('camelCase') // => returns 'camel Case'
First, we need a skeleton of a function that takes a string and returns a new string:
function removeCc(str){
  // return a new string
}
Now all we have to do is write a return statement in the body of this function that uses a regular expression to process the input. To do this, we first need to find all the uppercase letters in the string, using a construct that specifies a character range and searches the string globally.
/[A-Z]/g
This regular expression matches the letter C in the string camelCase. But how do we add a space before this letter C?
To do this, we need capturing parentheses. In regular expressions, capturing parentheses are used to find matches and remember them. This lets us use the stored values when we need them. Here is how to work with capturing parentheses:
// Capturing parentheses
/([A-Z])/
// Working with the stored value
$1
Here you can see that we use $1 to refer to the captured value. It is worth noting that if the expression has two sets of capturing parentheses, you can use $1 and $2 to refer to the captured values in the order they appear from left to right. Capturing parentheses can be used as many times as a particular situation requires.
Note that we do not have to capture the value in parentheses. You can simply not use it, or you can use non-capturing parentheses of the form (?:x). In this example, x is matched but the match is not remembered.
Let's get back to our project. The String method that can be used to work with capturing parentheses is .replace(). To use it, we will search the string for any capital letters. The second argument of the method, the replacement value, will be the stored value:
function removeCc(str){
  return str.replace(/([A-Z])/g, '$1');
}
We are already close to a solution, but we are not there yet. Take a look at the code again. Here we capture the capital letters and then replace them with the same letters, whereas we need spaces in front of them. This is easy to do: just add a space before $1. As a result, there will be a space before each capital letter in the string that the function returns. We end up with the following:
function removeCc(str){
  return str.replace(/([A-Z])/g, ' $1');
}
removeCc('camelCase') // 'camel Case'
removeCc('helloWorldItIsMe') // 'hello World It Is Me'
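The $1 and $2 references mentioned in this project can be seen side by side in a short sketch. The function names swapNames, monthAndDay and splitCamel are hypothetical and are not part of the project; splitCamel is only a possible variation of removeCc that inserts a space between a lowercase letter and the uppercase letter following it, so a string that starts with a capital letter does not get a leading space.

```javascript
// Two capturing groups: $1 is the first word, $2 is the second.
function swapNames(fullName) {
  return fullName.replace(/(\w+) (\w+)/, "$2, $1");
}
console.log(swapNames("Ada Lovelace")); // "Lovelace, Ada"

// A non-capturing group is not numbered, so $1 here refers to the second run of digits.
function monthAndDay(date) {
  return date.replace(/(?:\d+)-(\d+)-(\d+)/, "$1/$2");
}
console.log(monthAndDay("2024-01-15")); // "01/15"

// Variation of removeCc: $1 is the lowercase letter, $2 the uppercase one.
function splitCamel(str) {
  return str.replace(/([a-z])([A-Z])/g, "$1 $2");
}
console.log(splitCamel("helloWorldItIsMe")); // "hello World It Is Me"
console.log(splitCamel("CamelCase"));        // "Camel Case"
```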
Project #2: Removing Capital Letters from a String
We continue turning camel-case strings into normal text. So far the problem has been only partially solved: the resulting string still contains too many capital letters.
Now we are going to replace the extra capital letters in the string with lowercase letters. Before reading further, think about the problem and try to find a solution yourself. If you do not succeed, do not be discouraged: the solution is short, but it is not entirely obvious.
So, the first thing we need is to select all the capital letters in the string. The same construct is used here as in the previous example:
/[A-Z]/g
Here we will again use the familiar .replace() method, but this time we will need something new when calling it. Here is an outline of what we need. The question marks stand for the new, still unknown code:
function lowerCase(str){
  return str.replace(/[A-Z]/g, ???);
}
The .replace() method is remarkable in that its second parameter can be a function. This function is called after a match is found, and whatever it returns is used as the string that replaces what the regular expression found. If we also use the global search flag, the function is called for each match of the pattern found in the string. With this in mind, we can use the String method .toLowerCase() to convert the input string to the form we need. Here is what the solution looks like:
function lowerCase(str){
  return str.replace(/[A-Z]/g, u => u.toLowerCase());
}
lowerCase('camel Case') // 'camel case'
lowerCase('hello World It Is Me') // 'hello world it is me'
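The function passed to .replace() receives more than the matched text: after the whole match come the captured groups, and then the position of the match in the string. As a sketch of how that can be used, the hypothetical function spaceAndLower below combines the first two projects in one pass and skips the space when the capital letter is the very first character.

```javascript
// Hypothetical one-pass variant: lowercases each capital letter and puts
// a space before it, except at the very beginning of the string.
function spaceAndLower(str) {
  return str.replace(/([A-Z])/g, (match, letter, offset) => {
    const lower = letter.toLowerCase();
    return offset === 0 ? lower : " " + lower;
  });
}

console.log(spaceAndLower("helloWorldItIsMe")); // "hello world it is me"
console.log(spaceAndLower("CamelCase"));        // "camel case"
```

Whether to combine the steps like this or keep them as separate small functions is a matter of taste; the separate functions are easier to test one at a time.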
Project #3: Capitalizing the First Letter of the String
This will be our last practice project, in which we make the first letter of the string uppercase. Here is what we expect from the new function:
capitalize('camel case') // => should return 'Camel case'
Here, as before, we will use the .replace() method. This time, however, we need to find only the very first character of the string. To do this, we use the ^ symbol. Recall one of the examples above:
console.log(/cat/.test('the cat says meow'));
// true
If you add the ^ character to the beginning of the pattern, this construct no longer returns true. This happens because the word cat is not at the beginning of the string:
console.log(/^cat/.test('the cat says meow'));
// false
We need the special character ^ to apply to any lowercase letter at the beginning of the string, so we place it right before [a-z]. As a result, the regular expression matches only a lowercase first letter of the string:
/^[a-z]/
In addition, we do not use the global search flag here, since we need only one match of the pattern. All that remains is to convert the character found to uppercase. This can be done with the string method .toUpperCase():
function capitalize(str){
  return str.replace(/^[a-z]/, u => u.toUpperCase());
}
capitalize('camel case') // 'Camel case'
capitalize('hello world it is me') // 'Hello world it is me'
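To see why both the ^ character and the absence of the g flag matter here, the sketch below compares a few variations of the pattern. The helper upper and the sample strings are assumptions made for illustration.

```javascript
const upper = (u) => u.toUpperCase();

// Without g, only the first match is replaced.
console.log("camel case".replace(/[a-z]/, upper));  // "Camel case"

// With g, every lowercase letter is replaced.
console.log("camel case".replace(/[a-z]/g, upper)); // "CAMEL CASE"

// Without ^, the first lowercase letter anywhere in the string is changed.
console.log("1st place".replace(/[a-z]/, upper));   // "1St place"

// With ^, nothing changes when the string does not start with a lowercase letter.
console.log("1st place".replace(/^[a-z]/, upper));  // "1st place"
```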
Combining the Previously Created Functions
Now we have everything we need to turn camel-case strings into strings whose words are separated by spaces, which begin with a capital letter, and in which all the other words are written in lowercase. Here is how using the newly created functions together looks:
function removeCc(str){
  return str.replace(/([A-Z])/g, ' $1');
}
function lowerCase(str){
  return str.replace(/[A-Z]/g, u => u.toLowerCase());
}
function capitalize(str){
  return str.replace(/^[a-z]/, u => u.toUpperCase());
}
capitalize(lowerCase(removeCc('camelCaseIsFun')));
// "Camel case is fun"
Summary
As you can see, although regular expressions look rather strange to the uninitiated, they can be mastered. The best way to learn regular expressions is to practice. We suggest you try the following: based on the three functions we created, write one that converts a string such as camelCase into a regular sentence and adds a period after its last word.
Dear readers! If you managed to write the function just described, please share its code in the comments. And if you know regular expressions well, tell us whether that knowledge has helped you in real projects.