
JavaScript #19: Regex in JavaScript - Part 2

 



Welcome back to the JavaScript regex series. If you like this topic, you can follow the rest of the series here: Series about regex

In this article, we continue to learn about different patterns in regex.

Table of contents:

  • Match characters that occur one or more times
  • Match characters that occur zero or more times
  • Lazy matching: handling unknown letters in the middle
  • The OR operator in regex patterns
  • Practical application
  • Conclusion

Let’s go to the content!

Match characters that occur one or more times

Firstly, let’s look at the example below:

let str = 'aaaa, aa, a, aaa, aaaaa'

let regex = /a+/g

 

let resultMatch = str.match(regex)

console.log(resultMatch)

// => [ "aaaa", "aa", "a", "aaa", "aaaaa" ]

 

The pattern in this section is “/a+/”: a “+” sign placed right after a character. It matches that character when it occurs one or more times in a row, and each run is returned as one result. In this example, the "+" follows the letter "a", so a single "a" is already enough to produce a match. So, what if the run of "a" is interrupted by other characters? Let's look at the example:

let str = 'aaasaada3aaaa/aaaaa'

let regex = /a+/g

 

let resultMatch = str.match(regex)

console.log(resultMatch)

// => [ "aaa", "aa", "a", "aaaa", "aaaaa" ]

 

Each result contains only consecutive “a” characters. When the run is interrupted by another character, the current match ends and the search continues from the next "a". The same goes for any other character that has a “+” right after it.
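As a small sketch of that last point, here the "+" follows the letter "o" inside a longer pattern. The sample string and the variable names are made up for illustration and are not part of the examples above:

```javascript
// Hypothetical sample string, chosen only for this illustration
const laughs = 'lol, loool, ll, lool'

// "+" applies only to the "o" right before it:
// an "l", then one or more "o", then another "l"
const laughRegex = /lo+l/g

const laughMatches = laughs.match(laughRegex)
console.log(laughMatches)
// => [ "lol", "loool", "lool" ]
```

The word "ll" is skipped because there is no "o" between the two "l" characters, and "+" needs at least one.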

Match characters that occur zero or more times

Let’s look at an example:

let str = 'gaaa, g, gaa, ga, gaaaa'

let regex = /ga*/g

 

let resultMatch = str.match(regex)

console.log(resultMatch)

// => Array [ "gaaa", "g", "gaa", "ga", "gaaaa" ]

 

The pattern here is “/ga*/”. To explain it, we separate it into two parts: the letter "g", which must be present, and "a*", which matches the letter "a" zero or more times. So a "g" on its own is already a match, and any "a" characters that immediately follow it are included in the result.

With the "+" sign, the character immediately before it must occur at least once for there to be a match. With the "*" sign, the character immediately before it may occur any number of times, including not at all. Choose between "+" and "*" depending on your purpose.
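To make the difference concrete, here is a minimal sketch that runs both signs against the same string as the example above. The variable names are only for illustration:

```javascript
const sample = 'gaaa, g, gaa, ga, gaaaa'

// "+" : the "a" must appear at least once after "g"
const plusMatches = sample.match(/ga+/g)
console.log(plusMatches)
// => [ "gaaa", "gaa", "ga", "gaaaa" ]

// "*" : the "a" may appear zero or more times after "g"
const starMatches = sample.match(/ga*/g)
console.log(starMatches)
// => [ "gaaa", "g", "gaa", "ga", "gaaaa" ]
```

The only difference is the lone "g": it has no "a" after it, so "+" rejects it while "*" accepts it.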

Lazy matching: handling unknown letters in the middle

Let's come to the example:

let str = "dog, drug, ddrrrmg, derg, dirg, dahg"

let regex = /d[a-z]*g/g

 

let resultMatch = str.match(regex)

console.log(resultMatch)

// => [ "dog", "drug", "ddrrrmg", "derg", "dirg", "dahg" ]

 

If you read part one of the series, you saw the limitation of using “.”: it matches exactly one character, so the pattern only finds words of a fixed length. Now we can put "*" after a character class such as [a-z]. The letters between "d" and "g" don't have to be the same letter repeated, and the matched words don't all need to have the same length. We can get words of different lengths, as long as they start with "d" and end with "g" as the pattern requires.
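As a quick sketch of that limitation, the code below compares a pattern that uses "." with the pattern from the example above, on the same string. The variable names are made up for illustration:

```javascript
const words = 'dog, drug, ddrrrmg, derg, dirg, dahg'

// "." stands for exactly one character, so only three-letter words fit
const dotMatches = words.match(/d.g/g)
console.log(dotMatches)
// => [ "dog" ]

// "[a-z]*" allows any number of lowercase letters between "d" and "g"
const classStarMatches = words.match(/d[a-z]*g/g)
console.log(classStarMatches)
// => [ "dog", "drug", "ddrrrmg", "derg", "dirg", "dahg" ]
```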

Another way to write a pattern like this is to add a "?" after "*", which makes the match lazy.

let str = "dog, drug, ddrrrmg, derg, dirg, dahg"

let regex = /d[a-z]*?g/g

 

let resultMatch = str.match(regex)

console.log(resultMatch)

// => [ "dog", "drug", "ddrrrmg", "derg", "dirg", "dahg" ]

 

You get the same result as without the "?". So where is the difference? Let's go to another example:

let str = '<a href="#">Home</a>'

let normalRegex = /<.*>/g

let quesRegex = /<.*?>/g

 

let normalResult = str.match(normalRegex)

let quesResult = str.match(quesRegex)

 

console.log(normalResult) // => Array [ "<a href=\"#\">Home</a>" ]

console.log(quesResult) // => Array [ "<a href=\"#\">", "</a>" ]

 

To see the difference, we use an anchor tag in HTML and two regular expressions: one without the “?” and one with it. Both patterns are meant to get everything contained in “<>”, which can be letters, numbers, and special characters. Without the "?", the "*" is greedy: the match starts at the first "<" and extends to the last ">" in the string. So we have the result Array [ "<a href="#">Home</a>" ]. With the "?", the "*" is lazy: the match starts at a "<" and stops at the nearest ">", and then the search repeats for the rest of the string. So we have the result: Array [ "<a href="#">", "</a>" ].

Now you can tell when to use the "?" and when to leave it out. Depending on the purpose of use, you can apply it in each specific case.
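The same idea works outside of HTML. Here is a minimal sketch that pulls quoted text out of a sentence; the sentence and the variable names are assumptions made up for this illustration:

```javascript
// Hypothetical sentence with two quoted parts
const sentence = 'She said "hi" and then "bye"'

// Greedy: from the first quote to the LAST quote in the string
const greedyQuotes = sentence.match(/".*"/g)
console.log(greedyQuotes)
// => [ "\"hi\" and then \"bye\"" ]

// Lazy: from a quote to the NEAREST closing quote, repeated
const lazyQuotes = sentence.match(/".*?"/g)
console.log(lazyQuotes)
// => [ "\"hi\"", "\"bye\"" ]
```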

The OR operator in regex patterns

When you want to match several alternatives with the same regex, you can add a "|" sign between the patterns to get more results. To understand better, let's look at an example:

let str = "fog, log, dog, leg, feed, ting, ding, fag, mid"

let regex = /l[a-z]*g|d[a-z]g/ig

 

let resultMatch = str.match(regex)

console.log(resultMatch)

// => Array [ "log", "dog", "leg" ]

 

In the pattern above, we want to get the words with the letter "l" in the first position and the letter "g" in the last position, and we also want the words with the letter "d" in the first position and the letter "g" in the last position. There are many ways to deal with this problem. However, here we use the "|" sign to join two patterns together into one common pattern. Note that the second alternative, d[a-z]g, has no "*", so it allows exactly one letter between "d" and "g". That is why "dog" is matched but "ding" is not.
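One of the other ways to deal with this problem is to put the "|" inside parentheses, so only the first letter is an alternative and the rest of the pattern is shared. This is just a sketch with made-up variable names, not the only correct solution:

```javascript
const wordList = 'fog, log, dog, leg, feed, ting, ding, fag, mid'

// (l|d) means "l" or "d"; the "[a-z]*g" part is shared by both alternatives
const groupedRegex = /(l|d)[a-z]*g/gi

const groupedMatches = wordList.match(groupedRegex)
console.log(groupedMatches)
// => [ "log", "dog", "leg", "ding" ]
```

Because both alternatives now share "[a-z]*", the word "ding" is matched as well.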

Practical application

Let's solve a basic problem to review the knowledge above. The problem is inspired by tags in HTML. You will have a string of HTML tags. Your job will be to extract the opening tags and closing tags. For example, “<p>” and “</p>”.

let tags = '<h1>Learning about Regular expression</h1>,<p> Regular expression is funny</p>,<a href=/" / ">Learn more</a>'

let regexTags = /<.*?>/ig

 

let resultTags = tags.match(regexTags)

console.log(resultTags)

// => Array [ "<h1>", "</h1>", "<p>", "</p>", "<a href=/\" / \">", "</a>" ]

 

The tags variable is a string that contains several HTML tags. The regexTags pattern starts at a "<" and stops at the nearest ">", thanks to the "?" right after the "*". With this pattern we can get the opening and closing tags in a string and use them however we want.
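If you also want the opening and closing tags in two separate lists, one possible follow-up is to filter the matches. This is a minimal sketch on a shorter, made-up string; the names html, allTags, openTags and closeTags are assumptions for illustration, and a regex like this is only suitable for simple strings, not for parsing real HTML documents:

```javascript
// Hypothetical short HTML string for this illustration
const html = '<h1>Title</h1><p>Some text</p>'

// match() returns null when nothing is found, so fall back to an empty array
const allTags = html.match(/<.*?>/g) || []

// Closing tags start with "</"; everything else is treated as an opening tag
const openTags = allTags.filter(tag => !tag.startsWith('</'))
const closeTags = allTags.filter(tag => tag.startsWith('</'))

console.log(openTags)
// => [ "<h1>", "<p>" ]
console.log(closeTags)
// => [ "</h1>", "</p>" ]
```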

Conclusion

In this article, we learned that the "+" sign makes the character before it match one or more times, and the "*" sign makes the character before it match zero or more times. If you add a "?" after the "*", the match becomes lazy and stops at the nearest possible character. We also used the "|" sign to combine patterns. You've now covered some of the more advanced parts of regular expressions. In the next article, we will go through the shorthand ways of writing regular expressions.

If you have any comments, feel free to comment below. Thank you for joining me. Have a good day!

