
JavaScript #21: Regex in JavaScript - Part 4

 


Welcome back to this series about regular expressions in JavaScript. If you like this topic, you can follow this link to read the rest of the series: Series about regular expression

In this article, let's discuss some more parts of regular expressions.

Table of contents:

  • Match whitespace
  • Match no whitespace
  • Specify exact number of matches
  • Check for all or none
  • Positive and Negative lookahead
  • Another way to use positive and negative lookahead
  • Practical application

Let’s go to the contents!

Match whitespace

You've already come across the pattern "\w", which matches characters in the range [a-zA-Z0-9_], and "\d", which matches digits in the range [0-9]. Now let's search for whitespace with the pattern "\s". It matches spaces as well as tabs and line breaks.

let str = 'abc 123 $%^ XYZ'

let regex = /\s/g

console.log(str.match(regex))

// => Array [ " ", " ", " " ]

 

In the string above we have lowercase letters, uppercase letters, numbers, special characters, and spaces. The pattern "\s" picks out only the whitespace characters.
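To see that "\s" covers more than the space character, here is a small sketch. The sample string and the helper name splitOnWhitespace are made up for illustration.

```javascript
// "\s" matches any whitespace character, not only the space " ".
const mixed = 'abc\t123\n$%^ XYZ'

// Each match is one whitespace character: a tab, a line break, a space.
const whitespaceFound = mixed.match(/\s/g)
console.log(whitespaceFound.length) // => 3

// Hypothetical helper: split a string on runs of whitespace.
function splitOnWhitespace(text) {
    return text.split(/\s+/)
}

console.log(splitOnWhitespace(mixed))
// => Array [ "abc", "123", "$%^", "XYZ" ]
```

Using "\s+" instead of "\s" in split() treats several whitespace characters in a row as a single separator.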

Match no whitespace

Just as "\w" has "\W" and "\d" has "\D", "\s" has the opposite pattern "\S", which matches any character that is not whitespace. Let's see an example.

let str = 'abc 12$^XZ'

let regex = /\S/g

console.log(str.match(regex))

// => Array [ "a", "b", "c", "1", "2", "$", "^", "X", "Z" ]

 

Every character that is not whitespace is included in the returned result: lowercase letters, uppercase letters, special characters, and numbers.
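As a small sketch of how "\S" can be used, the following counts the non-whitespace characters in a string. The helper name countNonWhitespace is an assumption for illustration only.

```javascript
// Hypothetical helper: count the characters that are not whitespace.
function countNonWhitespace(text) {
    // match() returns null when nothing matches, so fall back to an empty array.
    const matches = text.match(/\S/g) || []
    return matches.length
}

console.log(countNonWhitespace('abc 12$^XZ')) // => 9
console.log(countNonWhitespace('   '))        // => 0

// "\S+" groups consecutive non-whitespace characters into one match.
const chunks = 'abc 12$^XZ'.match(/\S+/g)
console.log(chunks)
// => Array [ "abc", "12$^XZ" ]
```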

Specify exact number of matches

If you have read part three of this series, you already know how to match a character repeated a number of times within a range, using the pattern {number1,number2}. But what if we want to match a character that is repeated an exact number of times?

let str1 = "tommmmmy"

let str2 = "tommmy"

let str3 = "tommy"

let str4 = "tommmmy"

 

let regex = /tom{3}y/ // no "g" flag: with "g", test() remembers lastIndex between calls

console.log(regex.test(str1)) // => false

console.log(regex.test(str2)) // => true

console.log(regex.test(str3)) // => false

console.log(regex.test(str4)) // => false

 

The pattern is "{number}", where number is the exact count we want, and it applies to the character right before it. In the example above, "m{3}" requires exactly three "m"s between "to" and "y", so the words where "m" is repeated two, four, or five times return false.

To sum up the forms of this pattern: "{number1,number2}" matches between number1 and number2 repetitions, "{number,}" matches at least number repetitions, and "{number}" matches exactly number repetitions. Note that there is no space after the comma, because with a space JavaScript does not treat the braces as a repetition count. Pick the form that fits your requirement.
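The three forms can be compared side by side on the same words. This is only a sketch, and the variable names are chosen for illustration.

```javascript
// Words with 2, 3, 4 and 5 "m"s.
const words = ['tommy', 'tommmy', 'tommmmy', 'tommmmmy']

// No "g" flag, so test() keeps no state between calls.
const exactlyThree = /tom{3}y/   // exactly 3 "m"s
const atLeastThree = /tom{3,}y/  // 3 or more "m"s
const threeToFour = /tom{3,4}y/  // 3 or 4 "m"s

const exactMatches = words.filter(w => exactlyThree.test(w))
const atLeastMatches = words.filter(w => atLeastThree.test(w))
const rangeMatches = words.filter(w => threeToFour.test(w))

console.log(exactMatches)
// => Array [ "tommmy" ]
console.log(atLeastMatches)
// => Array [ "tommmy", "tommmmy", "tommmmmy" ]
console.log(rangeMatches)
// => Array [ "tommmy", "tommmmy" ]
```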

Check for all or none

Let's go into the example, and we will analyze this case.

let str = "British words have neighbour, favour. American words have neighbor and favor."

let regex = /\w+ou?\w*/g

 

console.log(str.match(regex))

// => Array [ "words", "neighbour", "favour", "words", "neighbor", "favor" ]

 

Let's analyze the pattern "/\w+ou?\w*/" a little. "\w+" matches characters in the range [a-zA-Z0-9_] repeated at least once. The letter "o" is required, "u?" means the letter "u" may appear once or not at all, and "\w*" matches zero or more word characters. So this pattern finds words that contain the letter "o" after at least one other character, optionally followed by "u". That is why "words" is matched as well as both spellings of "neighbour" and "favour".
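If we only want the words with the two spellings, a stricter pattern helps. The sketch below is one possible variation, not the only one. It uses "\b" (a word boundary), which is an addition that this article does not cover.

```javascript
const sentence = "British words have neighbour, favour. American words have neighbor and favor."

// "ou?r" requires "o", an optional "u", then "r".
// "\b" (word boundary, an extra not covered in this article) requires the word to end there.
const spellings = sentence.match(/\w+ou?r\b/g)

console.log(spellings)
// => Array [ "neighbour", "favour", "neighbor", "favor" ]
```

"words" is no longer matched, because the "r" in "words" is followed by another letter.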

Positive and Negative lookahead

let str1 = 'qu'

let str2 = 'qt'

let positiveRegex = /q(?=u)/g

let negativeRegex = /q(?!u)/g

 

console.log(str1.match(positiveRegex)) // => Array [ "q" ]

console.log(str2.match(negativeRegex)) // => Array [ "q" ]

 

console.log(str1.match(negativeRegex)) // => null

console.log(str2.match(positiveRegex)) // => null

 

A lookahead is written inside parentheses "()" that start with the sign "?=" or "?!", followed by one or more characters to look for. With "?=" (positive lookahead), the pattern before the parentheses matches only if it is immediately followed by what is inside the parentheses. With "?!" (negative lookahead), the pattern before the parentheses matches only if it is not immediately followed by what is inside the parentheses. In both cases the text inside the parentheses is only checked, so it is not part of the returned match.

In the example above we have two words, "qu" and "qt", and two patterns, "/q(?=u)/" and "/q(?!u)/". With "/q(?=u)/", if the letter "q" is directly followed by "u" then "q" is returned, otherwise match() returns null. With "/q(?!u)/", if the letter "q" is not directly followed by "u" then "q" is returned, otherwise match() returns null.

Now you can tell "(?=)" and "(?!)" apart and use whichever one fits your case.
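Here is a slightly bigger sketch of both lookaheads, using a made-up CSS-like string. It shows that the text inside the lookahead is checked but not returned.

```javascript
const css = 'width: 100px; height: 50%; margin: 8px'

// Positive lookahead: digits only when "px" comes right after them.
// "px" itself is not part of the match.
const pxValues = css.match(/\d+(?=px)/g)
console.log(pxValues)
// => Array [ "100", "8" ]

// Negative lookahead: digits that are not followed by "px".
// "\d" is also excluded, otherwise "10" would match inside "100px",
// because "10" is followed by "0" and not by "px".
const otherValues = css.match(/\d+(?!\d|px)/g)
console.log(otherValues)
// => Array [ "50" ]
```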

Another way to use positive and negative lookahead

let str1 = 'abcdefgh'

let str2 = 'abcd'

let regexStr = /(?=\w{6})/ // the "g" flag is not needed with test()

 

console.log(regexStr.test(str1)) // => true

console.log(regexStr.test(str2)) // => false

 

In the previous section we used the pattern "/(?=)/" to check whether a character is followed by the characters we specified. In this section we use "/(?=)/" to check for a minimum number of characters.

If you have read Regex in JavaScript – Part 3, we already have a way to require a minimum number of repetitions with the pattern "{number,}", where "number" is the minimum. Here we have another way to handle this problem: the pattern "/(?=)/" combined with "{number}".

As you can see in the example above, we have two words of different lengths. The lookahead "(?=\w{6})" succeeds at any position that is followed by six word characters in a row. A word made of at least 6 such characters returns true, and a shorter one returns false.

Let's go through an example with numbers to better understand this pattern. The requirement is that the word contains at least four consecutive digits.

let str1 = 'user12resu'

let str2 = 'user124resu56'

let str3 = 'user12456resu'

 

let regex = /(?=\D*\d{4})/ // the "g" flag is not needed with test()

console.log(regex.test(str1)) // => false

console.log(regex.test(str2)) // => false

console.log(regex.test(str3)) // => true

 

In the pattern, "\D*" means that zero or more non-digit characters may come first, and "\d{4}" means that four digits must follow in a row. Only "user12456resu" contains a run of four or more digits, so it is the only one that returns true.
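Several lookaheads can be placed one after another, and all of them must succeed at the same position. The sketch below combines the two ideas from this section into a simple password check. The rule itself and the names passwordRule and looksValid are assumptions for illustration.

```javascript
// Hypothetical rule: the string starts with at least 6 word characters
// and contains at least one digit.
// "^" anchors both lookaheads to the beginning of the string.
const passwordRule = /^(?=\w{6})(?=\D*\d)/

function looksValid(password) {
    return passwordRule.test(password)
}

console.log(looksValid('abc123')) // => true
console.log(looksValid('abcdef')) // => false (no digit)
console.log(looksValid('ab1'))    // => false (too short)
```

Because a lookahead does not consume any characters, the second lookahead starts checking from the same position as the first one.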

Practical application

To apply what we have learned above, let's go through a real problem. If you work with JavaScript, you probably already know the trim() method. It removes the whitespace at the beginning and end of a string, if there is any. Now we will use a regular expression to do the same thing.

Let's analyze the problem a little. We can use the replace() method of String, but what do we replace with what? We create a regex that looks for whitespace at the beginning and end of the string and replace every match with "". That removes the whitespace at both ends of the string.

Let's build this method together!

String.prototype.myMethodTrim = function () {

    let regex = /^\s+|\s+$/g

    return this.replace(regex, '')

}

 

We manually define a new method on String called myMethodTrim. In this method we have the regex pattern "/^\s+|\s+$/g". "^\s+" matches whitespace appearing at least once at the beginning of the string, "\s+$" matches whitespace appearing at least once at the end, and "|" means either one. The "g" flag is needed so that both ends are replaced, not only the first match. We replace each match with "", which deletes that whitespace. So we have built a trim() method for ourselves.

let str = '      a string has whitespace   '

console.log(str.myMethodTrim())

// => a string has whitespace

 

Through testing we can see that the function myMethodTrim() worked as we expected.
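The same regex can also be split into two halves to trim only one side. The sketch below uses standalone functions with made-up names (trimBoth, trimStartOnly, trimEndOnly) so that String.prototype stays untouched, and compares the result with the built-in trim().

```javascript
// Hypothetical standalone helpers based on the same pattern.
function trimBoth(text) {
    return text.replace(/^\s+|\s+$/g, '')
}

function trimStartOnly(text) {
    return text.replace(/^\s+/, '') // only the beginning
}

function trimEndOnly(text) {
    return text.replace(/\s+$/, '') // only the end
}

const padded = '\t  a string has whitespace \n'

console.log(trimBoth(padded) === padded.trim()) // => true

// JSON.stringify makes the remaining whitespace visible.
console.log(JSON.stringify(trimStartOnly(padded))) // => "a string has whitespace \n"
console.log(JSON.stringify(trimEndOnly(padded)))   // => "\t  a string has whitespace"
```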

Conclusion

We've gone through more regex patterns: "\s" and "\S", "{number}", "?", and the lookaheads (?=) and (?!). We also built a trim method of our own.

If you have any ideas, feel free to comment below. Thank you for joining me. Have a good day!
