Вивчіть регулярні вирази за допомогою цього безкоштовного курсу

“Деякі люди, стикаючись з проблемою, думають:“ Я знаю, я буду використовувати регулярні вирази ”. Зараз у них дві проблеми ". -Джеймі Завінський

Для деяких людей використання регулярних виразів може бути проблемою. Але це не повинно бути проблемою для вас. Ця стаття - повний курс регулярних виразів.

1. Вступ

Регулярні вирази, або просто RegEx, використовуються майже у всіх мовах програмування для визначення шаблону пошуку, який можна використовувати для пошуку речей у рядку.

Я розробив безкоштовний повний відео-курс на Scrimba.com, щоб навчити основ регулярних виразів.

Ця стаття містить курс у письмовій формі. Але якщо ви віддаєте перевагу переглядати відеоверсію з інтерактивними уроками, ви можете ознайомитися з нею на Scrimba. Розділи цієї статті відповідають розділам курсу Ссімба.

Цей курс слід разом із навчальною програмою RegEx на сайті freeCodeCamp.org. Ви можете перевірити це на завдання кодування та отримати сертифікат.

Ці уроки зосереджені на використанні RegEx у JavaScript, але принципи застосовуються у багатьох інших мовах програмування, які ви можете вибрати. Якщо ви ще не знаєте базового JavaScript, може бути корисно, якщо ви трохи його висвітлите. У мене також є базовий курс JavaScript, до якого ви можете отримати доступ на Scrimba та на YouTube-каналі freeCodeCamp.org.

Тож давайте почнемо! Ви врятуєте день в найкоротші терміни. ?

2. Використання методу випробування

Щоб зіставити частини рядків за допомогою RegEx, нам потрібно створити шаблони, які допоможуть вам зробити це співставлення. Ми можемо вказати, що щось є шаблоном RegEx, ставлячи шаблон між косими рисками /, наприклад /pattern-we-want-to-match/.

Давайте розглянемо приклад:

// We want to check the following sentencelet sentence = "The dog chased the cat."
// and this is the pattern we want to match.let regex = /the/

Зверніть увагу, як ми вживаємо, /the/щоб вказати, що ми шукаємо “те” у своєму sentence.

Ми можемо використовувати test()метод RegEx, щоб визначити, присутній шаблон у рядку чи ні.

// String we want to testlet myString = "Hello, World!";
// Pattern we want to findlet myRegex = /Hello/;
// result is now truelet result = myRegex.test(myString);

3. Збіг буквальних рядків

Давайте зараз знайдемо Вальдо.

let waldoIsHiding = "Somewhere Waldo is hiding in this text.";let waldoRegex = /Waldo/;
// test() returns true, so result is now also truelet result = waldoRegex.test(waldoIsHiding);

Зверніть увагу, що в цьому прикладі waldoRegexчутливі до регістру, тому, якщо ми писали /waldo/з малою буквою 'w', тоді наш resultби був хибним.

4. Порівняйте буквальний рядок з різними можливостями

RegEx також має ORоператор, який є |символом.

let petString = "James has a pet cat.";
// We can now try to find if either of the words are in the sentencelet petRegex = /dog|cat|bird|fish/;
let result = petRegex.test(petString);

5. Ігнорувати регістр під час збігу

До цього часу ми розглядали закономірності, коли регістр літер мав значення. Як ми можемо зробити так, щоб наші шаблони RegEx не враховували регістр?

Щоб ігнорувати регістр, ми можемо це зробити, додавши iпрапор в кінці шаблону, наприклад /some-pattern/i.

let myString = "freeCodeCamp";
// We ignore case by using 'i' flaglet fccRegex = /freecodecamp/i;
// result is truelet result = fccRegex.test(myString);

6. Витяг сірників

Коли ми хочемо витягти відповідне значення, ми можемо використовувати match()метод.

let extractStr = "Extract the word 'coding' from this string.";
let codingRegex = /coding/;
let result = extractStr.match(codingRegex);
console.log(result);
// Terminal will show: // > ["coding"]

7. Знайдіть більше, ніж перший матч

Тепер, коли ми знаємо, як витягти одне значення, а також можна витягти кілька значень за допомогою gпрапора

let testStr = "Repeat, Repeat, Repeat";
let ourRegex = /Repeat/g;
testStr.match(ourRegex); // returns ["Repeat", "Repeat", "Repeat"]

Ми також можемо поєднувати gпрапор із iпрапором, щоб виділити кілька збігів та ігнорувати кожух.

let twinkleStar = "Twinkle, twinkle, little star";
let starRegex = /twinkle/ig;// writing /twinkle/gi would have the same result.
let result = twinkleStar.match(starRegex);
console.log(result);
// Terminal will show: // > ["Twinkle", "twinkle"]

8. Зіставте що-небудь із періодом підстановки

У RegEx .є символ підстановки, який би відповідав будь-чому.

let humStr = "I'll hum a song";
let hugStr = "Bear hug";
// Looks for anything with 3 characters beginning with 'hu'let huRegex = /hu./;
humStr.match(huRegex); // Returns ["hum"]
hugStr.match(huRegex); // Returns ["hug"]

9. Зіставте одного персонажа з декількома можливостями

Matching any character is nice, but what if we want to restrict the matching to a predefined set of characters? We can do by using [] inside our RegEx.

If we have /b[aiu]g/, it means that we can match ‘bag’, ‘big’ and ‘bug’.

If we want to extract all the vowels from a sentence, this is how we can do it using RegEx.

let quoteSample = "Beware of bugs in the above code; I have only proved it correct, not tried it.";
let vowelRegex = /[aeiou]/ig;
let result = quoteSample.match(vowelRegex);

10. Match Letters of the Alphabet

But what if we want to match a range of letters? Sure, let’s do that.

let quoteSample = "The quick brown fox jumps over the lazy dog.";
// We can match all the letters from 'a' to 'z', ignoring casing. let alphabetRegex = /[a-z]/ig;
let result = quoteSample.match(alphabetRegex);

11. Match Numbers and Letters of the Alphabet

Letters are good, but what if we also want numbers?

let quoteSample = "Blueberry 3.141592653s are delicious.";
// match numbers between 2 and 6 (both inclusive), // and letters between 'h' and 's'. let myRegex = /[2-6h-s]/ig;
let result = quoteSample.match(myRegex);

12. Match Single Characters Not Specified

Sometimes it’s easier to specify characters that you don’t want to watch. These are called ‘Negated Characters’ and in RegEx you can do it by using ^.

let quoteSample = "3 blind mice.";
// Match everything that is not a number or a vowel. let myRegex = /[^0-9aeiou]/ig;
let result = quoteSample.match(myRegex);// Returns [" ", "b", "l", "n", "d", " ", "m", "c", "."]

13. Match Characters that Occur One or More Times

If you want to match a characters that occurs one or more times, you can use +.

let difficultSpelling = "Mississippi";
let myRegex = /s+/g;
let result = difficultSpelling.match(myRegex);// Returns ["ss", "ss"]

14. Match Characters that Occur Zero or More Times

There is also a * RegEx quantifier. This one matches even 0 occurrences of a character. Why might this be useful? Most of the time it’s usually in combination with other characters. Let’s look at an example.

let soccerWord = "gooooooooal!";
let gPhrase = "gut feeling";
let oPhrase = "over the moon";
// We are trying to match 'g', 'go', 'goo', 'gooo' and so on. let goRegex = /go*/;
soccerWord.match(goRegex); // Returns ["goooooooo"]
gPhrase.match(goRegex); // Returns ["g"]
oPhrase.match(goRegex); // Returns null

15. Find Characters with Lazy Matching

Sometimes your pattern matches can have more than one outcome. For example, let’s say I’m looking for a pattern in a word titanic and my matched values must begin with a ‘t’ and end with an ‘i’. My possible results are ‘titani’ and ‘ti’.

This is why RegEx has a concepts of ‘Greedy Match’ and ‘Lazy Match’.

Greedy match finds thelongest possible match of the string that fits the RegEx pattern, this is a default RegEx match:

let string = "titanic";
let regex = /t[a-z]*i/;
string.match(regex);// Returns ["titani"]

Lazy match finds theshortest possible match of the string that fits the RegEx pattern and to use it we need to use ?:

let string = "titanic";
let regex = /t[a-z]*?i/;
string.match(regex);// Returns ["ti"]

16. Find One or More Criminals in a Hunt

Now let’s have a look at a RegEx challenge. We need to find all the criminals (‘C’) in a crowd. We know that they always stay together and you need to need to write a RegEx that would find them.

let crowd = 'P1P2P3P4P5P6CCCP7P8P9';
let reCriminals = /./; // Change this line
let matchedCriminals = crowd.match(reCriminals);

You can find me walking through the solution in this Scrimba cast.

17. Match Beginning String Patterns

RegEx also allows you to match patterns that are only at the beginning of a string. We’ve already talked about ^ creating a negating set. We can use the same symbol to find a match only at the beginning of a string.

let calAndRicky = "Cal and Ricky both like racing.";
// Match 'Cal' only if it's at the beginning of a string. let calRegex = /^Cal/;
let result = calRegex.test(calAndRicky); // Returns true
let rickyAndCal = "Ricky and Cal both like racing.";
let result = calRegex.test(rickyAndCal); // Returns false

18. Match Ending String Patterns

What about matching a pattern at the end of a string? We can use $ for that.

let caboose = "The last car on a train is the caboose";
// Match 'caboose' if it's at the end of a string.let lastRegex = /caboose$/;
let result = lastRegex.test(caboose); // Returns true

19. Match All Letters and Numbers

Earlier in parts 10 and 11 I showed you how we can match ranges of letters and numbers. If I asked you to write a RegEx that matches all the letters and numbers and ignore their cases you probably would have written something like /[a-z0-9]/gi and that’s exactly right. But it’s a bit too long.

RegEx has something called ‘Shorthand Character Classes’, which is basically a shorthand for common RegEx expression. For matching all letters and numbers we can use \w and we also get underscore _ matched as a bonus.

let quoteSample = "The five boxing wizards jump quickly.";
// Same as /[a-z0-9_]/gi to match a-z (ignore case), 0-9 and _let alphabetRegexV2 = /\w/g;
// The length of all the characters in a string// excluding spaces and the period. let result = quoteSample.match(alphabetRegexV2).length;
// Returns 31

20. Match Everything But Letters and Numbers

If we want to do the opposite and match everything that is not a letter or a number (also exclude underscore _), we can use \W

let quoteSample = "The five boxing wizards jump quickly.";
// Match spaces and the periodlet nonAlphabetRegex = /\W/g;
let result = quoteSample.match(nonAlphabetRegex).length;
// Returns 6

21. Match All Numbers

Ok, what about if you want only numbers? Is there a shorthand character class for that? Sure, it’s \d.

let numString = "Your sandwich will be $5.00";
// Match all the numberslet numRegex = /\d/g;
let result = numString.match(numRegex).length; // Returns 3

22. Match All Non-Numbers

Would you like the opposite and match all the non-numbers? Use \D

let numString = "Your sandwich will be $5.00";
// Match everything that is not a numberlet noNumRegex = /\D/g;
let result = numString.match(noNumRegex).length; // Returns 24

23. Restrict Possible Usernames

So far so good! Well done for making it this far. RegEx can be tricky as it’s not the most easily readable way to code. Let’s now look at a very real-life example and make a username validator. In this case you have 3 requirements:

  • If there are numbers, they must be at the end.
  • Letters can be lowercase and uppercase.
  • At least two characters long. Two-letter names can’t have numbers.

Try to solve this on your own and if you find it difficult or just want to check the answer, check out my solution.

24. Match Whitespace

Can we match all the whitespaces? Of course, we can use a shorthand for that too and it’s \s

let sample = "Whitespace is important in separating words";
// Match all the whitespaceslet countWhiteSpace = /\s/g;
let result = sample.match(countWhiteSpace);
// Returns [" ", " ", " ", " ", " "]

25. Match Non-Whitespace Characters

Can you guess how to match all non-whitespace characters? Well done, it’s \S!

let sample = "Whitespace is important in separating words";
// Match all non-whitespace characterslet countWhiteSpace = /\S/g;
let result = sample.match(countWhiteSpace);

26. Specify Upper and Lower Number of Matches

You can specify the lower and upper number of pattern matches with ‘Quantity Specifiers’. They can be used with {} syntax, for example {3,6}, where 3 is the lower bound and 6 is the upper bound to be matched.

let ohStr = "Ohhh no";
// We want to match 'Oh's that have 3-6 'h' characters in it. let ohRegex = /Oh{3,6} no/;
let result = ohRegex.test(ohStr); // Returns true

27. Specify Only the Lower Number of Matches

When we want to specify only the lower bound, we can do it by omitting the upper bound, for example to match at least three characters we can write {3,}. Notice that we still need a comma, even when we don’t specify the upper limit.

let haStr = "Hazzzzah";
// Match a pattern that contains at least for 'z' characterslet haRegex = /z{4,}/;
let result = haRegex.test(haStr); // Returns true

28. Specify Exact Number of Matches

In the previous section I mentioned that we need a comma in {3,} when we specify only the lower bound. The reason is when you write {3} without a comma, it means that you are looking to match exactly 3 characters.

let timStr = "Timmmmber";
// let timRegex = /Tim{4}ber/;
let result = timRegex.test(timStr); // Returns true

29. Check for All or None

There are times when you might want to specify a possible existence of a character in your pattern. When a letter or a number is optional and we would use ? for that.

// We want to match both British and American English spellings // of the word 'favourite'
let favWord_US = "favorite";let favWord_GB = "favourite";
// We match both 'favorite' and 'favourite' // by specifying that 'u' character is optionallet favRegex = /favou?rite/; // Change this line
let result1 = favRegex.test(favWord_US); // Returns truelet result2 = favRegex.test(favWord_GB); // Returns true

30. Positive and Negative Lookahead

Lookaheads’ are patterns that tell your JS to lookahead to check for patterns further along. They are useful when you’re trying to search for multiple patterns in the same strings. There 2 types of lookaheads — positive and negative.

Positive lookahead uses ?= syntax

let quit = "qu";
// We match 'q' only if it has 'u' after it. let quRegex= /q(?=u)/;
quit.match(quRegex); // Returns ["q"]

Negative lookahead uses ?! syntax

let noquit = "qt";
// We match 'q' only if there is no 'u' after it. let qRegex = /q(?!u)/;
noquit.match(qRegex); // Returns ["q"]

31. Reuse Patterns Using Capture Groups

Let’s imagine we need to capture a repeating pattern.

let repeatStr = "regex regex";
// We want to match letters followed by space and then letterslet repeatRegex = /(\w+)\s(\w+)/;
repeatRegex.test(repeatStr); // Returns true

Instead of repeating (\w+) at the end we can tell RegEx to repeat the pattern, by using \1. So the same as above can be written again as:

let repeatStr = "regex regex";
let repeatRegex = /(\w+)\s\1)/;
repeatRegex.test(repeatStr); // Returns true

32. Use Capture Groups to Search and Replace

When we find a match, it’s sometimes handy to replaced it with something else. We can use replace() method for that.

let wrongText = "The sky is silver.";
let silverRegex = /silver/;
wrongText.replace(silverRegex, "blue");
// Returns "The sky is blue."

33. Remove Whitespace from Start and End

Here’s a little challenge for you. Write a RegEx that would remove any whitespace around the string.

let hello = " Hello, World! ";
let wsRegex = /change/; // Change this line
let result = hello; // Change this line

If you get stuck or just want to check my solution, feel free to have a look at the Scrimba cast where I solve this challenge.

34. Conclusion

Congratulations! You have finished this course! If you’d like to keep learning more, feel free to checkout this YouTube playlist, that has a lot of JavaScript projects you can create.

Keep learning and thanks for reading!

You are now ready to play regex golf. ?