Regular expressions have three basic quantifier metacharacters, plus several brace forms, which specify how many times the preceding pattern should match:
Meta Char | Definition | Example |
---|---|---|
? | Match the preceding pattern 0 or 1 times | /mea?t/ will match "meat" and "met" |
+ | Match the preceding pattern 1 or more times | /hel+p/ will match "help", "hellp", and "helllllllllp" |
* | Match the preceding pattern 0 or more times | /pou*nd/ will match "pound", "pond", and "pouuuuuund" |
{n,m} | Match the preceding pattern at least n, but not more than m, times | /ad{1,3}/ will match "ad", "add", and "addd" |
{n,} | Match the preceding pattern at least n times | /ad{2,}/ will match "add", "addddd", and "adddddddddd" |
{,m} | Match the preceding pattern at most m times | /ad{,2}/ will match "add", "ad", and "a" |
{n} | Match the preceding pattern exactly n times | /ad{2}/ will match "add" and not "ad" or "a" |
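
The table can be checked with a short script. This is a minimal sketch in Python, which is an assumption, since the notes do not name a language. It uses `re.fullmatch` so that the whole word has to fit the pattern, and the helper `matching_words` and the test words are made up for illustration. Python accepts the `{,m}` form, but some other regex engines do not, so that row may behave differently elsewhere.

```python
import re

# Each pattern from the table, paired with words to try it against.
checks = [
    (r"mea?t", ["meat", "met", "meaat"]),
    (r"hel+p", ["help", "hellp", "hep"]),
    (r"pou*nd", ["pound", "pond", "pouuuuuund"]),
    (r"ad{1,3}", ["ad", "addd", "adddd"]),
    (r"ad{2,}", ["ad", "add", "addddd"]),
    (r"ad{,2}", ["a", "ad", "add", "addd"]),
    (r"ad{2}", ["a", "ad", "add"]),
]


def matching_words(pattern, words):
    """Return the words that the pattern matches in full (hypothetical helper)."""
    return [w for w in words if re.fullmatch(pattern, w)]


results = {pattern: matching_words(pattern, words) for pattern, words in checks}
for pattern, matched in results.items():
    print(f"/{pattern}/ matches {matched}")
```

Words such as "meaat", "hep", and "adddd" are included on purpose to show where each quantifier stops matching.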
Now that we know how to specify the number of occurrences of a pattern, we need to know how to create the pattern itself.
To give a list of possible characters to match, we group them using [ ]
/b[aei]t/
This will match "bat", "bet", and "bit"
/[bc]a[rt]/
This will match "bat", "bar", "cat", and "car"
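To see which words a bracket group allows, one option is to generate a few candidates and filter them. This is a minimal Python sketch (the language is an assumption); the candidate letters "bcd" and "rtn" are chosen only for illustration.

```python
import re
from itertools import product

pattern = re.compile(r"[bc]a[rt]")

# Build every word of the form <first>a<last> from a few candidate letters.
candidates = [first + "a" + last for first, last in product("bcd", "rtn")]

# Keep only the words that the pattern matches in full.
matched = [w for w in candidates if pattern.fullmatch(w)]
print(matched)
```

Only the four combinations of b or c with r or t survive; words such as "ban" or "dar" are rejected because one of their letters is outside its group.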
We can also specify ranges of characters based on ASCII values. For example:
/[a-z]_file/
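This will match "a_file", "b_file", through "z_file", but not "A_file" through "Z_file". Because the group stands for exactly one character, "aa_file" does not match as a whole either, although a search that is not anchored to the start of the string will still find "a_file" inside it.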
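Whether "aa_file" counts as a match depends on whether the whole string must fit the pattern. This is a minimal Python sketch (the language is an assumption) that contrasts `re.fullmatch`, which requires the entire string to fit, with `re.search`, which accepts a match anywhere inside the string.

```python
import re

pattern = re.compile(r"[a-z]_file")
names = ["a_file", "z_file", "A_file", "aa_file"]

# fullmatch: the entire string must fit the pattern.
whole = [n for n in names if pattern.fullmatch(n)]

# search: the pattern may match anywhere inside the string.
anywhere = [n for n in names if pattern.search(n)]

print("whole string:", whole)
print("anywhere:    ", anywhere)
```

"A_file" fails both ways because an uppercase letter is outside the a-z range, while "aa_file" is found by `search` only because it contains "a_file".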
To match all letters, we use:
/[a-zA-Z]/
Want to match letters and numbers? Use this:
/[0-9a-zA-Z]/
Luckily there are some shortcuts for us. They are the following (and on page 246 in your book).
Shortcut | Meaning |
---|---|
\d | Matches digits (i.e. /[0-9]/) |
\w | Matches "word" chars and _ (i.e. /[0-9a-zA-Z_]/) |
\s | Matches spaces, tabs, new lines, and line returns (i.e. /[ \r\n\t]/) |
\D | The opposite of \d |
\W | The opposite of \w |
\S | The opposite of \s |
. | Matches any non-new-line character |
^ | In a group ([^abc]), acts as an "anything but" operator |
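
The shortcuts can be tried on a small piece of text. This is a minimal Python sketch (the language is an assumption, and the sample strings are invented). One caveat: for ordinary Python 3 strings, `\d` and `\w` also match non-ASCII digits and letters unless the `re.ASCII` flag is passed, so the bracket equivalents given above are exact only for ASCII input.

```python
import re

text = "file_7 saved\tat 10:45\n"

digits = re.findall(r"\d", text)    # single digit characters
words = re.findall(r"\w+", text)    # runs of letters, digits, and _
spaces = re.findall(r"\s", text)    # space, tab, and newline characters

non_digits = re.findall(r"\D+", "ab12cd")   # runs of anything but digits
not_abc = re.findall(r"[^abc]", "aXbYc")    # any character except a, b, or c
dot = re.findall(r".", "a\nb")              # . skips the newline

print(digits, words, spaces, non_digits, not_abc, dot)
```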
So now we know how to 1) match the number of occurrences and 2) group characters.
What will the following regular expressions match?
/bo{1,2}[kt]/
/m[ea]{,2}[lt]/
/file_[0-9]_j[au]n_[0-9]{4}/