Intro to regex

Meta Characters

Regular expressions have three basic quantifier metacharacters (?, +, and *), plus several brace forms ({n,m} and its variants). All of them specify how many times to match the preceding pattern.

Meta Char   Definition                                                         Example
?           Match the preceding pattern 0 or 1 times                           /mea?t/ will match "meat" and "met"
+           Match the preceding pattern 1 or more times                        /hel+p/ will match "help", "hellp", and "helllllllllp"
*           Match the preceding pattern 0 or more times                        /pou*nd/ will match "pound", "pond", and "pouuuuuund"
{n,m}       Match the preceding pattern at least n, but not more than m times  /ad{1,3}/ will match "ad", "add", and "addd"
{n,}        Match the preceding pattern at least n times                       /ad{2,}/ will match "add", "addddd", and "adddddddddd"
{,m}        Match the preceding pattern at most m times                        /ad{,2}/ will match "add", "ad", and "a"
{n}         Match the preceding pattern exactly n times                        /ad{2}/ will match "add" and not "ad" or "a"
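To try the quantifiers above yourself, here is a minimal sketch in Python (an assumption; these notes do not name a language). The helper name matches and the extra test words are made up for illustration. It uses re.fullmatch so that the whole string has to match, which makes the upper limits easy to see. Note that not every regex flavor accepts {,m}; writing {0,m} is the portable spelling.

```python
import re

# Each pattern from the table, paired with a few strings to try.
cases = {
    r"mea?t": ["meat", "met", "meaat"],
    r"hel+p": ["help", "hellp", "hep"],
    r"pou*nd": ["pound", "pond", "pouuund"],
    r"ad{1,3}": ["ad", "addd", "adddd"],
    r"ad{2,}": ["ad", "add", "addddd"],
    r"ad{,2}": ["a", "ad", "add", "addd"],
    r"ad{2}": ["a", "ad", "add"],
}

def matches(pattern, text):
    """Return True if the entire text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

for pattern, words in cases.items():
    results = {word: matches(pattern, word) for word in words}
    print(f"/{pattern}/ -> {results}")
```

Each printed line shows, for one pattern, which of the sample words matched in full.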

Now that we know how to specify the number of occurrences of a pattern, we need to know how to build the pattern itself.

To give a list of possible characters to match, we group them using [ ] (this is usually called a character class). The group matches exactly one character from the list.

  /b[aei]t/

This will match "bat", "bet", and "bit"

  /[bc]a[rt]/

This will match "bat", "bar", "cat", and "car"
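The two [ ] examples above can be checked with a short sketch. Python is an assumption here, and the word list is made up for illustration; re.fullmatch is used so that each word has to match from start to end.

```python
import re

words = ["bat", "bet", "bit", "but", "bar", "cat", "car", "cab"]

# [aei] accepts exactly one of the listed characters in that position.
vowel_class = re.compile(r"b[aei]t")
# Two groups in one pattern: [bc] for the first letter, [rt] for the last.
two_classes = re.compile(r"[bc]a[rt]")

vowel_hits = [w for w in words if vowel_class.fullmatch(w)]
two_class_hits = [w for w in words if two_classes.fullmatch(w)]

print(vowel_hits)      # ['bat', 'bet', 'bit']
print(two_class_hits)  # ['bat', 'bar', 'cat', 'car']
```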

Ranges

We can also specify ranges of characters based on ASCII values. For example:

  /[a-z]_file/

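This will match "a_file", "b_file", and so on through "z_file", but not "A_file" through "Z_file". A longer name such as "aa_file" is not matched as a whole, although the unanchored pattern will still find "a_file" inside it.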

To match all letters, we use:

  /[a-zA-Z]/

Want to match letters and numbers? Use this:

  /[0-9a-zA-Z]/
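A small sketch of the range examples, again assuming Python. It contrasts matching the whole string (re.fullmatch) with finding the pattern anywhere inside it (re.search), since the two give different answers for a name like "aa_file". The variable names are hypothetical.

```python
import re

lower_file = re.compile(r"[a-z]_file")

report = {}
for name in ["a_file", "z_file", "A_file", "aa_file"]:
    whole = lower_file.fullmatch(name) is not None  # entire string must match
    inside = lower_file.search(name) is not None    # match anywhere in the string
    report[name] = (whole, inside)
    print(f"{name!r}: whole={whole}, inside={inside}")

# [0-9a-zA-Z] accepts a single ASCII letter or digit.
alnum = re.compile(r"[0-9a-zA-Z]")
alnum_hits = [ch for ch in "aZ5_-" if alnum.fullmatch(ch)]
print(alnum_hits)  # ['a', 'Z', '5']
```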

Luckily there are some shortcuts for us. They are the following (and on page 246 in your book).

  \d matches digits (i.e. /[0-9]/)
  \w matches "word" chars, including _ (i.e. /[0-9a-zA-Z_]/)
  \s matches whitespace: spaces, tabs, newlines, and carriage returns (i.e. /[ \r\n\t]/)
  \D is the opposite of \d
  \W is the opposite of \w
  \S is the opposite of \s
  . will match any character except a newline
  ^ at the start of a [ ] group ([^abc]) acts as an "anything but" operator
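The shortcuts above can be seen in action with the following sketch, assuming Python. The sample strings are made up. The re.ASCII flag is passed because, without it, Python 3 lets \d, \w, and \s match Unicode characters beyond the ASCII sets listed above.

```python
import re

sample = "Room 42\tis_open\n"

digits = re.findall(r"\d", sample, flags=re.ASCII)
words = re.findall(r"\w+", sample, flags=re.ASCII)
spaces = re.findall(r"\s", sample, flags=re.ASCII)
non_digits = re.findall(r"\D+", "ab12cd", flags=re.ASCII)
not_abc = re.findall(r"[^abc]", "aabbccxyz")  # anything but a, b, or c
dot_hits = re.findall(r".", "a\nb")           # . skips the newline

print(digits)      # ['4', '2']
print(words)       # ['Room', '42', 'is_open']
print(spaces)      # [' ', '\t', '\n']
print(non_digits)  # ['ab', 'cd']
print(not_abc)     # ['x', 'y', 'z']
print(dot_hits)    # ['a', 'b']
```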

Making it all come together...

So now we know how to 1) specify the number of occurrences and 2) group characters.

What will the following regular expressions match?

  /bo{1,2}[kt]/
  /m[ea]{,2}[lt]/
  /file_[0-9]_j[au]n_[0-9]{4}/
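
Once you have written down your guesses, one way to check them is a small script like the sketch below. It assumes Python, and the candidate strings and the helper name check are hypothetical; swap in your own guesses. It tests each candidate as a whole string with re.fullmatch.

```python
import re

patterns = [
    r"bo{1,2}[kt]",
    r"m[ea]{,2}[lt]",
    r"file_[0-9]_j[au]n_[0-9]{4}",
]

# Hypothetical candidate strings; replace them with your own guesses.
candidates = [
    "bot", "book", "boook",
    "mt", "meal", "meeal",
    "file_3_jan_2024", "file_12_jun_2024",
]

def check(pattern, text):
    """Return True if the entire text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

for pattern in patterns:
    hits = [c for c in candidates if check(pattern, c)]
    print(f"/{pattern}/ matches: {hits}")
```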