Introduction to Regular Expressions
What we will see in this article:
- What is a regular expression?
- Some simple examples of RE.
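What is a Regular Expression?
A regular expression (RE) is a pattern that describes what to search for in a string of text. In the examples below, the pattern is shown to the left of the colon and the text being searched to the right.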
cat : the fat cat ran down the street.
at : the fat cat ran down the street.
it was searching for a mouse to eat.
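The two searches above can be tried in code. This is a minimal sketch, assuming Python and its standard re module (the article itself does not name a regex engine); the variable names are only illustrative.

```python
import re

# Sample text from the article, as two lines
text = "the fat cat ran down the street.\nit was searching for a mouse to eat."

# findall returns every non-overlapping match of the pattern
cat_matches = re.findall(r"cat", text)
at_matches = re.findall(r"at", text)

print(cat_matches)  # ['cat']
print(at_matches)   # ['at', 'at', 'at'] (from fat, cat and eat)
```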
Some Examples:
e+ matches one or more e's in a row, whereas a plain e matches one e at a time:
e : the fat cat ran down the street.
e+ : the fat cat ran down the street.
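As a small sketch of the difference (again assuming Python's re module), the same text can be searched with both patterns:

```python
import re

text = "the fat cat ran down the street."

single_e = re.findall(r"e", text)   # each e is a separate match
run_of_e = re.findall(r"e+", text)  # consecutive e's are merged into one match

print(single_e)  # ['e', 'e', 'e', 'e']
print(run_of_e)  # ['e', 'e', 'ee']
```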
ea? matches 'e' or 'ea', because whatever comes immediately before '?' is optional:
ea? : the fat cat ran down the street.
It was searching for a mouse to eat.
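A short sketch of the optional quantifier, assuming Python's re module:

```python
import re

text = "the fat cat ran down the street.\nIt was searching for a mouse to eat."

# 'a' is optional, so both 'e' and 'ea' are matched
matches = re.findall(r"ea?", text)

print(matches)  # ['e', 'e', 'e', 'e', 'ea', 'e', 'ea']
```

The two 'ea' matches come from searching and eat; every other e is matched on its own.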
re* matches 'r', 're', 'ree' and so on, because '*' says the preceding 'e' can occur zero or more times:
re* : the fat cat ran down the street.
It was searching for a mouse to eat.
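The zero-or-more quantifier can be sketched the same way (assuming Python's re module):

```python
import re

text = "the fat cat ran down the street.\nIt was searching for a mouse to eat."

# 'r' followed by zero or more 'e'
matches = re.findall(r"re*", text)

print(matches)  # ['r', 'ree', 'r', 'r']
```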
.at matches any single character followed by at, so it finds three-character strings that end with at. "." denotes any character:
.at : the fat cat ran down the street.
It was searching for a mouse to eat.
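A minimal sketch of the wildcard, assuming Python's re module:

```python
import re

text = "the fat cat ran down the street.\nIt was searching for a mouse to eat."

# '.' stands for any one character (except a newline by default)
matches = re.findall(r".at", text)

print(matches)  # ['fat', 'cat', 'eat']
```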
Multiple periods can be combined, as in t.. (a t followed by any two characters). By default, . does not match the newline character "\n":
t.. : the fat cat ran down the street.
It was searching for a mouse to eat.
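The newline behaviour is easiest to see by comparing the default mode with a mode where the period also matches "\n". The sketch below assumes Python's re module, where that mode is the re.DOTALL flag; other engines name it differently.

```python
import re

text = "the fat cat ran down the street.\nIt was searching for a mouse to eat."

# Default: the final 't' of "street" is followed by '.' and '\n',
# and '\n' is not matched by '.', so no match starts there.
default_matches = re.findall(r"t..", text)

# With DOTALL, '.' also matches '\n', so 't.\n' becomes a match.
dotall_matches = re.findall(r"t..", text, re.DOTALL)

print(default_matches)  # ['the', 't c', 't r', 'the', 'tre', 't w', 'to ']
print("t.\n" in dotall_matches)  # True
```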
To search for a literal period, put the escape character "\" before the period ".":
\. : The fat cat ran down the street.
It was searching for a mouse to eat.
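A quick sketch of escaping, assuming Python's re module. A raw string (r"...") is used so that the backslash reaches the regex engine unchanged.

```python
import re

text = "The fat cat ran down the street.\nIt was searching for a mouse to eat."

literal_periods = re.findall(r"\.", text)  # only real periods
any_char = re.findall(r".", "a.b")         # unescaped: any character

print(literal_periods)  # ['.', '.']
print(any_char)         # ['a', '.', 'b']
```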
Match any word character (a letter, digit, or underscore) with "\w". \W matches anything that is not a word character:
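As an illustration, assuming Python's re module and a made-up sample string:

```python
import re

text = "fat cat_1."  # hypothetical sample, not from the article

word_runs = re.findall(r"\w+", text)  # runs of letters, digits, underscores
non_word = re.findall(r"\W", text)    # everything else, one char at a time

print(word_runs)  # ['fat', 'cat_1']
print(non_word)   # [' ', '.']
```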


\s matches any kind of whitespace, such as a space, a tab "\t", a carriage return "\r", or a newline "\n". Its opposite, \S, matches anything that is not whitespace:
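A small sketch, assuming Python's re module and a made-up string that mixes a space, a tab, and a newline:

```python
import re

text = "a mouse\tto\neat"  # hypothetical sample with mixed whitespace

whitespace = re.findall(r"\s", text)       # each whitespace character
non_whitespace = re.findall(r"\S+", text)  # runs of non-whitespace

print(whitespace)      # [' ', '\t', '\n']
print(non_whitespace)  # ['a', 'mouse', 'to', 'eat']
```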


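\w{a,b} matches a run of word characters whose length is between a and b. Surround it with word boundaries, as in \b\w{a,b}\b, to match only whole words of that length: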
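The sketch below, assuming Python's re module, uses a=4 and b=5 (values chosen only for illustration) to show why word boundaries matter when whole words are wanted:

```python
import re

text = "the fat cat ran down the street."

# Without boundaries, the first 5 letters of "street" also match
runs = re.findall(r"\w{4,5}", text)

# With \b on both sides, only whole words of 4 to 5 characters match
whole_words = re.findall(r"\b\w{4,5}\b", text)

print(runs)         # ['down', 'stree']
print(whole_words)  # ['down']
```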

We can match any three-character string that starts with f or c and ends with at by using "[fc]at":

We can match any three-character string that starts with any letter a-z or A-Z and ends with at by using "[a-zA-Z]at":
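Both character classes can be compared on the sample text. A minimal sketch, assuming Python's re module:

```python
import re

text = "the fat cat ran down the street.\nIt was searching for a mouse to eat."

f_or_c = re.findall(r"[fc]at", text)          # first char must be f or c
any_letter = re.findall(r"[a-zA-Z]at", text)  # first char is any ASCII letter

print(f_or_c)      # ['fat', 'cat']
print(any_letter)  # ['fat', 'cat', 'eat']
```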


We can also use parentheses to group several alternatives separated by "|", which acts as an or condition. Ex. the following regular expression will search both The and the:
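The pattern itself is not shown in the text, so the sketch below assumes it is (T|t)he, which fits the description. It uses Python's re module; finditer is used because Python's findall returns only the captured group when a pattern contains one.

```python
import re

text = "The fat cat ran down the street."

# Assumed pattern: a group with two alternatives, then "he"
pattern = r"(T|t)he"

full_matches = [m.group(0) for m in re.finditer(pattern, text)]
groups_only = re.findall(pattern, text)  # returns the group, not the whole match

print(full_matches)  # ['The', 'the']
print(groups_only)   # ['T', 't']
```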

If we want to match 2 or 3 consecutive occurrences of (t|e|r):

If we want to match 2 or 3 consecutive occurrences of (t|e|r) followed by a period ".":
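The exact patterns are not shown in the text. The sketch below assumes they are (t|e|r){2,3} and (t|e|r){2,3}\., built from the {a,b} quantifier introduced earlier, and uses Python's re module.

```python
import re

text = "the fat cat ran down the street."

# Assumed patterns; group(0) is the whole match, not just the group
two_or_three = [m.group(0) for m in re.finditer(r"(t|e|r){2,3}", text)]
before_period = [m.group(0) for m in re.finditer(r"(t|e|r){2,3}\.", text)]

print(two_or_three)   # ['tre', 'et'] (both inside "street")
print(before_period)  # ['eet.']
```

The quantifier is greedy, so the run "treet" is split into 'tre' and 'et' rather than 'tr' and 'eet'.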

To match the group re occurring 2 to 3 times in a row, use the following regular expression:
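The pattern is not shown in the text; the sketch below assumes it is (re){2,3} and runs it on a made-up string with Python's re module.

```python
import re

text = "re rere rerere"  # hypothetical sample, not from the article

# Assumed pattern: the two-character group "re", repeated 2 to 3 times
matches = [m.group(0) for m in re.finditer(r"(re){2,3}", text)]

print(matches)  # ['rere', 'rerere']
```

The single "re" at the start is skipped because the group must repeat at least twice.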

^ anchors a pattern to the very beginning of the entire chunk of text. Similarly, $ anchors a pattern to the very end of the chunk:
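A sketch of both anchors, assuming Python's re module. The last line uses the re.MULTILINE flag, which is a Python option that makes ^ and $ apply to every line rather than to the whole text.

```python
import re

text = "the fat cat ran down the street.\nIt was searching for a mouse to eat."

at_start = re.findall(r"^the", text)   # only the very first "the"
at_end = re.findall(r"eat\.$", text)   # only at the very end
second_line = re.findall(r"^It", text) # "It" is not at the start of the text
per_line = re.findall(r"^\w+", text, re.MULTILINE)  # first word of each line

print(at_start)     # ['the']
print(at_end)       # ['eat.']
print(second_line)  # []
print(per_line)     # ['the', 'It']
```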


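Positive look-behind. The example below matches what comes right after The or the by using (?<=[tT]he). Note that [tT] is written without "|", because inside a character class "|" is a literal character: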
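A minimal sketch, assuming Python's re module and a trailing "." added to the look-behind so that one character after The or the is matched:

```python
import re

text = "The fat cat ran down the street."

# Match one character, but only if "The" or "the" comes right before it
matches = re.findall(r"(?<=[tT]he).", text)

print(matches)  # [' ', ' '] (the spaces after "The" and "the")
```

The look-behind itself consumes nothing, so only the character after it appears in the result.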

Negative look-behind, written (?<!...). The example below matches every character that is not directly preceded by The or the, which means every single character except the two spaces that follow those words:
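The pattern is cut off in the text. The sketch below assumes it is (?<![tT]he). which matches the description, and uses Python's re module.

```python
import re

text = "The fat cat ran down the street."

# Assumed pattern: any character NOT directly preceded by "The" or "the"
matches = re.findall(r"(?<![tT]he).", text)

print(len(text) - len(matches))  # 2 (the two skipped spaces)
print("".join(matches))          # Thefat cat ran down thestreet.
```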

Positive look-ahead. The example below matches the character that comes right before at by using .(?=at):
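A short sketch of this look-ahead, assuming Python's re module:

```python
import re

text = "the fat cat ran down the street.\nIt was searching for a mouse to eat."

# Match one character, but only if "at" follows it
matches = re.findall(r".(?=at)", text)

print(matches)  # ['f', 'c', 'e']
```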

Negative look-ahead. The example below matches every character that is not directly followed by at by using .(?!at), that is, everything except the f, c and e that come right before at:
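A sketch on a shortened sample (chosen here to keep the output readable), assuming Python's re module:

```python
import re

text = "the fat cat"  # shortened sample for readability

# Match one character, but only if "at" does NOT follow it
matches = re.findall(r".(?!at)", text)

print("".join(matches))  # the at at
```

The f and the c are skipped because each is followed by at; every other character is matched.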

Some real applications:
Find a 10-digit phone number with \d{10}, or with \d{3}-?\d{3}-?\d{4} or \d{3}[- ]?\d{3}[- ]?\d{4} for numbers that contain dashes or spaces:
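The three patterns can be compared on a few made-up numbers. This sketch assumes Python's re module and uses re.fullmatch so that the whole string must match; the helper name is hypothetical.

```python
import re

plain = r"\d{10}"
dashed = r"\d{3}-?\d{3}-?\d{4}"
dashed_or_spaced = r"\d{3}[- ]?\d{3}[- ]?\d{4}"

def accepts(pattern, number):
    """Return True if the whole string matches the pattern (illustrative helper)."""
    return re.fullmatch(pattern, number) is not None

# Made-up sample numbers
print(accepts(plain, "1234567890"))               # True
print(accepts(plain, "123-456-7890"))             # False
print(accepts(dashed, "123-456-7890"))            # True
print(accepts(dashed, "123 456 7890"))            # False
print(accepts(dashed_or_spaced, "123 456 7890"))  # True
```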

Put each part of the phone number in a separate group. Use parentheses for grouping. Ex (\d{3})[ -]?(\d{3})[ -]?(\d{4}):
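Once the parts are grouped, they can be read back individually. A minimal sketch, assuming Python's re module and a made-up number:

```python
import re

pattern = r"(\d{3})[ -]?(\d{3})[ -]?(\d{4})"

match = re.search(pattern, "call 123-456-7890 today")  # made-up sample
parts = match.groups()  # one entry per pair of parentheses

print(match.group(0))  # 123-456-7890
print(parts)           # ('123', '456', '7890')
```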

To give a name to a specific group, use the named-group syntax (?<name>...), which is written (?P<name>...) in Python:
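The example pattern is cut off in the text, so the sketch below is only an assumption of what it might look like. It uses Python's (?P<name>...) spelling of named groups; the group names area, prefix and line are hypothetical.

```python
import re

# Hypothetical group names; Python writes named groups as (?P<name>...)
pattern = r"(?P<area>\d{3})[ -]?(?P<prefix>\d{3})[ -]?(?P<line>\d{4})"

match = re.search(pattern, "123-456-7890")  # made-up sample
named_parts = match.groupdict()

print(match.group("area"))  # 123
print(named_parts)  # {'area': '123', 'prefix': '456', 'line': '7890'}
```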

Optional parentheses around the area code. Ex \(?(\d{3})\)?[ -]?(\d{3})[ -]?(\d{4}):
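A sketch of this pattern on two made-up numbers, assuming Python's re module:

```python
import re

pattern = r"\(?(\d{3})\)?[ -]?(\d{3})[ -]?(\d{4})"

with_parens = re.fullmatch(pattern, "(123) 456-7890")
without_parens = re.fullmatch(pattern, "123-456-7890")

# The escaped parentheses are outside the groups, so they are not captured
print(with_parens.groups())     # ('123', '456', '7890')
print(without_parens.groups())  # ('123', '456', '7890')
```

Because each parenthesis is optional on its own, the pattern also accepts an unbalanced input such as "(123 456 7890".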

Add the international code as well, using the non-capturing group syntax (?:____). Ex (?:(\+1)[ -]?)\(?(\d{3})\)?[ -]?(\d{3})[ -]?(\d{4}):
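The sketch below, assuming Python's re module and made-up numbers, shows that the non-capturing group adds no entry of its own to the results. It also shows a variant with a trailing ? (an addition here, not part of the article's pattern) that makes the international code optional.

```python
import re

pattern = r"(?:(\+1)[ -]?)\(?(\d{3})\)?[ -]?(\d{3})[ -]?(\d{4})"
# Variant (assumption): '?' after the non-capturing group makes the prefix optional
optional_prefix = r"(?:(\+1)[ -]?)?\(?(\d{3})\)?[ -]?(\d{3})[ -]?(\d{4})"

with_code = re.search(pattern, "+1 (123) 456-7890")
no_code = re.search(pattern, "123-456-7890")
no_code_optional = re.search(optional_prefix, "123-456-7890")

print(with_code.groups())         # ('+1', '123', '456', '7890')
print(no_code)                    # None, because the +1 prefix is required
print(no_code_optional.groups())  # (None, '123', '456', '7890')
```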
