10 practical examples of regex with grep


Grep, short for global regular expression print, searches text for lines that match a pattern. It can search files, whole directory trees (with the -r option), and the output of other commands.

A regular expression is a pattern that describes the text to match. It acts as a set of rules for pattern matching.

Grep is often used along with regular expressions to search for patterns in text. Let’s see some practical examples of regex with grep.

1. Matching a word irrespective of its case

Sometimes in a text, the same word can be written in different ways. This is most commonly the case with proper nouns. Instead of starting with an uppercase letter, sometimes they are written in all lowercase letters.

$ grep "[Jj]ayant" 

Both versions of the word are matched, whether it starts with an uppercase or a lowercase letter.

Another interesting case is the word ‘IoT’, which may appear in several case variations across a text. To match all of them irrespective of case, use:

$ grep "[iI][oO][tT]"
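
When every letter of a word may vary in case, grep’s -i option is a shorter alternative to writing one bracket expression per letter. Here is a minimal sketch comparing the two; the sample lines are made up for illustration and are not the article’s input.

```shell
# Hypothetical sample text; not the article's input file.
sample='IoT devices are everywhere.
An iot gateway was installed.
The IOT market is growing.
This line does not mention it.'

# Bracket expressions: each position accepts either case.
bracket_matches=$(printf '%s\n' "$sample" | grep -c "[iI][oO][tT]")

# -i ignores case for the whole pattern; -c counts matching lines.
ignore_case_matches=$(printf '%s\n' "$sample" | grep -ci "iot")

echo "bracket: $bracket_matches, -i: $ignore_case_matches"
```

Both commands report the same three lines. The bracket form is still useful when only some letters should be case-insensitive, as in "[Jj]ayant".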

2. Matching mobile number using regex with grep

Regular expressions can be used to extract mobile numbers from a text.

The format of the mobile number has to be known beforehand. For example, a regular expression designed to match mobile numbers won’t work for home telephone numbers.

In this example, we match mobile numbers in the following format: 91-1234567890 (i.e., two digits, a separator, then ten digits).

$ grep "[[:digit:]]\{2\}[ -]\?[[:digit:]]\{10\}"

Only numbers in the above-mentioned format are matched. The separator between the two groups may be a hyphen, a space, or absent, because [ -]\? makes it optional.
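
To see the pattern at work, here is a small sketch that runs it on made-up sample text (the numbers are hypothetical). It adds the -o option, which prints only the matched part of each line rather than the whole line.

```shell
# Hypothetical sample text; the numbers are made up.
sample='Mobile: 91-1234567890
Mobile without separator: 911234567890
Landline: 022-123456'

# \{2\} and \{10\} are basic-regex intervals.
# \? (optional separator) is a GNU extension in basic regex.
numbers=$(printf '%s\n' "$sample" |
  grep -o "[[:digit:]]\{2\}[ -]\?[[:digit:]]\{10\}")

printf '%s\n' "$numbers"
```

The landline is skipped because it never has ten digits in a row. Note that the pattern is not anchored, so it would also match twelve digits inside a longer run of digits.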

3. Matching an email address

Extracting email addresses from a text is very useful and can be achieved using grep.

An email address has a particular format. The part before the ‘@’ is the username that identifies the mailbox. Then there is a domain like gmail.com or yahoo.in.

The regular expression can be designed keeping these things in mind.

$ grep -E "[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}"
  • [A-Za-z0-9._%+-]+ matches the username before ‘@’
  • [A-Za-z0-9.-]+ matches the name of the domain without the ‘.com’ part
  • \.[A-Za-z]{2,6} matches the ‘.com’ or ‘.in’ etc.
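
The three parts can be tried together on a few made-up lines. This is only a sketch: the addresses are hypothetical, and -o is added so that just the address is printed instead of the whole line.

```shell
# Hypothetical sample text; the addresses are made up.
sample='Contact alice.smith@example.com for details.
Send logs to bob_99@mail.example.in today.
This line has no address, only an @ sign.'

# -E enables extended regex; -o prints only the matched address.
emails=$(printf '%s\n' "$sample" |
  grep -oE "[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}")

printf '%s\n' "$emails"
```

The last line is not matched because the ‘@’ has no username characters directly before it.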

4. URL checker

A URL has a particular format of representation. A regex can be built that verifies if a URL is in proper form or not.

A URL must start with http/https/ftp followed by ‘://’. Then there is the domain name which can end with ‘.com’, ‘.in’, ‘.org’ etc.

$ grep -E "^(http|https|ftp)://[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}"

The -E option used in this example and the previous one selects extended regular expressions (ERE) instead of basic regular expressions (BRE). In ERE, operators such as +, ?, |, (), and {} are special without a preceding backslash, which makes a complex regex less tiresome to write.
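
To make the difference between the two syntaxes concrete, here is a sketch that writes one URL check both ways and runs it on made-up URLs. The sample lines are hypothetical, and the BRE version relies on GNU extensions (\| and \+).

```shell
# Hypothetical sample lines; the URLs are made up.
sample='https://example.com/index.html
ftp://files.example.org
mailto:someone@example.com
see http://example.in for details'

# ERE (-E): ?, |, (), + and {} are special without a backslash.
ere=$(printf '%s\n' "$sample" |
  grep -E "^(https?|ftp)://[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}")

# BRE (no -E): the same operators need a backslash.
# \| and \+ are GNU extensions in basic regex.
bre=$(printf '%s\n' "$sample" |
  grep "^\(https\|http\|ftp\)://[a-zA-Z0-9.-]\+\.[a-zA-Z]\{2,4\}")

printf '%s\n' "$ere"
```

Both versions select the first two lines. The fourth line is rejected because ^ anchors the match to the start of the line.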

5. Finding files with a particular extension

The ls command lists the files in the current directory.

Running ls -l gives extra information about each file. Grep can be used with ls -l to match a pattern in its output.

To list only files with the extension ‘.txt’, use the command below. The dot is escaped so that it matches a literal ‘.’ rather than any character.

$ ls -l | grep '\.txt$'
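
Whether the dot is escaped makes a visible difference. The sketch below shows it in a throwaway directory; the file names are made up for illustration, and plain ls is used instead of ls -l to keep the output short.

```shell
# Work in a temporary directory so the listing is predictable.
# The file names are hypothetical.
workdir=$(mktemp -d)
touch "$workdir/notes.txt" "$workdir/report.txt" \
      "$workdir/txt_backup.log" "$workdir/atxt"

# Unescaped dot: "." matches any character, so "atxt" is also listed.
loose=$(ls "$workdir" | grep '.txt$')

# Escaped dot: only names that really end in ".txt" are listed.
strict=$(ls "$workdir" | grep '\.txt$')

printf 'loose:\n%s\nstrict:\n%s\n' "$loose" "$strict"
rm -r "$workdir"
```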

6. Find content within parentheses

Text files often have content within parentheses. We can extract it using regex with grep.

$ grep "([A-Za-z ]*)"

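The regex matches text made of letters and spaces that is enclosed in parentheses. The length of the content within the parentheses can also be specified.

For example, to match parentheses that enclose exactly 10 characters, use the command below. In basic regex the repetition count is written as \{10\}.

$ grep "([A-Za-z ]\{10\})"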
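
Both forms can be checked on a few made-up lines. This is a sketch with hypothetical sample text; -o is added so that only the parenthesised part is printed.

```shell
# Hypothetical sample text for illustration.
sample='The meeting (postponed) is on Monday.
Bring the form (final copy) with you.
No brackets on this line.'

# In basic regex, ( and ) are literal characters.
any_length=$(printf '%s\n' "$sample" | grep -o "([A-Za-z ]*)")

# \{10\} is a basic-regex interval: exactly 10 letters or spaces inside.
exact=$(printf '%s\n' "$sample" | grep -o "([A-Za-z ]\{10\})")

printf 'any length:\n%s\nexactly 10:\n%s\n' "$any_length" "$exact"
```

Only ‘(final copy)’ has exactly 10 characters between the parentheses, so it is the only match for the second pattern.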

7. Match lines starting with a particular word

We can use regex to find lines that start with a particular word.

To find lines starting with the word Apples, use:

$ grep '^Apples' input.txt

Similarly, lines starting with any other word can also be found.

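We can also match lines that end with a specific word by anchoring the pattern with $. The escaped dot matches the full stop that closes the line; leave it out if the lines end with the word itself.

$ grep 'apples\.$' input.txt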
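
Here is a small sketch of both anchors. The sample lines are a hypothetical stand-in for input.txt, whose real contents are not reproduced here.

```shell
# Hypothetical stand-in for input.txt.
sample='Apples are red.
I like apples.
Oranges and Apples are fruits.
She sells apples at the market.'

# ^ anchors the pattern to the start of the line.
starts=$(printf '%s\n' "$sample" | grep '^Apples')

# $ anchors the pattern to the end of the line; \. is a literal full stop.
ends=$(printf '%s\n' "$sample" | grep 'apples\.$')

printf 'starts with Apples: %s\nends with apples.: %s\n' "$starts" "$ends"
```

Lines that contain the word somewhere in the middle are not matched by either pattern.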

8. Matching multiple words at once

Let’s match multiple words with regex as shown below:

$ grep 'Apples\|Orange' input.txt

This command works like an OR between the two words. It matches lines that contain either of the two words.

To do an AND between the two words use:

$ grep 'Apple' input.txt | grep 'Orange'
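
The OR and AND forms can be compared side by side. This sketch uses made-up sample lines, and it writes the OR with -E so that | needs no backslash.

```shell
# Hypothetical sample text for illustration.
sample='Apples are red.
Orange juice is sweet.
Apples and Orange slices make a good snack.
Bananas are yellow.'

# OR with extended regex; -c counts the matching lines.
either=$(printf '%s\n' "$sample" | grep -cE 'Apples|Orange')

# AND: filter for the first word, then filter that result for the second.
both=$(printf '%s\n' "$sample" | grep 'Apples' | grep 'Orange')

printf 'either word: %s lines\nboth words: %s\n' "$either" "$both"
```

Chaining two grep commands matches the words in any order, which a single pattern such as 'Apples.*Orange' would not.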

9. Matching the same word in different forms

Sometimes a word can occur in different forms. They can differ based on the tense they’re used in.

Peeled and peeling are examples of this. In both words, the root is ‘peel’.

We can use regex to match all forms of a word.

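In our text, peeled and peeling are spelled pealed and pealing respectively, so the pattern is built on ‘peal’. It matches ‘peal’, any lowercase letters that follow, any full stops, and then a whitespace character.

$ grep 'peal\([a-z]*\)\(\.*[[:space:]]\)' input.txt

The same approach can match both the US and UK spellings of a word. For example, the extended regex colou?r matches both color and colour.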
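
As a variation, here is a sketch that lists the word forms themselves. It swaps the trailing-whitespace group for grep’s -w option (whole-word matching), so a word at the very end of a line would also be found. The sample text is made up for illustration.

```shell
# Hypothetical sample text for illustration.
sample='She pealed the orange.
He is pealing an apple.
The bells peal at noon.
My favourite color is not your favourite colour.'

# -o prints each match on its own line; -w keeps only whole-word matches.
forms=$(printf '%s\n' "$sample" | grep -ow 'peal[a-z]*')

# colou?r: under -E, the ? makes the preceding "u" optional.
spellings=$(printf '%s\n' "$sample" | grep -oE 'colou?r')

printf '%s\n' "$forms" "$spellings"
```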

10. Finding users in the /etc/passwd file

grep can be used to look up users in the /etc/passwd file. This file lists the users on the system, one per line, along with additional information such as the user ID, home directory, and login shell.

$ grep "Adam" /etc/passwd 

The command runs grep on a system file and prints every line that contains the word “Adam”. The same search works for any other text in the file, such as a shell or a home directory.
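
A plain word search can match more than the intended user, because the word may appear inside another name or in another field. Here is a sketch that anchors the search to the login-name field instead. It runs on made-up passwd-style lines rather than the real /etc/passwd, so the users shown are hypothetical.

```shell
# Hypothetical passwd-style data; real entries live in /etc/passwd.
# Fields: name:password:UID:GID:comment:home:shell
sample='root:x:0:0:root:/root:/bin/bash
adam:x:1001:1001:Adam Smith:/home/adam:/bin/bash
madam:x:1002:1002:Madam Curie:/home/madam:/bin/sh'

# An unanchored, case-insensitive search also hits "madam".
loose=$(printf '%s\n' "$sample" | grep -ci 'adam')

# Anchoring on the first field matches only the login name "adam".
entry=$(printf '%s\n' "$sample" | grep '^adam:')

# cut extracts the home directory (6th colon-separated field).
home=$(printf '%s\n' "$entry" | cut -d: -f6)

printf 'loose matches: %s\nhome of adam: %s\n' "$loose" "$home"
```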

Conclusion

Regex combined with the grep command can be very powerful. Regular expressions have their roots in formal language theory in computer science and can match highly complex patterns.
