Nearly every programmer or systems administrator will use regular expressions at some point in their career, whether for something as simple as changing a select bit of text in vim or as advanced as validating detailed bits of information. Regular expressions are often thought of as confusing hieroglyphics, but if you use Linux or do any programming, you’ve probably already used them in many of your day-to-day tasks.

For those who want a more detailed look at regular expressions, we’ve recently released Mastering Regular Expressions. To give you a taste of the course without giving anything away, we’re going to learn some basic regex and some basic Perl by building a simple CLI-based regex validator.

The Goal

We want to create a command-line program that asks us for two things: a string and a regular expression. The program then checks whether the string matches the regular expression.

But What Are Regular Expressions?

Regular expressions allow us to match patterns and then use those matches for a number of different tasks depending on the tool. Regular expressions can match with literal characters (e.g., an a matches an a); metacharacters like \w, which matches any word character (a letter, digit, or underscore); or escaped characters like \., which works as a literal period since the period itself is a metacharacter. We can use additional features like classes, groups, lookarounds, and conditionals to refine our pattern further.
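
To make the three kinds of matching concrete, here is a minimal sketch in Perl. The helper name check and the sample strings are made up for illustration; only the regex behavior itself comes from the description above.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: report whether $text matches $pattern.
sub check {
    my ($text, $pattern) = @_;
    return $text =~ $pattern ? "match" : "no match";
}

# Literal character: an a matches an a.
print check("cat", qr/a/), "\n";        # match

# Metacharacter: \w matches a letter, digit, or underscore.
print check("_", qr/\w/), "\n";         # match

# An unescaped period is a wildcard, so it also matches the x here.
print check("3x14", qr/3.14/), "\n";    # match

# An escaped period only matches a literal period.
print check("3x14", qr/3\.14/), "\n";   # no match
```

The last two lines differ only by the backslash, which is the whole point of escaping a metacharacter.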

Making the Validator

Perl and regular expressions play very well together; in fact, PCRE, or Perl-Compatible Regular Expressions, is one of the most common regex flavors you will come across (even though PCRE and Perl’s own regex engine do not behave 100% the same). As such, learning regex with Perl is a great idea, since there’s not much that Perl can’t do with regular expressions. While programs like sed or grep might limit your ability to use features like conditionals, Perl supports almost all regex features.
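
As a small illustration of a feature Perl supports, here is a sketch of a conditional pattern. The scenario (a three-digit code that must be either bare or fully parenthesized) and the helper name balanced_code are invented for this example.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: accept "555" or "(555)", but not "(555" or "555)".
# (\()?      optionally capture an opening parenthesis as group 1
# (?(1)\))   conditional: only if group 1 matched, require a closing parenthesis
sub balanced_code {
    my ($text) = @_;
    return $text =~ /^(\()?\d{3}(?(1)\))$/ ? 1 : 0;
}

for my $candidate ("555", "(555)", "(555", "555)") {
    printf "%-6s %s\n", $candidate,
        balanced_code($candidate) ? "Match!" : "Not a match!";
}
```

The first two candidates should match and the last two should not, because the closing parenthesis is only allowed when the opening one was captured.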

So let’s get started crafting this validator! Open whichever text editor you prefer to a blank document; I used vim and named my file regex-val.pl.

  1. Add the hashbang:
     #! /usr/local/bin/perl
    

    If you aren’t sure what you should put here, run which perl on the command line.

  2. We now want to set some variables that should be fed in via STDIN when the script is run. I’m going to call these text for the text we want to validate, and regex for the expression we’re validating against:
     $text = <STDIN>;
     $regex = <STDIN>;
    
  3. We also want to add some prompts so that when we run the script, we know what we’re inputting:
     print "Enter a string: ";
     $text = <STDIN>;
    
     print "Enter a regular expression: ";
     $regex = <STDIN>;
    
  4. Next, we want to check if our text matches our regex, so we need to craft an if statement. To denote that something in Perl is a regular expression, we need to encase it in forward slashes:
     if ( $text =~ /$regex/ ) {
     }
    

    This expression is simple enough: We’re saying that if our text matches (=~) our regex, we should run the code in the curly brackets. We haven’t written that yet, but since all we want to do is validate that two things match, we can just add a simple print command:

     if ( $text =~ /$regex/ ) {
       print "Match!\n";
     }
    

    The =~ operator is Perl’s binding operator: it tests a scalar, like our text, against a pattern.

  5. Of course, we also want to output a response if something isn’t a match, so we can just use an else statement:
     if ( $text =~ /$regex/ ) {
       print "Match!\n";
     } else {
       print "Not a match!\n";
     }
    

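    This leaves us with the following as the entire script. One addition: <STDIN> returns each line with its trailing newline attached, so we call chomp on both variables. Without it, the newline becomes part of the pattern, and a regex like abc would only match when abc sits at the very end of the text:

     #! /usr/local/bin/perl
    
     print "Enter a string: ";
     $text = <STDIN>;
     chomp $text;
    
     print "Enter a regular expression: ";
     $regex = <STDIN>;
     chomp $regex;
    
     if ( $text =~ /$regex/ ) {
       print "Match!\n";
     } else {
       print "Not a match!\n";
     }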
    
  6. Save the file and make it executable:
     $ chmod +x regex-val.pl
    
  7. Test it out by checking an IP address against the regular expression for an IP address:
     $ ./regex-val.pl
     Enter a string: 192.54.13.122
     Enter a regular expression: \d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}
     Match!
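
The test above matches, but two things are worth knowing. The pattern is unanchored, so it also accepts longer strings that merely contain something IP-shaped, and a malformed pattern makes Perl die with an error instead of printing a result. The following is a minimal sketch of one way to handle both, not a replacement for the script. The helper name validate and the "Invalid regex!" message are assumptions made for this example, and the inputs are hard-coded with trailing newlines to imitate what <STDIN> would return.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper, not part of the script above.
# Returns "Match!", "Not a match!", or "Invalid regex!" for a text/pattern pair.
sub validate {
    my ($text, $regex) = @_;
    chomp($text, $regex);                # strip trailing newlines, as left by <STDIN>
    my $compiled = eval { qr/$regex/ };  # compile inside eval so a bad pattern cannot kill the script
    return "Invalid regex!" unless defined $compiled;
    return $text =~ $compiled ? "Match!" : "Not a match!";
}

my $ip = '\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}';

print validate("192.54.13.122\n",     "$ip\n"),     "\n";  # Match!
print validate("x192.54.13.122.99\n", "$ip\n"),     "\n";  # Match! (unanchored)
print validate("x192.54.13.122.99\n", "^$ip\$\n"),  "\n";  # Not a match!
print validate("192.54.13.122\n",     "(\n"),       "\n";  # Invalid regex!
```

Wrapping the pattern in ^ and $ forces it to cover the whole string. Note that \d{1,3} still accepts values such as 999, so this checks the shape of an address rather than its validity.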
    

Basic Regular Expressions Cheat Sheet

Want to test some regular expressions without taking a whole course? Here’s a table of metacharacters and features. Try some out!

 

Expression   Meaning
\w           Match any word character: A-Z, a-z, 0-9, or underscore
\W           Match any non-word character
\d           Match any digit, 0-9
\D           Match any non-digit
\s           Match any whitespace character
\S           Match any non-whitespace character
\t           Match a tab
\n           Match a newline
^            Match the start of the string (or of each line with the /m modifier)
$            Match the end of the string (or of each line with the /m modifier)
\b           Match a word boundary
[ ... ]      Set a character class; example: [abC] can match either a, b, or C
[^ ... ]     A negated character class; match any character except those in the class
( ... )      Group characters together
|            Alternation; acts as an or between alternatives, typically inside a group
?            Make the previous character or group optional
+            Repeat the previous character or group one or more times
*            Repeat the previous character or group zero or more times
.            Wildcard; match any single character except a newline
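
To try a few of these without typing them into the validator one at a time, here is a small sketch that lists every match of a pattern in a string. The helper name all_matches and the sample strings are made up for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: return every substring of $text that matches $pattern.
sub all_matches {
    my ($text, $pattern) = @_;
    my @found;
    while ($text =~ /$pattern/g) {
        push @found, $&;    # $& holds the text of the most recent match
    }
    return @found;
}

print join(",", all_matches("a1 b22 c333", '\d+')), "\n";        # 1,22,333
print join(",", all_matches("snake_case Title", '\w+')), "\n";   # snake_case,Title
print join(",", all_matches("color colour", 'colou?r')), "\n";   # color,colour
print join(",", all_matches("cat bat rat", '[cb]at')), "\n";     # cat,bat
print join(",", all_matches("cat bat rat", '[^cb]at')), "\n";    # rat
```

The patterns are written in single quotes so that Perl passes the backslashes through to the regex engine untouched.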

Ready to Learn More?

Then check out our new course on Mastering Regular Expressions and get started today!
