<!DOCTYPE html> <html lang="en"><head><meta charset="utf-8"><link rel=canonical href='https://www.regular-expressions.info/tutorial.html'><title>Regular Expression Tutorial - Learn How to Use Regular Expressions</title> <meta name="viewport" content="width=device-width, initial-scale=1"> <meta name="author" content="Jan Goyvaerts"> <meta name="description" content="This tutorial teaches you how to create your own regular expressions, starting with the most basic regex concepts and ending with the most advanced and specialized capabilities."> <meta name="keywords" content="regex, regular expression, regular expressions, tutorial, learn, study, regexp, regexes"> <link rel=stylesheet href="regex.css" type="text/css"><script src="theme.js" type="text/javascript"></script><link rel="alternate" type="application/rss+xml" title="New at Regular-Expressions.info" href="updates.xml"> </head> <body bgcolor=white text=black> <div id=top></div> <div id=btntop><div id=btngrid><a href="quickstart.html" target="_top"><div>Quick Start</div></a><a href="tutorial.html" target="_top"><div>Tutorial</div></a><a href="tools.html" target="_top"><div>Tools & Languages</div></a><a href="examples.html" target="_top"><div>Examples</div></a><a href="refflavors.html" target="_top"><div>Reference</div></a><a href="books.html" target="_top"><div>Book Reviews</div></a></div></div> <div id=contents><div id=side> <TABLE CLASS=side CELLSPACING=0 CELLPADDING=4><TR><TD CLASS=sideheader>Regex Tutorial</TD></TR><TR><TD><A HREF="tutorial.html" TARGET=_top>Introduction</A></TD></TR><TR><TD><A HREF="tutorialcnt.html" TARGET=_top>Table of Contents</A></TD></TR><TR><TD><A HREF="characters.html" TARGET=_top>Special Characters</A></TD></TR><TR><TD><A HREF="nonprint.html" TARGET=_top>Non-Printable Characters</A></TD></TR><TR><TD><A HREF="engine.html" TARGET=_top>Regex Engine Internals</A></TD></TR><TR><TD><A HREF="charclass.html" TARGET=_top>Character Classes</A></TD></TR><TR><TD><A 
HREF="charclasssubtract.html" TARGET=_top>Character Class Subtraction</A></TD></TR><TR><TD><A HREF="charclassintersect.html" TARGET=_top>Character Class Intersection</A></TD></TR><TR><TD><A HREF="shorthand.html" TARGET=_top>Shorthand Character Classes</A></TD></TR><TR><TD><A HREF="dot.html" TARGET=_top>Dot</A></TD></TR><TR><TD><A HREF="anchors.html" TARGET=_top>Anchors</A></TD></TR><TR><TD><A HREF="wordboundaries.html" TARGET=_top>Word Boundaries</A></TD></TR><TR><TD><A HREF="alternation.html" TARGET=_top>Alternation</A></TD></TR><TR><TD><A HREF="optional.html" TARGET=_top>Optional Items</A></TD></TR><TR><TD><A HREF="repeat.html" TARGET=_top>Repetition</A></TD></TR><TR><TD><A HREF="brackets.html" TARGET=_top>Grouping & Capturing</A></TD></TR><TR><TD><A HREF="backref.html" TARGET=_top>Backreferences</A></TD></TR><TR><TD><A HREF="backref2.html" TARGET=_top>Backreferences, part 2</A></TD></TR><TR><TD><A HREF="named.html" TARGET=_top>Named Groups</A></TD></TR><TR><TD><A HREF="backrefrel.html" TARGET=_top>Relative Backreferences</A></TD></TR><TR><TD><A HREF="branchreset.html" TARGET=_top>Branch Reset Groups</A></TD></TR><TR><TD><A HREF="freespacing.html" TARGET=_top>Free-Spacing & Comments</A></TD></TR><TR><TD><A HREF="unicode.html" TARGET=_top>Unicode</A></TD></TR><TR><TD><A HREF="modifiers.html" TARGET=_top>Mode Modifiers</A></TD></TR><TR><TD><A HREF="atomic.html" TARGET=_top>Atomic Grouping</A></TD></TR><TR><TD><A HREF="possessive.html" TARGET=_top>Possessive Quantifiers</A></TD></TR><TR><TD><A HREF="lookaround.html" TARGET=_top>Lookahead & Lookbehind</A></TD></TR><TR><TD><A HREF="lookaround2.html" TARGET=_top>Lookaround, part 2</A></TD></TR><TR><TD><A HREF="keep.html" TARGET=_top>Keep Text out of The Match</A></TD></TR><TR><TD><A HREF="conditional.html" TARGET=_top>Conditionals</A></TD></TR><TR><TD><A HREF="balancing.html" TARGET=_top>Balancing Groups</A></TD></TR><TR><TD><A HREF="recurse.html" TARGET=_top>Recursion</A></TD></TR><TR><TD><A HREF="subroutine.html" 
TARGET=_top>Subroutines</A></TD></TR><TR><TD><A HREF="recurseinfinite.html" TARGET=_top>Infinite Recursion</A></TD></TR><TR><TD><A HREF="recurserepeat.html" TARGET=_top>Recursion & Quantifiers</A></TD></TR><TR><TD><A HREF="recursecapture.html" TARGET=_top>Recursion & Capturing</A></TD></TR><TR><TD><A HREF="recursebackref.html" TARGET=_top>Recursion & Backreferences</A></TD></TR><TR><TD><A HREF="recursebacktrack.html" TARGET=_top>Recursion & Backtracking</A></TD></TR><TR><TD><A HREF="posixbrackets.html" TARGET=_top>POSIX Bracket Expressions</A></TD></TR><TR><TD><A HREF="zerolength.html" TARGET=_top>Zero-Length Matches</A></TD></TR><TR><TD><A HREF="continue.html" TARGET=_top>Continuing Matches</A></TD></TR> </TABLE><TABLE CLASS=side CELLSPACING=0 CELLPADDING=4><TR><TD CLASS=sideheader>More on This Site</TD></TR><TR><TD><A HREF="index.html" TARGET=_top>Introduction</A></TD></TR><TR><TD><A HREF="quickstart.html" TARGET=_top>Regular Expressions Quick Start</A></TD></TR><TR><TD><A HREF="tutorial.html" TARGET=_top>Regular Expressions Tutorial</A></TD></TR><TR><TD><A HREF="replacetutorial.html" TARGET=_top>Replacement Strings Tutorial</A></TD></TR><TR><TD><A HREF="tools.html" TARGET=_top>Applications and Languages</A></TD></TR><TR><TD><A HREF="examples.html" TARGET=_top>Regular Expressions Examples</A></TD></TR><TR><TD><A HREF="refflavors.html" TARGET=_top>Regular Expressions Reference</A></TD></TR><TR><TD><A HREF="refreplace.html" TARGET=_top>Replacement Strings Reference</A></TD></TR><TR><TD><A HREF="books.html" TARGET=_top>Book Reviews</A></TD></TR><TR><TD><A HREF="print.html" TARGET=_top>Printable PDF</A></TD></TR><TR><TD><A HREF="about.html" TARGET=_top>About This Site</A></TD></TR><TR><TD><A HREF="updates.html" TARGET=_top>RSS Feed & Blog</A></TD></TR></TABLE></DIV><div class=bodytext> <div class=bulb><h1>Regular Expressions Tutorial<br>Learn How to Use and Get The Most out of Regular Expressions</h1><script type="text/javascript">showbulb();</script></div> <p>This tutorial teaches you all you need to know to be able to craft powerful time-saving regular expressions. It starts with the most basic concepts, so that you can follow this tutorial even if you know nothing at all about regular expressions yet.</p> <p>The tutorial doesn’t stop there. It also explains how a regular expression engine works on the inside and alerts you to the consequences. 
This helps you quickly understand why a particular regex does not do what you initially expected. It saves you a lot of guesswork and head-scratching when you need to write more complex regexes.</p> <h2>What Regular Expressions Are Exactly - Terminology</h2> <p>A regular expression is a pattern that describes a certain amount of text. The name comes from the mathematical theory on which regular expressions are based, but this tutorial does not dig into that. You will usually find the name abbreviated to "regex" or "regexp". This tutorial uses "regex", because it is easy to pronounce the plural "regexes". On this website, regular expressions are shaded gray as <TT CLASS=syntax><SPAN CLASS="regexplain">regex</SPAN></TT>.</p> <p>This first example is actually a perfectly valid regex. It is the most basic pattern, simply matching the literal text <tt class=match>regex</tt>. A “match” is the piece of text, or sequence of bytes or characters, that the regex processing software found the pattern to correspond to. Matches are highlighted in blue on this site.</p> <p><TT CLASS=syntax><SPAN CLASS="regexspecial">\b</SPAN><SPAN CLASS="regexccopen">[</SPAN><SPAN CLASS="regexccrange">A-Z</SPAN><SPAN CLASS="regexccrange">0-9</SPAN><SPAN CLASS="regexccliteral">._%+</SPAN><SPAN CLASS="regexccliteral">-</SPAN><SPAN CLASS="regexccopen">]</SPAN><SPAN CLASS="regexspecial">+</SPAN><SPAN CLASS="regexplain">@</SPAN><SPAN CLASS="regexccopen">[</SPAN><SPAN CLASS="regexccrange">A-Z</SPAN><SPAN CLASS="regexccrange">0-9</SPAN><SPAN CLASS="regexccliteral">.</SPAN><SPAN CLASS="regexccliteral">-</SPAN><SPAN CLASS="regexccopen">]</SPAN><SPAN CLASS="regexspecial">+</SPAN><SPAN CLASS="regexescaped">\.</SPAN><SPAN CLASS="regexccopen">[</SPAN><SPAN CLASS="regexccrange">A-Z</SPAN><SPAN CLASS="regexccopen">]</SPAN><SPAN CLASS="regexspecial">{2,}</SPAN><SPAN CLASS="regexspecial">\b</SPAN></TT> is a more complex pattern. 
It describes a series of letters, digits, dots, underscores, percentage signs, plus signs and hyphens, followed by an at sign, followed by another series of letters, digits, dots and hyphens, finally followed by a single dot and two or more letters. In other words: this pattern describes an <A HREF="email.html" TARGET="_top">email address</A>. This also shows the syntax highlighting applied to regular expressions on this site. Word boundaries and quantifiers are blue, character classes are orange, and escaped literals are gray. You’ll see additional colors like green for grouping and purple for meta tokens later in the tutorial.</p> <p>With the above regular expression pattern, you can search through a text file to find email addresses, or verify whether a given string looks like an email address. This tutorial uses the term “string” to indicate the text that the regular expression is applied to. This website highlights strings in <tt class=string>green</tt>. Programmers use the term “string” or “character string” to indicate a sequence of characters. In practice, you can use regular expressions with whatever data you can access using the application or programming language you are working with.</p> <a name="engine"></a><h2>Different Regular Expression Engines</h2> <p>A regular expression “engine” is a piece of software that can process regular expressions, trying to match the pattern to the given string. Usually, the engine is part of a larger application and you do not access the engine directly. Rather, the application invokes it for you when needed, making sure the right regular expression is applied to the right file or data.</p> <p>As usual in the software world, different regular expression engines are not fully compatible with each other. The syntax and behavior of a particular engine are called its regular expression flavor. 
This tutorial covers all the popular regular expression flavors, including <A HREF="perl.html" TARGET="_top">Perl</A>, <A HREF="pcre.html" TARGET="_top">PCRE</A>, <A HREF="php.html" TARGET="_top">PHP</A>, <A HREF="dotnet.html" TARGET="_top">.NET</A>, <A HREF="java.html" TARGET="_top">Java</A>, <A HREF="javascript.html" TARGET="_top">JavaScript</A>, <A HREF="xregexp.html" TARGET="_top">XRegExp</A>, <A HREF="vbscript.html" TARGET="_top">VBScript</A>, <A HREF="python.html" TARGET="_top">Python</A>, <A HREF="ruby.html" TARGET="_top">Ruby</A>, <A HREF="delphi.html" TARGET="_top">Delphi</A>, <A HREF="rlanguage.html" TARGET="_top">R</A>, <A HREF="tcl.html" TARGET="_top">Tcl</A>, <A HREF="posix.html" TARGET="_top">POSIX</A>, and <A HREF="tools.html" TARGET="_top">many others</A>. The tutorial alerts you when these flavors require different syntax or show different behavior. Even if your application is not explicitly covered by the tutorial, it likely uses a regex flavor that is covered, as most applications are developed using one of the programming environments or regex libraries just mentioned.</p> <h2>Give Regexes a First Try</h2> <p>You can easily try the following yourself in a text editor that supports regular expressions, such as <A HREF="editpadpro.html" TARGET="_top">EditPad Pro</A>. If you do not have such an editor, you can <A HREF="https://www.editpadpro.com/download.html" TARGET="_top">download the free evaluation version of EditPad Pro</A> to try this out. 
EditPad Pro’s regex engine is fully functional in the demo version.</p> <p><img class="light screen" src="screens/eppeditpadpro.png" srcset="screens/eppeditpadpro.png 1x, screens125/eppeditpadpro.png 1.25x, screens150/eppeditpadpro.png 1.5x, screens175/eppeditpadpro.png 1.75x, screens200/eppeditpadpro.png 2x" alt="Searching Using Regular Expressions with EditPad Pro"><img class="dark screen" src="screensdark/eppeditpadpro.png" srcset="screensdark/eppeditpadpro.png 1x, screensdark125/eppeditpadpro.png 1.25x, screensdark150/eppeditpadpro.png 1.5x, screensdark175/eppeditpadpro.png 1.75x, screensdark200/eppeditpadpro.png 2x" alt="Searching Using Regular Expressions with EditPad Pro"></p> <p>As a quick test, copy and paste the text of this page into EditPad Pro. Then select Search|Multiline Search Panel in the menu. In the search panel that appears near the bottom, type in <TT CLASS=syntax><SPAN CLASS="regexplain">regex</SPAN></TT> in the box labeled “Search Text”. Mark the “Regular expression” checkbox, and click the Find First button. This is the leftmost button on the search panel. See how EditPad Pro’s regex engine finds the first match. Click the Find Next button, which sits next to the Find First button, to find further matches. When there are no further matches, the Find Next button’s icon flashes briefly.</p> <p>Now try to search using the regex <TT CLASS=syntax><SPAN CLASS="regexplain">reg</SPAN><SPAN CLASS="regexnest1">(</SPAN><SPAN CLASS="regexplain">ular expression</SPAN><SPAN CLASS="regexplain">s</SPAN><SPAN CLASS="regexspecial">?</SPAN><SPAN CLASS="regexnest1">|</SPAN><SPAN CLASS="regexplain">ex</SPAN><SPAN CLASS="regexnest2">(</SPAN><SPAN CLASS="regexplain">p</SPAN><SPAN CLASS="regexnest2">|</SPAN><SPAN CLASS="regexplain">es</SPAN><SPAN CLASS="regexnest2">)</SPAN><SPAN CLASS="regexspecial">?</SPAN><SPAN CLASS="regexnest1">)</SPAN></TT>. This regex finds all names, singular and plural, I have used on this page to say “regex”. 
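</p> <p>For programmers, the same search can be sketched in a few lines of code. The snippet below is only an illustration, assuming Python and its standard <tt>re</tt> module as the regex flavor. The sample text and the variable names are made up for this example and are not taken from this page.</p>

```python
import re

# The pattern from the paragraph above: "reg" followed by either
# "ular expression" with an optional "s", or "ex" with an optional "p" or "es".
PATTERN = re.compile(r"reg(ular expressions?|ex(p|es)?)")

# Hypothetical sample text, invented for this illustration.
SAMPLE = (
    "A regular expression is often called a regex or a regexp. "
    "Most regexes are short, and regular expressions save time."
)

# finditer yields one match object per non-overlapping match, left to right;
# group(0) is the full text of each match.
matches = [m.group(0) for m in PATTERN.finditer(SAMPLE)]
print(matches)
```

<p>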
If we only had plain text search, we would have needed 5 searches. With regexes, we need just one search. Regexes save you time when using a tool like EditPad Pro. Select Count Matches in the Search menu to see how many times this regular expression can match the file you have open in EditPad Pro.</p> <p>If you are a programmer, your software will run faster since even a simple regex engine applying the above regex once will outperform a state of the art plain text search algorithm searching through the data five times. Regular expressions also reduce development time. With a regex engine, it takes only one line (e.g. in Perl, PHP, Python, Ruby, Java, or .NET) or a couple of lines (e.g. in C using PCRE) of code to, say, check if the user’s input looks like a <A HREF="email.html" TARGET="_top">valid email address</A>.</p> <p><A HREF="tutorialcnt.html" TARGET="_top">Regex Tutorial Table of Contents</A></p> <div id=copyright> <P CLASS=copyright>Page URL: <A HREF="https://www.regular-expressions.info/tutorial.html" TARGET="_top">https://www.regular-expressions.info/tutorial.html</A><BR> Page last updated: 19 August 2021<BR> Site last updated: 06 November 2024<BR> Copyright © 2003-2024 Jan Goyvaerts. All rights reserved.</P> </div> </div> </div> </body></html>