

<!DOCTYPE html> <html lang="en"><head><meta charset="utf-8"><link rel=canonical href='https://www.regular-expressions.info/optional.html'><title>Regex Tutorial - The Question Mark Makes the Preceding Token Optional</title> <meta name="viewport" content="width=device-width, initial-scale=1"> <meta name="author" content="Jan Goyvaerts"> <meta name="description" content="In a regular expression, the question mark makes the preceding token optional."> <meta name="keywords" content=""> <link rel=stylesheet href="regex.css" type="text/css"><script src="theme.js" type="text/javascript"></script><link rel="alternate" type="application/rss+xml" title="New at Regular-Expressions.info" href="updates.xml"> </head> <body bgcolor=white text=black> <div id=top></div> <div id=btntop><div id=btngrid><a href="quickstart.html" target="_top"><div>Quick&nbsp;Start</div></a><a href="tutorial.html" target="_top"><div>Tutorial</div></a><a href="tools.html" target="_top"><div>Tools&nbsp;&amp;&nbsp;Languages</div></a><a href="examples.html" target="_top"><div>Examples</div></a><a href="refflavors.html" target="_top"><div>Reference</div></a><a href="books.html" target="_top"><div>Book&nbsp;Reviews</div></a></div></div> <div id=contents><div id=side> <TABLE CLASS=side CELLSPACING=0 CELLPADDING=4><TR><TD CLASS=sideheader>Regex Tutorial</TD></TR><TR><TD><A HREF="tutorial.html" TARGET=_top>Introduction</A></TD></TR><TR><TD><A HREF="tutorialcnt.html" TARGET=_top>Table of Contents</A></TD></TR><TR><TD><A HREF="characters.html" TARGET=_top>Special Characters</A></TD></TR><TR><TD><A HREF="nonprint.html" TARGET=_top>Non-Printable Characters</A></TD></TR><TR><TD><A HREF="engine.html" TARGET=_top>Regex Engine Internals</A></TD></TR><TR><TD><A HREF="charclass.html" TARGET=_top>Character Classes</A></TD></TR><TR><TD><A HREF="charclasssubtract.html" TARGET=_top>Character Class Subtraction</A></TD></TR><TR><TD><A HREF="charclassintersect.html" TARGET=_top>Character Class 
Intersection</A></TD></TR><TR><TD><A HREF="shorthand.html" TARGET=_top>Shorthand Character Classes</A></TD></TR><TR><TD><A HREF="dot.html" TARGET=_top>Dot</A></TD></TR><TR><TD><A HREF="anchors.html" TARGET=_top>Anchors</A></TD></TR><TR><TD><A HREF="wordboundaries.html" TARGET=_top>Word Boundaries</A></TD></TR><TR><TD><A HREF="alternation.html" TARGET=_top>Alternation</A></TD></TR><TR><TD><A HREF="optional.html" TARGET=_top>Optional Items</A></TD></TR><TR><TD><A HREF="repeat.html" TARGET=_top>Repetition</A></TD></TR><TR><TD><A HREF="brackets.html" TARGET=_top>Grouping &amp; Capturing</A></TD></TR><TR><TD><A HREF="backref.html" TARGET=_top>Backreferences</A></TD></TR><TR><TD><A HREF="backref2.html" TARGET=_top>Backreferences, part 2</A></TD></TR><TR><TD><A HREF="named.html" TARGET=_top>Named Groups</A></TD></TR><TR><TD><A HREF="backrefrel.html" TARGET=_top>Relative Backreferences</A></TD></TR><TR><TD><A HREF="branchreset.html" TARGET=_top>Branch Reset Groups</A></TD></TR><TR><TD><A HREF="freespacing.html" TARGET=_top>Free-Spacing &amp; Comments</A></TD></TR><TR><TD><A HREF="unicode.html" TARGET=_top>Unicode</A></TD></TR><TR><TD><A HREF="modifiers.html" TARGET=_top>Mode Modifiers</A></TD></TR><TR><TD><A HREF="atomic.html" TARGET=_top>Atomic Grouping</A></TD></TR><TR><TD><A HREF="possessive.html" TARGET=_top>Possessive Quantifiers</A></TD></TR><TR><TD><A HREF="lookaround.html" TARGET=_top>Lookahead &amp; Lookbehind</A></TD></TR><TR><TD><A HREF="lookaround2.html" TARGET=_top>Lookaround, part 2</A></TD></TR><TR><TD><A HREF="keep.html" TARGET=_top>Keep Text out of The Match</A></TD></TR><TR><TD><A HREF="conditional.html" TARGET=_top>Conditionals</A></TD></TR><TR><TD><A HREF="balancing.html" TARGET=_top>Balancing Groups</A></TD></TR><TR><TD><A HREF="recurse.html" TARGET=_top>Recursion</A></TD></TR><TR><TD><A HREF="subroutine.html" TARGET=_top>Subroutines</A></TD></TR><TR><TD><A HREF="recurseinfinite.html" TARGET=_top>Infinite Recursion</A></TD></TR><TR><TD><A 
HREF="recurserepeat.html" TARGET=_top>Recursion &amp; Quantifiers</A></TD></TR><TR><TD><A HREF="recursecapture.html" TARGET=_top>Recursion &amp; Capturing</A></TD></TR><TR><TD><A HREF="recursebackref.html" TARGET=_top>Recursion &amp; Backreferences</A></TD></TR><TR><TD><A HREF="recursebacktrack.html" TARGET=_top>Recursion &amp; Backtracking</A></TD></TR><TR><TD><A HREF="posixbrackets.html" TARGET=_top>POSIX Bracket Expressions</A></TD></TR><TR><TD><A HREF="zerolength.html" TARGET=_top>Zero-Length Matches</A></TD></TR><TR><TD><A HREF="continue.html" TARGET=_top>Continuing Matches</A></TD></TR> </TABLE><TABLE CLASS=side CELLSPACING=0 CELLPADDING=4><TR><TD CLASS=sideheader>More on This Site</TD></TR><TR><TD><A HREF="index.html" TARGET=_top>Introduction</A></TD></TR><TR><TD><A HREF="quickstart.html" TARGET=_top>Regular Expressions Quick Start</A></TD></TR><TR><TD><A HREF="tutorial.html" TARGET=_top>Regular Expressions Tutorial</A></TD></TR><TR><TD><A HREF="replacetutorial.html" TARGET=_top>Replacement Strings Tutorial</A></TD></TR><TR><TD><A HREF="tools.html" TARGET=_top>Applications and Languages</A></TD></TR><TR><TD><A HREF="examples.html" TARGET=_top>Regular Expressions Examples</A></TD></TR><TR><TD><A HREF="refflavors.html" TARGET=_top>Regular Expressions Reference</A></TD></TR><TR><TD><A HREF="refreplace.html" TARGET=_top>Replacement Strings Reference</A></TD></TR><TR><TD><A HREF="books.html" TARGET=_top>Book Reviews</A></TD></TR><TR><TD><A HREF="print.html" TARGET=_top>Printable PDF</A></TD></TR><TR><TD><A HREF="about.html" TARGET=_top>About This Site</A></TD></TR><TR><TD><A HREF="updates.html" TARGET=_top>RSS Feed &amp; Blog</A></TD></TR></TABLE></DIV><div class=bodytext><div class=topad style="height:130px"><A HREF="https://www.regexbuddy.com/create.html" TARGET="_top"><picture><source media="(max-width: 370px)" srcset="ads/320/rxbtutorial100.png 1x, ads/320/rxbtutorial150.png 1.5x, ads/320/rxbtutorial200.png 2x, ads/320/rxbtutorial250.png 2.5x, 
ads/320/rxbtutorial300.png 3x, ads/320/rxbtutorial350.png 3.5x, ads/320/rxbtutorial400.png 4x"><source media="(max-width: 500px)" srcset="ads/360/rxbtutorial100.png 1x, ads/360/rxbtutorial150.png 1.5x, ads/360/rxbtutorial200.png 2x, ads/360/rxbtutorial250.png 2.5x, ads/360/rxbtutorial300.png 3x, ads/360/rxbtutorial350.png 3.5x, ads/360/rxbtutorial400.png 4x"><source media="(max-width: 660px)" srcset="ads/480/rxbtutorial100.png 1x, ads/480/rxbtutorial150.png 1.5x, ads/480/rxbtutorial200.png 2x, ads/480/rxbtutorial250.png 2.5x, ads/480/rxbtutorial300.png 3x, ads/480/rxbtutorial350.png 3.5x, ads/480/rxbtutorial400.png 4x"><source media="(max-width: 747px)" srcset="ads/640/rxbtutorial100.png 1x, ads/640/rxbtutorial150.png 1.5x, ads/640/rxbtutorial200.png 2x, ads/640/rxbtutorial250.png 2.5x, ads/640/rxbtutorial300.png 3x, ads/640/rxbtutorial350.png 3.5x, ads/640/rxbtutorial400.png 4x"><img src="ads/728/rxbtutorial100.png" srcset="ads/728/rxbtutorial100.png 1x, ads/728/rxbtutorial125.png 1.25x, ads/728/rxbtutorial150.png 1.5x, ads/728/rxbtutorial175.png 1.75x, ads/728/rxbtutorial200.png 2x, ads/728/rxbtutorial250.png 2.5x, ads/728/rxbtutorial300.png 3x, ads/728/rxbtutorial350.png 3.5x, ads/728/rxbtutorial400.png 4x" alt="RegexBuddy&mdash;Better than a regular expression tutorial!"></picture></A></div> <div class=bulb><h1>Optional Items</h1><script type="text/javascript">showbulb();</script></div> <p>The question mark makes the preceding token in the regular expression optional. <TT CLASS=syntax><SPAN CLASS="regexplain">colo</SPAN><SPAN CLASS="regexplain">u</SPAN><SPAN CLASS="regexspecial">?</SPAN><SPAN CLASS="regexplain">r</SPAN></TT> matches both <tt class=match>colour</tt> and <tt class=match>color</tt>. The question mark is called a quantifier.</p> <p>You can make several tokens optional by grouping them together using parentheses, and placing the question mark after the closing parenthesis. 
E.g.:&nbsp;<TT CLASS=syntax><SPAN CLASS="regexplain">Nov</SPAN><SPAN CLASS="regexnest1">(</SPAN><SPAN CLASS="regexplain">ember</SPAN><SPAN CLASS="regexnest1">)</SPAN><SPAN CLASS="regexspecial">?</SPAN></TT> matches <tt class=match>Nov</tt> and <tt class=match>November</tt>.</p> <p>You can write a regular expression that matches many alternatives by including more than one question mark. <TT CLASS=syntax><SPAN CLASS="regexplain">Feb</SPAN><SPAN CLASS="regexnest1">(</SPAN><SPAN CLASS="regexplain">ruary</SPAN><SPAN CLASS="regexnest1">)</SPAN><SPAN CLASS="regexspecial">?</SPAN><SPAN CLASS="regexplain">&nbsp;23</SPAN><SPAN CLASS="regexnest1">(</SPAN><SPAN CLASS="regexplain">rd</SPAN><SPAN CLASS="regexnest1">)</SPAN><SPAN CLASS="regexspecial">?</SPAN></TT> matches <tt class=match>February 23rd</tt>, <tt class=match>February 23</tt>, <tt class=match>Feb 23rd</tt> and <tt class=match>Feb 23</tt>.</p> <p>You can also use curly braces to make something optional. <TT CLASS=syntax><SPAN CLASS="regexplain">colo</SPAN><SPAN CLASS="regexplain">u</SPAN><SPAN CLASS="regexspecial">{0,1}</SPAN><SPAN CLASS="regexplain">r</SPAN></TT> is the same as <TT CLASS=syntax><SPAN CLASS="regexplain">colo</SPAN><SPAN CLASS="regexplain">u</SPAN><SPAN CLASS="regexspecial">?</SPAN><SPAN CLASS="regexplain">r</SPAN></TT>. <A HREF="posix.html" TARGET="_top">POSIX BRE</A> and <A HREF="posix.html" TARGET="_top">GNU BRE</A> do not support either syntax. These flavors require backslashes to <em>give</em> curly braces their special meaning: <TT CLASS=syntax><SPAN CLASS="regexplain">colo</SPAN><SPAN CLASS="regexplain">u</SPAN><SPAN CLASS="regexspecial">\{0,1\}</SPAN><SPAN CLASS="regexplain">r</SPAN></TT>.</p> <a name="greedy"></a><h2>Important Regex Concept: Greediness</h2> <p>The question mark is the first metacharacter introduced by this tutorial that is <i>greedy</i>. The question mark gives the regex engine two choices: try to match the part the question mark applies to, or do not try to match it. 
The engine always tries to match that part first. Only if this causes the entire regular expression to fail will the engine try ignoring the part the question mark applies to.</p> <p>The effect is that if you apply the regex <TT CLASS=syntax><SPAN CLASS="regexplain">Feb&nbsp;23</SPAN><SPAN CLASS="regexnest1">(</SPAN><SPAN CLASS="regexplain">rd</SPAN><SPAN CLASS="regexnest1">)</SPAN><SPAN CLASS="regexspecial">?</SPAN></TT> to the string <tt class=string>Today is Feb 23rd, 2003</tt>, the match is always <tt class=match>Feb 23rd</tt> and not <tt class=match>Feb 23</tt>. You can make the question mark <i>lazy</i> (i.e. turn off the greediness) by putting a second question mark after the first.</p> <p>The discussion about the other <A HREF="repeat.html" TARGET="_top">repetition</A> operators has more details on greedy and lazy quantifiers.</p> <h2>Looking Inside The Regex Engine</h2> <p>Let&rsquo;s apply the regular expression <TT CLASS=syntax><SPAN CLASS="regexplain">colo</SPAN><SPAN CLASS="regexplain">u</SPAN><SPAN CLASS="regexspecial">?</SPAN><SPAN CLASS="regexplain">r</SPAN></TT> to the string <tt class=string>The colonel likes the color green</tt>.</p> <p>The first token in the regex is the <A HREF="characters.html" TARGET="_top">literal</A> <TT CLASS=syntax><SPAN CLASS="regexplain">c</SPAN></TT>. The first position where it matches successfully is the <tt class=match>c</tt> in <tt class=string>colonel</tt>. The engine continues, and finds that <TT CLASS=syntax><SPAN CLASS="regexplain">o</SPAN></TT> matches <tt class=match>o</tt>, <TT CLASS=syntax><SPAN CLASS="regexplain">l</SPAN></TT> matches <tt class=match>l</tt> and another <TT CLASS=syntax><SPAN CLASS="regexplain">o</SPAN></TT> matches <tt class=match>o</tt>. Then the engine checks whether <TT CLASS=syntax><SPAN CLASS="regexplain">u</SPAN></TT> matches <tt class=string>n</tt>. This fails. 
However, the question mark tells the regex engine that failing to match <TT CLASS=syntax><SPAN CLASS="regexplain">u</SPAN></TT> is acceptable. Therefore, the engine skips ahead to the next regex token: <TT CLASS=syntax><SPAN CLASS="regexplain">r</SPAN></TT>. But this fails to match <tt class=string>n</tt> as well. Now, the engine can only conclude that the entire regular expression cannot be matched starting at the <tt class=match>c</tt> in <tt class=string>colonel</tt>. Therefore, the engine starts again trying to match <TT CLASS=syntax><SPAN CLASS="regexplain">c</SPAN></TT> to the first o in <tt class=string>colonel</tt>.</p> <p>After a series of failures, <TT CLASS=syntax><SPAN CLASS="regexplain">c</SPAN></TT> matches the <tt class=match>c</tt> in <tt class=string>color</tt>, and <TT CLASS=syntax><SPAN CLASS="regexplain">o</SPAN></TT>, <TT CLASS=syntax><SPAN CLASS="regexplain">l</SPAN></TT> and <TT CLASS=syntax><SPAN CLASS="regexplain">o</SPAN></TT> match the following characters. Now the engine checks whether <TT CLASS=syntax><SPAN CLASS="regexplain">u</SPAN></TT> matches <tt class=string>r</tt>. This fails. Again: no problem. The question mark allows the engine to continue with <TT CLASS=syntax><SPAN CLASS="regexplain">r</SPAN></TT>. 
This matches <tt class=match>r</tt> and the engine reports that the regex successfully matched <tt class=match>color</tt> in our string.</p><div id=cntmobi><p>|&ensp;<a href='quickstart.html'>Quick&nbsp;Start</a>&ensp;|&ensp;<a href='tutorial.html'>Tutorial</a>&ensp;|&ensp;<a href='tools.html'>Tools&nbsp;&amp;&nbsp;Languages</a>&ensp;|&ensp;<a href='examples.html'>Examples</a>&ensp;|&ensp;<a href='refflavors.html'>Reference</a>&ensp;|&ensp;<a href='books.html'>Book&nbsp;Reviews</a>&ensp;|</p><p>|&ensp;<a href='tutorial.html'>Introduction</a>&ensp;|&ensp;<a href='tutorialcnt.html'>Table of Contents</a>&ensp;|&ensp;<a href='characters.html'>Special Characters</a>&ensp;|&ensp;<a href='nonprint.html'>Non-Printable Characters</a>&ensp;|&ensp;<a href='engine.html'>Regex Engine Internals</a>&ensp;|&ensp;<a href='charclass.html'>Character Classes</a>&ensp;|&ensp;<a href='charclasssubtract.html'>Character Class Subtraction</a>&ensp;|&ensp;<a href='charclassintersect.html'>Character Class Intersection</a>&ensp;|&ensp;<a href='shorthand.html'>Shorthand Character Classes</a>&ensp;|&ensp;<a href='dot.html'>Dot</a>&ensp;|&ensp;<a href='anchors.html'>Anchors</a>&ensp;|&ensp;<a href='wordboundaries.html'>Word Boundaries</a>&ensp;|&ensp;<a href='alternation.html'>Alternation</a>&ensp;|&ensp;<a href='optional.html'>Optional Items</a>&ensp;|&ensp;<a href='repeat.html'>Repetition</a>&ensp;|&ensp;<a href='brackets.html'>Grouping &amp; Capturing</a>&ensp;|&ensp;<a href='backref.html'>Backreferences</a>&ensp;|&ensp;<a href='backref2.html'>Backreferences, part 2</a>&ensp;|&ensp;<a href='named.html'>Named Groups</a>&ensp;|&ensp;<a href='backrefrel.html'>Relative Backreferences</a>&ensp;|&ensp;<a href='branchreset.html'>Branch Reset Groups</a>&ensp;|&ensp;<a href='freespacing.html'>Free-Spacing &amp; Comments</a>&ensp;|&ensp;<a href='unicode.html'>Unicode</a>&ensp;|&ensp;<a href='modifiers.html'>Mode Modifiers</a>&ensp;|&ensp;<a href='atomic.html'>Atomic Grouping</a>&ensp;|&ensp;<a 
href='possessive.html'>Possessive Quantifiers</a>&ensp;|&ensp;<a href='lookaround.html'>Lookahead &amp; Lookbehind</a>&ensp;|&ensp;<a href='lookaround2.html'>Lookaround, part 2</a>&ensp;|&ensp;<a href='keep.html'>Keep Text out of The Match</a>&ensp;|&ensp;<a href='conditional.html'>Conditionals</a>&ensp;|&ensp;<a href='balancing.html'>Balancing Groups</a>&ensp;|&ensp;<a href='recurse.html'>Recursion</a>&ensp;|&ensp;<a href='subroutine.html'>Subroutines</a>&ensp;|&ensp;<a href='recurseinfinite.html'>Infinite Recursion</a>&ensp;|&ensp;<a href='recurserepeat.html'>Recursion &amp; Quantifiers</a>&ensp;|&ensp;<a href='recursecapture.html'>Recursion &amp; Capturing</a>&ensp;|&ensp;<a href='recursebackref.html'>Recursion &amp; Backreferences</a>&ensp;|&ensp;<a href='recursebacktrack.html'>Recursion &amp; Backtracking</a>&ensp;|&ensp;<a href='posixbrackets.html'>POSIX Bracket Expressions</a>&ensp;|&ensp;<a href='zerolength.html'>Zero-Length Matches</a>&ensp;|&ensp;<a href='continue.html'>Continuing Matches</a>&ensp;|</p></div> <div id=copyright> <P CLASS=copyright>Page URL: <A HREF="https://www.regular-expressions.info/optional.html" TARGET="_top">https://www.regular-expressions.info/optional.html</A><BR> Page last updated: 22 November 2019<BR> Site last updated: 06 November 2024<BR> Copyright &copy; 2003-2024 Jan Goyvaerts. All rights reserved.</P> </div> </div> </div> </body></html>
