CINXE.COM

Issue 47248: Possible slowdown of regex searching in 3.11 - Python tracker

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <title> Issue 47248: Possible slowdown of regex searching in 3.11 - Python tracker </title> <link rel="shortcut icon" href="@@file/favicon.ico" /> <link rel="stylesheet" type="text/css" href="@@file/main.css" /> <link rel="stylesheet" type="text/css" href="@@file/style.css" /> <link rel="search" type="application/opensearchdescription+xml" href="@@file/osd.xml" title="Python bug tracker search" /> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <script nonce="3285991524c783dbade16157266f38e4151e69b7564f2096eec24edb8bfc3caa" type="text/javascript"> submitted = false; function submit_once() { if (submitted) { alert("Your request is being processed.\nPlease be patient."); return false; } submitted = true; return true; } function help_window(helpurl, width, height) { HelpWin = window.open('https://bugs.python.org/' + helpurl, 'RoundupHelpWindow', 'scrollbars=yes,resizable=yes,toolbar=no,height='+height+',width='+width); HelpWin.focus () } </script> <script type="text/javascript" src="https://ajax.googleapis.com/ajax/libs/jquery/1.6.2/jquery.min.js"></script> <script type="text/javascript" src="https://ajax.googleapis.com/ajax/libs/jqueryui/1.8.15/jquery-ui.js"></script> <script type="text/javascript" src="@@file/issue.item.js"></script> <link rel="stylesheet" type="text/css" href="https://ajax.googleapis.com/ajax/libs/jqueryui/1.8/themes/smoothness/jquery-ui.css" /> </head> <body> <!-- Logo --> <h1 id="logoheader"> <a accesskey="1" href="." id="logolink"> <img src="@@file/python-logo.gif" alt="homepage" border="0" id="logo" /></a> </h1> <div id="utility-menu"> <!-- Search Box --> <div id="searchbox"> <form name="searchform" method="get" action="issue" id="searchform"> <div id="search"> <input type="hidden" name="@columns" value="id,github,activity,title,creator,assignee,status,type" /> <input type="hidden" name="@sort" value="-activity" /> <input type="hidden" name="@filter" value="status" /> <input type="hidden" name="@action" value="searchid" /> <input type="hidden" name="ignore" value="file:content" /> <input class="input-text" id="search-text" name="@search_text" size="10" /> <input type="submit" id="submit" value="search" name="submit" class="input-button" /> <input type="radio" name="status" id="status_notresolved" value="-1,1,3" /> <label for="status_notresolved">open</label> <input type="radio" name="status" checked="checked" id="status_all" value="-1,1,2,3" /> <label for="status_all">all</label> </div> </form> </div> </div> <div id="left-hand-navigation"> <!-- Main Menu NEED LEVEL TWO HEADER AND FOOTER --> <div id="menu"> <ul class="level-one"> <li><a href="https://www.python.org/" title="Go to the Python homepage">Python Home</a></li> <li><a href="https://www.python.org/about/" title="About The Python Language">About</a></li> <li><a href="https://www.python.org/blogs/" title="">News</a></li> <li><a href="https://www.python.org/doc/" title="">Documentation</a></li> <li><a href="https://www.python.org/downloads/" title="">Downloads</a></li> <li><a href="https://www.python.org/community/" title="">Community</a></li> <li><a href="https://www.python.org/psf/" title="Python Software Foundation">Foundation</a></li> <li><a href="https://devguide.python.org/" title="Python Developer's Guide">Developer's Guide</a></li> <li class="selected"><a href="." class="selected" title="Python Issue Tracker">Issue Tracker</a> <ul class="level-two"> <li> <strong>Issues</strong> <ul class="level-three"> <li><a href="issue?@template=search&amp;status=1">Search</a></li> <li><a href="issue?@action=random">Random Issue</a></li> <li> <form method="post" action="issue47248"> <input type="submit" class="form-small" value="Show issue:" /> <input class="form-small" size="4" type="text" name="@number" /> <input type="hidden" name="@type" value="issue" /> <input type="hidden" name="@action" value="show" /> </form> </li> </ul> </li> <li> <strong>Summaries</strong> <ul class="level-three"> <li> <a href="issue?status=1&amp;@sort=-activity&amp;@columns=id%2Cgithub%2Cactivity%2Ctitle%2Ccreator%2Cstatus&amp;@dispname=Issues%20with%20patch&amp;@startwith=0&amp;@group=priority&amp;keywords=2&amp;@action=search&amp;@filter=&amp;@pagesize=50">Issues with patch</a> </li> <li> <a href="issue?status=1&amp;@sort=-activity&amp;@columns=id%2Cgithub%2Cactivity%2Ctitle%2Ccreator%2Cstatus&amp;@dispname=Easy%20issues&amp;@startwith=0&amp;@group=priority&amp;keywords=6&amp;@action=search&amp;@filter=&amp;@pagesize=50">Easy issues</a> </li> <li> <a href="issue?@template=stats">Stats</a> </li> </ul> </li> <li> <strong>User</strong> <form method="post" action="issue47248"> <ul class="level-three"> <li> Login<br /> <input size="10" name="openid_identifier" style="" /><br /> <input size="10" type="password" name="__login_password" /><br /> <input type="hidden" name="@action" value="Login" /> <input type="checkbox" name="remember" id="remember" /> <label for="remember">Remember me?</label><br /> <input class="form-small" type="submit" value="Login" /><br /> <input type="hidden" name="__came_from" value="https://bugs.python.org/issue47248?"> <input type="hidden" name="@sort" value=""/> <input type="hidden" name="@group" value=""/> <input type="hidden" name="@pagesize" value="50"/> <input type="hidden" name="@startwith" value="0"/> </li> <li> </li> <li><a href="user?@template=forgotten">Lost&nbsp;your&nbsp;login?</a></li> </ul> </form> </li> <li> <strong>Administration</strong> <ul class="level-three"> <li> <a href="user?@sort=username">User List</a></li> <li> <a href="user?iscommitter=1&amp;@action=search&amp;@sort=username&amp;@pagesize=300">Committer List</a></li> </ul> </li> <li> <strong>Help</strong> <ul class="level-three"> <li><a href="http://docs.python.org/devguide/triaging.html"> Tracker Documentation</a></li> <li><a href="http://wiki.python.org/moin/TrackerDevelopment"> Tracker Development</a></li> <li><a href="https://github.com/python/psf-infra-meta/issues"> Report Tracker Problem</a></li> </ul> </li> </ul> </li> </ul> </div> <!-- menu --> </div> <!-- left-hand-navigation --> <div id="content-body"> <div id="body-main"> <div id="content"> <div id="breadcrumb"> Issue47248 </div> <div id="migration-notice"> <div id="migration-images"> <img width="32" src="@@file/python-logo-small.png" /> ➜ <a href="https://github.com/python/cpython/issues"><img width="32" src="@@file/gh-icon.png" /></a> </div> <p>This issue tracker <b>has been migrated to <a href="https://github.com/python/cpython/issues">GitHub</a></b>, and is currently <b>read-only</b>.<br /> For more information, <a title="GitHub FAQs" href="https://devguide.python.org/gh-faq/"> see the GitHub FAQs in the Python's Developer Guide.</a></p> </div> <div> <form method="post" name="itemSynopsis" onsubmit="return submit_once()" enctype="multipart/form-data" action="issue47248"> <div id="gh-issue-link"> <a href="https://github.com/python/cpython/issues/91404"> <img width="32" src="@@file/gh-icon.png" /> <p> <span>This issue has been migrated to GitHub:</span> https://github.com/python/cpython/issues/91404 </p> </a> </div> <fieldset><legend>classification</legend> <table class="form"> <tr> <th class="required"><a href="http://docs.python.org/devguide/triaging.html#title" target="_blank">Title</a>:</th> <td colspan="3"> <span>Possible slowdown of regex searching in 3.11</span> <input type="hidden" name="title" value="Possible slowdown of regex searching in 3.11"> </td> </tr> <tr> <th class="required"><a href="http://docs.python.org/devguide/triaging.html#type" target="_blank">Type</a>:</th> <td>performance</td> <th><a href="http://docs.python.org/devguide/triaging.html#stage" target="_blank">Stage</a>:</th> <td></td> </tr> <tr> <th><a href="http://docs.python.org/devguide/triaging.html#components" target="_blank">Components</a>:</th> <td></td> <th><a href="http://docs.python.org/devguide/triaging.html#versions" target="_blank">Versions</a>:</th> <td>Python 3.11</td> </tr> </table> </fieldset> <fieldset><legend>process</legend> <table class="form"> <tr> <th><a href="http://docs.python.org/devguide/triaging.html#status" target="_blank">Status</a>:</th> <td>open</td> <th><a href="http://docs.python.org/devguide/triaging.html#resolution" target="_blank">Resolution</a>:</th> <td></td> </tr> <tr> <th> <a href="http://docs.python.org/devguide/triaging.html#dependencies" target="_blank">Dependencies</a>: </th> <td> </td> <th><a href="http://docs.python.org/devguide/triaging.html#superseder" target="_blank">Superseder</a>:</th> <td> </td> </tr> <tr> <th> <a href="http://docs.python.org/devguide/triaging.html#assigned-to" target="_blank">Assigned To</a>: </th> <td> </td> <th> <a href="http://docs.python.org/devguide/triaging.html#nosy-list" target="_blank">Nosy List</a><!-- <span tal:condition="context/nosy_count" tal:replace="python: ' (%d)' % context.nosy_count" /> -->: </th> <td> Dennis Sweeney, Mark.Shannon, malin, serhiy.storchaka </td> </tr> <tr> <th> <a href="http://docs.python.org/devguide/triaging.html#priority" target="_blank">Priority</a>: </th> <td>normal</td> <th> <a href="http://docs.python.org/devguide/triaging.html#keywords" target="_blank">Keywords</a>: </th> <td>3.11regression</td> </tr> </table> </fieldset> </form> <p>Created on <strong>2022-04-07 11:30</strong> by <strong>Mark.Shannon</strong>, last changed <strong>2022-04-11 14:59</strong> by <strong>admin</strong>.</p> <table class="messages"> <tr><th colspan="4" class="header">Messages (4)</th></tr> <tr> <th> <a href="#msg416923" id="msg416923">msg416923</a> - <a href="msg416923">(view)</a></th> <th>Author: Mark Shannon (Mark.Shannon) <span title="Contributor form received">*</span> <img src="@@file/committer.png" title="Python committer" alt="(Python committer)" /></th> <th>Date: 2022-04-07 11:30</th> </tr> <tr> <td colspan="4" class="content"> <pre>The 3 regular expression benchmarks in the pyperformance suite, regex_v8, regex_effbot and regex_dna show slowdowns between 3% and 10%. Looking at the stats, nothing seems wrong with specialization or the memory optimizations. Which strongly suggests a regression in the sre module itself, but I can't say so for certain.</pre> </td> </tr> <tr> <th> <a href="#msg416928" id="msg416928">msg416928</a> - <a href="msg416928">(view)</a></th> <th>Author: Ma Lin (malin) <span title="Contributor form received">*</span></th> <th>Date: 2022-04-07 14:21</th> </tr> <tr> <td colspan="4" class="content"> <pre>Could you give the two versions? I will do a git bisect. I tested 356997c~1 and 356997c [1], msvc2022 non-pgo release build: # regex_dna ### an +- std dev: 151 ms +- 1 ms -&gt; 152 ms +- 1 ms: 1.01x slower t significant # regex_effbot ### an +- std dev: 2.47 ms +- 0.01 ms -&gt; 2.46 ms +- 0.02 ms: 1.00x faster t significant # regex_v8 ### an +- std dev: 21.7 ms +- 0.1 ms -&gt; 22.4 ms +- 0.1 ms: 1.03x slower gnificant (t=-30.82) <a href="https://github.com/python/cpython/commit/356997cccc21a3391175d20e9ef03d434675b496">https://github.com/python/cpython/commit/356997cccc21a3391175d20e9ef03d434675b496</a></pre> </td> </tr> <tr> <th> <a href="#msg416959" id="msg416959">msg416959</a> - <a href="msg416959">(view)</a></th> <th>Author: Dennis Sweeney (Dennis Sweeney) <span title="Contributor form received">*</span> <img src="@@file/committer.png" title="Python committer" alt="(Python committer)" /></th> <th>Date: 2022-04-08 06:04</th> </tr> <tr> <td colspan="4" class="content"> <pre>Possibly related to the new atomic grouping support from <a href="https://github.com/python/cpython/pull/31982" class="closed" title="GitHub PR 31982: [merged] bpo-433030: Add support of atomic grouping in regular expressions">GH-31982</a>?</pre> </td> </tr> <tr> <th> <a href="#msg416961" id="msg416961">msg416961</a> - <a href="msg416961">(view)</a></th> <th>Author: Ma Lin (malin) <span title="Contributor form received">*</span></th> <th>Date: 2022-04-08 07:22</th> </tr> <tr> <td colspan="4" class="content"> <pre>&gt; Possibly related to the new atomic grouping support from <a href="https://github.com/python/cpython/pull/31982" class="closed" title="GitHub PR 31982: [merged] bpo-433030: Add support of atomic grouping in regular expressions">GH-31982</a>? It seems not likely. I will do some benchmarks for this issue, more information (version/platform) is welcome.</pre> </td> </tr> </table> <table class="history table table-condensed table-striped"><tr><th colspan="4" class="header"> History </th></tr><tr> <th>Date</th> <th>User</th> <th>Action</th> <th>Args</th> </tr> <tr><td>2022-04-11&nbsp;14:59:58</td><td>admin</td><td>set</td><td>github: 91404</td></tr> <tr><td>2022-04-08&nbsp;07:22:53</td><td>malin</td><td>set</td><td>messages: + <a rel="nofollow" href="msg416961">msg416961</a></td></tr> <tr><td>2022-04-08&nbsp;06:04:50</td><td>Dennis Sweeney</td><td>set</td><td>nosy: + <a rel="nofollow" href="user31377">Dennis Sweeney</a>, <a rel="nofollow" href="user15623">serhiy.storchaka</a><br />messages: + <a rel="nofollow" href="msg416959">msg416959</a><br /></td></tr> <tr><td>2022-04-07&nbsp;14:21:03</td><td>malin</td><td>set</td><td>nosy: + <a rel="nofollow" href="user19314">malin</a><br />messages: + <a rel="nofollow" href="msg416928">msg416928</a><br /></td></tr> <tr><td>2022-04-07&nbsp;11:30:33</td><td>Mark.Shannon</td><td>create</td><td></td></tr> </table> </div> </div> <!-- content-body --> <div id="footer"> <div id="credits"> Supported by <a href="https://python.org/psf-landing/" title="The Python Software Foundation">The Python Software Foundation</a>, <br> Powered by <a href="http://roundup.sourceforge.net" title="Powered by the Roundup Issue Tracker">Roundup</a> </div> <!-- credits --> Copyright &copy; 1990-2022, <a href="http://python.org/psf">Python Software Foundation</a><br /> <a href="http://python.org/about/legal">Legal Statements</a> </div> <!-- footer --> </div> <!-- body-main --> </div> <!-- content --> </body> </html>

Pages: 1 2 3 4 5 6 7 8 9 10