<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "https://www.w3.org/TR/html4/loose.dtd"> <!-- ====================================================================== --> <!-- GENERATED FILE, DO NOT EDIT, EDIT THE XML FILE IN xdocs INSTEAD! --> <!-- ====================================================================== --> <html> <head> <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"/> <style type="text/css">@import "stylesheets/base.css";</style> <meta name="author" value="Apache UIMA Documentation Team"> <meta name="email" value="dev@uima.apache.org"> <title>Apache UIMA - Apache UIMA Ruta™</title> </head> <body> <div class="topLogos"> <table border="0" width="100%" cellspacing="0"> <!-- TOP IMAGE --> <tr> <td align='LEFT'> <a href="index.html"> <img style="border: 1px solid black;" src="./images/UIMA_banner2tlpTm.png" alt="UIMA project logo" border="0"/> </a> </td> <td align='CENTER'> <div class="pageBanner">Apache UIMA Ruta™</div> </td> <td align='RIGHT'> <a href="https://www.apache.org"> <img src="./images/asf-logo-on-white-smallTm.png" alt="Apache UIMA" border="0"/> </a> </td> </tr> </table> <hr noshade="" size="1"/> </div> <table border="0" width="100%" cellspacing="4"> <tr> <td align='RIGHT' colspan="2"> <form method="get" action="https://www.google.com/search"> Search the site <input type="text" name="q" size="25" maxlength="255" value="" /> <input type="hidden" name="sitesearch" value="https://uima.apache.org/" /> <input name="Search" value="Search Site" type="submit"/> </form> </td> </tr> <tr> <!-- LEFT SIDE NAVIGATION --> <td width="20%" valign="top"> <!-- regular menu --> <div class="navBar"> <br/> <div class="navBarItem"> <div class="navPartHeading">General</div> </div> <div class="navBar"> <div class="navBarItem"> <a href="./index.html">Home</a> </div> <div class="navBarItem"> <a href="./downloads.cgi">Downloads</a> </div> <div class="navBarItem"> <a href="./documentation.html">Documentation</a> </div> <div 
class="navBarItem"> <a href="./news.html">News</a> </div> <div class="navBarItem"> <a href="./publications.html">Publications</a> </div> <br style="line-height: .5em"/> <div class="navBarItem"> <a href="https://issues.apache.org/jira/browse/uima" target="_blank" rel="noopener">Issue tracker <img src="images/offsitelink.png"/></a> </div> <div class="navBarItem"> <a href="https://cwiki.apache.org/confluence/display/UIMA/" target="_blank" rel="noopener">Wiki <img src="images/offsitelink.png"/></a> </div> <br style="line-height: .5em"/> <div class="navBarItem"> <a href="https://cwiki.apache.org/confluence/display/UIMA/Powered+by+Apache+UIMA" target="_blank" rel="noopener">Powered By UIMA <img src="images/offsitelink.png"/></a> </div> </div> <br/> <div class="navBarItem"> <div class="navPartHeading">Community</div> </div> <div class="navBar"> <div class="navBarItem"> <a href="./get-involved.html">Get Involved</a> </div> <div class="navBarItem"> <a href="./mail-lists.html">Mailing Lists</a> </div> <div class="navBarItem"> <a href="./contribution-policy.html">Contribution Policies</a> </div> <div class="navBarItem"> <a href="./faq.html">FAQ</a> </div> <div class="navBarItem"> <a href="./project-guidelines.html">Project Guidelines</a> </div> </div> <br/> <div class="navBarItem"> <div class="navPartHeading">Components & Tools</div> </div> <div class="navBar"> <div class="navBarItem"> <a href="./sandbox.html#uima-addons-annotators">Annotators</a> </div> <div class="navBarItem"> <a href="./toolsServers.html">Tools & Servers</a> </div> <div class="navBarItem"> <a href="./sandbox.html">Addons and Sandbox</a> </div> <div class="navBarItem"> <a href="./ruta.html">UIMA Ruta</a> </div> <div class="navBarItem"> <a href="https://github.com/apache/uima-uimafit" target="_blank" rel="noopener">uimaFIT <img src="images/offsitelink.png"/></a> </div> <div class="navBarItem"> <a href="./external-resources.html">External Resources</a> </div> </div> <br/> <div class="navBarItem"> <div 
class="navPartHeading">Development</div> </div> <div class="navBar"> <div class="navBarItem"> <a href="./dev-quick.html">Quick Start: building</a> </div> <div class="navBarItem"> <a href="./building-uima.html">Building from Source</a> </div> <div class="navBarItem"> <a href="./one-time-setup.html">One-time setups</a> </div> <div class="navBarItem"> <a href="./svn.html">Source Code</a> </div> <div class="navBarItem"> <a href="./release.html">Doing a UIMA release</a> </div> <div class="navBarItem"> <a href="https://www.apache.org/security/committers.html" target="_blank" rel="noopener">Doing a CVE (Apache) <img src="images/offsitelink.png"/></a> </div> <div class="navBarItem"> <a href="./eclipse-update-site.html">Eclipse Update Sites</a> </div> <div class="navBarItem"> <a href="./git.html">GIT</a> </div> <div class="navBarItem"> <a href="./codeConventions.html">Code Conventions</a> </div> <div class="navBarItem"> <a href="./uima-specification.html">UIMA Specification (OASIS)</a> </div> <div class="navBarItem"> <a href="./team-list.html">Project Team</a> </div> <div class="navBarItem"> <a href="./maven-design.html">Maven Use</a> </div> <div class="navBarItem"> <a href="./updating-website.html">Updating this Website</a> </div> </div> <br/> <div class="navBarItem"> <div class="navPartHeading">Events and Conferences</div> </div> <div class="navBar"> <div class="navBarItem"> <a href="./coling14.html">COLING 2014</a> </div> <div class="navBarItem"> <a href="./gscl13.html">GSCL 2013</a> </div> <div class="navBarItem"> <a href="./iks09.html">IKS 2009</a> </div> <div class="navBarItem"> <a href="./gscl09.html">GSCL 2009</a> </div> <div class="navBarItem"> <a href="./lsm09.html">LSM 2009</a> </div> <div class="navBarItem"> <a href="./lrec08.html">LREC 2008</a> </div> <div class="navBarItem"> <a href="./gldv07.html">GLDV 2007</a> </div> </div> <br/> <div class="navBarItem"> <div class="navPartHeading">ASF</div> </div> <div class="navBar"> <div class="navBarItem"> <a 
href="https://www.apache.org/licenses/" target="_blank" rel="noopener">License <img src="images/offsitelink.png"/></a> </div> <div class="navBarItem"> <a href="https://www.apache.org/foundation/thanks.html" target="_blank" rel="noopener">ASF Sponsors <img src="images/offsitelink.png"/></a> </div> <div class="navBarItem"> <a href="https://www.apache.org/foundation/sponsorship.html" target="_blank" rel="noopener">ASF Sponsorship <img src="images/offsitelink.png"/></a> </div> <div class="navBarItem"> <a href="./security_report">Security</a> </div> </div> </div> </td> <td width="80%" align="left" valign="top"> <div class="sectionTable"> <table class="sectionTable"> <tr><td> <a name="Apache UIMA Ruta (Rule-based Text Annotation)"><h1><img src="images/UIMA_4sq50tightCropSolid.png"/> Apache UIMA Ruta (Rule-based Text Annotation)</h1></a> </td></tr> <tr><td> <blockquote class="sectionBody"> <ul> <li><a href='#Overview'> Overview </a></li> <li><a href='#Rule Language'> Rule Language </a></li> <li><a href='#Workbench'> Workbench </a></li> <li><a href='#Documentation'> Documentation </a></li> <li><a href='#Developer Information'> Developer Information </a></li> <li><a href='#Reference'> Reference </a></li> </ul> <table class="subsectionTable" id='uima.ruta.overview'> <tr><td> <a name="Overview"> <h2>Overview </h2> </a> </td></tr> <tr><td> <blockquote class="subsectionBody"> <p> This Apache UIMA™ component consists of two major parts: An Analysis Engine, which interprets and executes the rule-based scripting language, and the Eclipse-based tooling (Workbench), which provides various support for developing rules. </p> <ul> <li> <p> This page only contains a short overview. A more detailed introduction can be found in the <a href="#uima.ruta.documentation">documentation</a> or in the <a href="downloads/gscl2013/2013-GSCL-Ruta.pdf">slides of the tutorial</a> in conjunction with GSCL 2013. 
</p> </li> <li> <p> The UIMA Ruta Workbench can be installed via our Eclipse update site: <a href="https://downloads.apache.org/uima/eclipse-update-site-v3/">https://downloads.apache.org/uima/eclipse-update-site-v3/</a> </p> <p> The UIMA Ruta Workbench 3.5.0 is tested with Eclipse 2023-09 (older versions may still work). </p> </li> </ul> </blockquote> </td></tr> </table> <table class="subsectionTable" id='uima.ruta.language'> <tr><td> <a name="Rule Language"> <h2>Rule Language </h2> </a> </td></tr> <tr><td> <blockquote class="subsectionBody"> <p> The UIMA Ruta language is an imperative rule language extended with scripting elements. A rule defines a pattern of annotations with additional conditions. If this pattern applies, then the actions of the rule are performed on the matched annotations. A rule is composed of a sequence of rule elements, and a rule element essentially consists of four parts: a matching condition, an optional quantifier, a list of conditions and a list of actions. The matching condition is typically an annotation type, by which the rule element matches on the covered text of one of those annotations. The quantifier specifies whether the rule element must match successfully and how often it may match. The list of conditions specifies additional constraints that the matched text or annotations need to fulfill. The list of actions defines the consequences of the rule and often creates new annotations or modifies existing annotations. </p> <p> The following example rule consists of three rule elements. The first one (<code>ANY...</code>) matches on every token whose covered text occurs in a word list named <code>MonthsList</code>. The second rule element (<code>PERIOD?</code>) is optional and does not need to match, which is indicated by the quantifier <code>?</code>.
The last rule element (<code>NUM...</code>) matches on numbers that fulfill the regular expression condition <code>REGEXP(".{2,4}")</code> and are therefore between two and four characters long. If this rule successfully matches on a text passage, then its three actions are executed: an annotation of the type <code>Month</code> is created for the first rule element, an annotation of the type <code>Year</code> is created for the last rule element, and an annotation of the type <code>Date</code> is created for the span of all three rule elements. If the word list contains the correct entries, then this rule matches on strings like <code>Dec. 2004</code>, <code>July 85</code> or <code>11.2008</code> and creates the corresponding annotations. <pre>ANY{INLIST(MonthsList) -> MARK(Month), MARK(Date,1,3)} PERIOD? NUM{REGEXP(".{2,4}") -> MARK(Year)};</pre> </p> <p> Here is a short overview of additional features of the rule language: </p> <ul> <li>Expressions and variables</li> <li>Import and execution of external components</li> <li>Flexible matching with filtering</li> <li>Modularization in different files or blocks</li> <li>Control structures, e.g., for windowing</li> <li>Score-based extraction</li> <li>Modification</li> <li>HTML support</li> <li>Dictionaries</li> <li>Extensible language definition</li> </ul> </blockquote> </td></tr> </table> <table class="subsectionTable" id='uima.ruta.workbench'> <tr><td> <a name="Workbench"> <h2>Workbench </h2> </a> </td></tr> <tr><td> <blockquote class="subsectionBody"> <p> The UIMA Ruta Workbench was created to facilitate all steps in creating Analysis Engines based on the UIMA Ruta language. Here is a short overview of included features: </p> <ul> <li> <p> <b>Editing support:</b> The full-featured editor for the UIMA Ruta language provides syntax and semantic highlighting, syntax checking, context-sensitive auto-completion, template-based completion, open declaration and more.
</p> </li> <li> <p> <b>Rule Explanation:</b> Each step in the matching process can be explained. This includes how often a rule was applied, which condition was not fulfilled, or by which rule a specific annotation was created. Additionally, profile information about the runtime performance can be accessed. </p> </li> <li> <p> <b>Automatic Validation:</b> UIMA Ruta scripts can be automatically validated against a set of annotated documents (F1 score, test-driven development) and even against unlabeled documents (constraint-driven evaluation). </p> </li> <li> <p> <b>Rule learning:</b> The supervised learning algorithms of the included TextRuler framework are able to induce rules and, therefore, enable semi-automatic development of rule-based components. </p> </li> <li> <p> <b>Query:</b> Rules can be used as query statements in order to investigate annotated documents. </p> </li> </ul> <img style="width: 75%; height: 75%" src="./images/ruta/ruta_workbench.png" alt="UIMA Ruta Workbench" /> </blockquote> </td></tr> </table> <table class="subsectionTable" id='uima.ruta.documentation'> <tr><td> <a name="Documentation"> <h2>Documentation </h2> </a> </td></tr> <tr><td> <blockquote class="subsectionBody"> <p> Here, you can find the documentation for the most recent UIMA Ruta release.
</p> <p><b>Latest UIMA Ruta documentation</b></p> <ul> <li><a href="d/ruta-current/ruta.html">Apache UIMA Ruta Guide and Reference (v3.x)</a></li> <li><a href="d/ruta-current/RELEASE_NOTES.html">Latest release notes (v3.x)</a></li> </ul> <p>Should you require documentation for a specific version of UIMA Ruta, please check our <a href="https://svn.apache.org/repos/asf/uima/site/archive/docs/d">archive</a>.</p> </blockquote> </td></tr> </table> <table class="subsectionTable" id='uima.ruta.developer'> <tr><td> <a name="Developer Information"> <h2>Developer Information </h2> </a> </td></tr> <tr><td> <blockquote class="subsectionBody"> <p>The latest version of UIMA Ruta is available via <a href="https://search.maven.org/#search%7Cga%7C1%7Cruta">Maven Central</a>. If you use Maven as your build tool, then you can add the basic UIMA Ruta functionality as a dependency in your pom.xml file (in addition to other UIMA dependencies):</p> <pre>
&lt;dependency&gt;
  &lt;groupId&gt;org.apache.uima&lt;/groupId&gt;
  &lt;artifactId&gt;ruta-core&lt;/artifactId&gt;
  &lt;version&gt;3.5.0&lt;/version&gt;
&lt;/dependency&gt;
</pre> <subsubsection> To build the UIMA Ruta projects from source, follow the instructions for <a href="building-uima.html">building UIMA</a>, but use the following git clone command instead:<br /> <code>git clone https://github.com/apache/uima-ruta</code> </subsubsection> <p> The sources of the current release are available at the <a href="downloads.cgi">download page</a>.
</p> </blockquote> </td></tr> </table> <table class="subsectionTable" id='uima.ruta.reference'> <tr><td> <a name="Reference"> <h2>Reference </h2> </a> </td></tr> <tr><td> <blockquote class="subsectionBody"> <p> If you use UIMA Ruta to support academic research, then please consider citing the following paper as appropriate:</p> <pre>@article{NLE:10051335,
  author = {Kluegl, Peter and Toepfer, Martin and Beck, Philip-Daniel and Fette, Georg and Puppe, Frank},
  title = {UIMA Ruta: Rapid development of rule-based information extraction applications},
  journal = {Natural Language Engineering},
  volume = {22},
  issue = {01},
  month = {1},
  year = {2016},
  issn = {1469-8110},
  pages = {1--40},
  numpages = {40},
  doi = {10.1017/S1351324914000114},
  URL = {https://journals.cambridge.org/article_S1351324914000114},
}
</pre> </blockquote> </td></tr> </table> </blockquote> </p> </td></tr> </table> </td> </tr> <!-- FOOTER --> <tr><td colspan="2"> <hr noshade="" size="1"/> </td></tr> <tr><td colspan="2"> <table class="pageFooter"> <tr> <td><a href="index.html">Home</a></td> <td><a href="privacy-policy.html">Privacy Policy</a></td> <td style="font-size:75%"> Copyright © 2006-2024, The Apache Software Foundation.<br/> Apache UIMA, UIMA, the Apache UIMA logo and the Apache Feather logo are trademarks of The Apache Software Foundation.<br/> All other marks mentioned may be trademarks or registered trademarks of their respective owners. </td> <td><a href="mailto:dev@uima.apache.org">Contact us</a></td> </tr> </table> </td></tr> </table> </body> </html>