<!DOCTYPE html> <html lang="en"> <head> <meta charset="utf-8"> <title>Annif - tool for automated subject indexing</title> <meta name="title" content="Annif - tool for automated subject indexing"> <meta name="description" content="Annif is an open source toolkit for automated subject indexing. It integrates several machine learning and AI based algorithms for text classification."> <meta property="og:title" content="Annif - tool for automated subject indexing"> <meta property="og:description" content="Annif is an open source toolkit for automated subject indexing. It integrates several machine learning and AI based algorithms for text classification."> <meta property="og:type" content="website"> <meta property="og:image" content="https://annif.org/static/img/annif-social.png"> <meta name="twitter:card" content="summary_large_image"> <meta name="viewport" content="width=device-width, initial-scale=1"> <link rel="stylesheet" href="static/css/bootstrap.min.css"> <link rel="stylesheet" href="static/css/fonts.css"> <link rel="icon" href="favicon.ico"> <script src="static/js/jquery.min.js"></script> <script src="static/js/bootstrap.min.js"></script> <script src="static/js/annif.js"></script> <style> body { background-color: white; color: #343260; font-family: Jost, sans-serif; font-weight: 400; font-size: 1.2rem; line-height: 1.2 } header { background: #ffffff; background: -moz-linear-gradient(top, #ffffff 92%, #d9dfe3 92%, #ffffff 100%); background: -webkit-linear-gradient(top, #ffffff 92%, #d9dfe3 92%, #ffffff 100%); background: linear-gradient(to bottom, #ffffff 92%, #d9dfe3 92%, #ffffff 100%); padding-bottom: 5px; } a { color: #343260; text-decoration: underline; } a:hover, a:active { color: #6280dc; } a:visited { color: #536f9f } h1 { font-weight: 500; font-size: 2rem; padding: 0; margin: 0; } h2 { font-weight: 500; font-size: 1.5rem; padding: 0.5rem 0; } h3 { font-weight: 500; font-size: 1.3rem; padding: 0; } #blurb { font-size: 2.0rem; font-weight: 500; 
line-height: 1; } #howto { text-align: center; text-transform: uppercase; margin: 3rem 0 1rem 0; } #diagram { padding: 0; margin: 0 -15px; } #diagram li { list-style-type: none; padding: 11rem 1.5rem 3rem 1.5rem; margin: 0; text-align: center; background-position: center 3rem; background-repeat: no-repeat; background-size: 7.5rem; position: relative; } #diagram li:after { content: ""; display: block; background-image: url('static/img/arrow.svg'); background-size: 2rem; background-repeat: no-repeat; width: 5rem; height: 5rem; position: absolute; left: -1rem; top: 30%; } #diagram li:first-child:after { display: none; } #diagram-vocab { background-color: #ffba85; background-image: url('static/img/choose-vocabulary.svg'); } #diagram-prepare { background-color: #ff9182; background-image: url('static/img/prepare-training-data.svg'); } #diagram-train { background-color: #86eda2; background-image: url('static/img/load-vocabulary-and-train.svg'); } #diagram-index { background-color: #5dbe66; background-image: url('static/img/index-new-documents.svg'); } #try-demo { text-transform: uppercase; } #form { background-color: #f3f3f6; background: -moz-linear-gradient(top, #ffffff 0%, #d9dfe3 1%, #f3f3f6 1%, #f3f3f6 99%, #d9dfe3 99%, #ffffff 100%); background: -webkit-linear-gradient(top, #ffffff 0%, #d9dfe3 1%, #f3f3f6 1%, #f3f3f6 99%, #d9dfe3 99%, #ffffff 100%); background: linear-gradient(to bottom, #ffffff 0%, #d9dfe3 1%, #f3f3f6 1%, #f3f3f6 99%, #d9dfe3 99%, #ffffff 100%); } #text-box-wrapper { position: relative; } #text-box button { position: absolute; top: 10px; right: 10px; background-color: #ff9182; color: black; border: none; border-radius: 0px; padding: 2px 7px; } label, #suggestions, legend { border-top: 1px solid #343260; padding-top: 0.5rem; text-transform: uppercase; font-size: 1.1rem; display: block; } .form-control { border-radius: 0px; } select { -moz-appearance: none; -webkit-appearance: none; appearance: none; } .select-wrapper { position: relative; } 
.select-wrapper:after { content: '▼'; font-size: 10px; position: absolute; top: 11px; right: 9px; color: black; pointer-events: none; } fieldset.btn-group { width: 100%; display: block; } #limit-buttons label { border-radius: 50%; width: 2rem; height: 2rem; display: inline-flex; align-items: center; justify-content: center; background-color: #c7d9ed; color: #343260; border-width: 0px; font-weight: 500; } #limit-buttons input { display: inline; } #limit-buttons .active { background-color: #343260; color: white; } #animation { float: right; margin: 1.2rem 0 0 0; } #get-suggestions { margin: 2rem 0; background-color: #6280dc; color: white; border: none; border-radius: 0px; padding-right: 3rem; background-image: url('static/img/arrow-white.svg'); background-position: 97% center; background-repeat: no-repeat; } #get-suggestions:disabled { cursor: default; background-color: #6c757d; color: white; } #results li, #no-results li { border-radius: 0; } meter { width: 24px; } meter:-moz-meter-optimum::-moz-meter-bar { background: #6280dc; } meter::-webkit-meter-bar { border: none; border-radius: 0; height: 18px; background-color: #ccc; box-shadow: 0 12px 3px -5px #e6e6e6 inset; } meter::-webkit-meter-optimum-value { background: #6280dc; } #bottom-half, footer { background-color: #e7e7ec; } #get-learn-discuss h2 { background-position: left top; background-repeat: no-repeat; padding-top: 4rem; background-size: 3.5rem; } #get-annif { background-image: url('static/img/get-annif.svg'); } #learn-annif { background-image: url('static/img/learn-annif.svg'); } #discuss-annif { background-image: url('static/img/discuss-annif.svg'); } .version-name { min-width: 90px; display: inline-block; } .annif-user { border-bottom: 1px dotted #343260; } #finto-ai { max-width: 20%; margin: 0.75rem 0rem 0.75rem 0rem; } #yle { max-width: 20%; margin: 0rem 0rem 0rem -1rem; } #dnb { max-width: 20%; margin: 0.75rem 0rem 0.75rem 0rem; } #storia { max-width: 30%; margin: 0.75rem 0rem 0.75rem 0rem; } #kb { 
max-width: 20%; margin: 0.75rem 0rem 0.75rem 0rem; } #jyu { max-width: 40%; margin: 0.75rem 0rem 0.75rem 0rem; } #zbw { max-width: 40%; margin: 0.75rem 0rem 0.75rem 0rem; } #natlibpo { max-width: 35%; margin: 0.75rem 0rem 0.75rem 0rem; } #natlibfi { width: 12rem; } </style> </head> <body> <header class="mb-4 mt-0"> <div class="container"> <div class="row justify-content-between"> <div class="col-2"> <h1><img src="static/img/annif-RGB.svg" class="img-fluid" alt="Annif"></h1> </div> <div class="col-10 my-auto px-0"> <p class="text-right my-0">Tool for automated subject indexing and classification</p> </div> </div> </div> </header> <div class="container"> <div class="row justify-content-center"> <div class="col-md-6"> <p id="blurb" class="text-center"> Choose a controlled subject vocabulary and train Annif on already indexed documents – it can then suggest subjects for new documents! </p> </div> </div> <h2 id="howto">How to use Annif</h2> <ol class="row mb-4" id="diagram"> <li class="col-md-3" id="diagram-vocab">Choose subject vocabulary</li> <li class="col-md-3" id="diagram-prepare">Prepare a corpus from training data</li> <li class="col-md-3" id="diagram-train">Load the vocabulary and train a model</li> <li class="col-md-3" id="diagram-index">Suggest subjects for new documents</li> </ol> <div class="row"> <div class="col-md-4 py-4 offset-md-6"> <p>Annif uses a combination of existing <strong>natural language processing</strong> and <strong>machine learning</strong> tools including <a href="https://www.tensorflow.org/">TensorFlow</a>, <a href="https://github.com/tomtung/omikuji">Omikuji</a>, <a href="https://fasttext.cc/">fastText</a> and <a href="https://radimrehurek.com/gensim/">Gensim</a>. It is <strong>multilingual</strong> and can support <strong>any subject vocabulary</strong> (in SKOS or a simple TSV format). 
It provides a command-line interface, a simple Web UI and a microservice-style REST API.</p> </div> </div> </div> <div id="form"> <div class="container"> <div class="row justify-content-center pt-4"> <div class="col-md-"> <h2 id="try-demo">Try the demo!</h2> </div> </div> <div class="row pb-4"> <div class="col-md-8 mr-auto"> <div id="text-box" class="form-group"> <label for="text">Input text</label> <div id="text-box-wrapper"> <textarea class="form-control" rows="20" name="text" id="text" placeholder='Paste text here and press the "Get suggestions" button'></textarea> <button id="button-clear" type="button" class="btn btn-danger">✕</button> </div> </div> </div> <div class="col-md-4"> <div class="form-group"> <label for="project">Project (vocabulary and language)</label> <div class="select-wrapper"> <select class="form-control" id="project"> </select> </div> </div> <div class="form-group"> <fieldset id="limit-buttons" class="btn-group btn-group-toggle" data-toggle="buttons"> <legend data-i18n="limit">Max # of suggestions</legend> <label class="btn btn-secondary"> <input type="radio" name="limit" value="10" checked> 10 </label> <label class="btn btn-secondary"> <input type="radio" name="limit" value="15"> 15 </label> <label class="btn btn-secondary"> <input type="radio" name="limit" value="20"> 20 </label> </fieldset> </div> <img id="animation" src="static/img/annif-static.gif" alt=""> <button type="button" class="btn btn-primary" id="get-suggestions">Get suggestions</button> <div id="suggestions-wrapper"> <h2 id="suggestions" data-i18n="suggestions">Suggested subjects</h2> <div class="d-flex justify-content-center"> <div id="results-spinner" class="spinner-border m-2" role="status"> <span class="sr-only">Loading...</span> </div> </div> <ul class="list-group" id="results"> </ul> <ul class="list-group" id="no-results"> <li class="list-group-item p-0" data-i18n="no-results">No results</li> </ul> </div> </div> </div> </div> </div> <div class="container"> <div class="row 
my-4" id="get-learn-discuss"> <div class="col-md-4"> <h2 id="get-annif">Get Annif</h2> <p>Code and documentation for Annif are <a href="https://github.com/NatLibFi/Annif/">available on GitHub</a> (Apache 2.0 license). Annif can also be <a href="https://pypi.org/project/annif/">installed from PyPI</a> or run as a <a href="https://quay.io/repository/natlibfi/annif">Docker image from Quay.io</a>. Annif is developed mainly at the <a href="http://www.nationallibrary.fi">National Library of Finland</a>, but others are welcome to join in!</p> <h3>Latest releases</h3> <div id="latest-releases"></div> <ul class="release-list"></ul> <h3>Models</h3> <p>There is a <a href="https://huggingface.co/collections/NatLibFi/annif-models-65b35fb98b7c508c8e8a1570" >collection</a> of downloadable Annif models on the 🤗 Hugging Face Hub. </p> <script> $(document).ready(function() { $.ajax({ url: 'https://api.github.com/repos/NatLibFi/Annif/releases', headers: { 'Accept': 'application/vnd.github.v3+json', 'X-GitHub-Api-Version': '2022-11-28' }, success: function(data) { var releasesToShow = 5; // Set the number of releases to show var releasesAppended = 0; const dateFormat = { year: 'numeric', month: 'long', day: 'numeric' }; var $releaseList = $('.release-list'); // Process the API response and display the release information data.forEach(function(release) { if (releasesAppended >= releasesToShow) { return; } var name = release.name; var published = new Date(release.published_at).toLocaleString('default', dateFormat); var url = release.html_url; var releaseHtml = '<li>'; releaseHtml += '<a href="' + url + '" class="version-name">' + name + '</a> – ' + published; releaseHtml += '</li>'; $releaseList.append(releaseHtml); releasesAppended++; }); }, error: function() { $('#latest-releases').text('Failed to fetch the releases.'); } }); }); </script> </div> <div class="col-md-4"> <h2 id="learn-annif">Learn Annif</h2> <p>To get hands-on experience with Annif, study the <a 
href="https://github.com/NatLibFi/Annif-tutorial">Annif tutorial materials</a>, which include example data sets, exercises and short video presentations:</p> <a href="https://www.youtube.com/playlist?list=PLa9kvrI3VLf5K-bjvVDaIWMi5CACGjPUM" target="_blank"> <img src="static/img/youtube-annif-tutorial.png" alt="Annif tutorial videos" style="width: 350px; height: auto;"> </a> <p>There is also extensive <a href="https://github.com/NatLibFi/Annif/wiki"> usage documentation</a> in the wiki on GitHub.</p> </div> <div class="col-md-4"> <h2 id="discuss-annif">Discuss Annif</h2> <p>The <a href="https://groups.google.com/forum/#!forum/annif-users">annif-users</a> mailing list and web forum is available on Google Groups. The forum is meant for general discussion about Annif, asking for help, and announcements of new versions. All messages are public and anyone is welcome to join!</p> <p>Please use the forum instead of sending personal e-mail to the Annif developers.</p> </div> </div> </div> <div id="bottom-half"> <div class="container py-4"> <div class="row"> <div class="col-md-12"> <h2>Current users</h2> </div> </div> <div class="row"> <div class="col-md-6 pr-4"> <div class="annif-user"> <a href="https://ai.finto.fi"> <img id="finto-ai" src="static/img/FintoAI-RGB.svg" class="img-fluid" alt="Finto AI logo"> </a> <p><a href="https://ai.finto.fi">Finto AI</a> - service for automated subject indexing.</p> </div> <div class="annif-user"> <a href="https://yle.fi"> <img id="yle" src="static/img/Yle-logo_RGB_turkoosi.png" class="img-fluid" alt="Yle logo"> </a> <p>Yle, the Finnish Broadcasting Company, uses Annif to <a href="https://yle.fi/aihe/a/20-10001817">assign tags to online news articles.</a></p> </div> <div class="annif-user"> <a href="https://www.dnb.de"> <img id="dnb" src="static/img/dnb.svg" class="img-fluid" alt="DNB logo"> </a> <p> The <a href="https://www.dnb.de" title="German National Library">German National Library</a> uses Annif as the core of its automated subject 
indexing system <a href="https://groups.google.com/g/annif-users/c/KVQB-hvLrbA/m/I9RwM9EPBgAJ">Erschließungsmaschine (EMa)</a>. </p> </div> <div> <a href="https://www.kb.se/"> <img id="kb" src="static/img/kb.png" class="img-fluid" alt="KB logo"> </a> <p> The <a href="https://www.kb.se">National Library of Sweden</a> uses Annif for <a href="https://bibliometri.swepub.kb.se/classify">automated classification of scholarly publications</a>. </p> </div> </div> <div class="col-md-6 pl-4"> <div class="annif-user"> <a href="https://bn.org.pl"> <img id="natlibpo" src="static/img/national_library_of_poland.svg" class="img-fluid" alt="National library of Poland logo"> </a> <p><a href="https://bn.org.pl">The National Library of Poland</a> uses Annif as a part of the <a href="https://deskryptor.bn.org.pl/tag">DESKRYPTOR service</a> for automated subject indexing.</p> </div> <div class="annif-user"> <a href="https://www.storia.fi/"> <img id="storia" src="static/img/storia_logo.webp" class="img-fluid" alt="Storia Oy logo"> </a> <p><a href="https://www.storia.fi">Storia Oy</a> generates metadata about upcoming books with Annif.</p> </div> <div class="annif-user"> <a href="https://jyx.jyu.fi"> <img id="jyu" src="static/img/jyu.svg" class="img-fluid" alt="JYX logo"> </a> <p>In <a href="https://jyx.jyu.fi/">Jyväskylä University Digital Repository</a> and in repositories of other institutes (<a href="https://osuva.uwasa.fi">Osuva</a>, <a href="https://trepo.tuni.fi">Trepo</a>, <a href="https://www.theseus.fi">Theseus</a>, <a href="https://taju.uniarts.fi">Taju</a>, <a href="https://lauda.ulapland.fi/">Lauda</a>) Annif assists the subject indexing of theses and dissertations.</p> </div> <div class="annif-user"> <a href="https://www.zbw.eu/en/"> <img id="zbw" src="static/img/logo-zbw.gif" class="img-fluid" alt="ZBW logo"> </a> <p><a href="https://www.zbw.eu/en/">ZBW</a> – The Leibniz Information Centre for Economics uses Annif as a part of their automated indexing service AutoSE (<a 
href="https://www.zbw.eu/en/about-us/knowledge-organisation/automation-of-subject-indexing-using-methods-from-artificial-intelligence">read more here</a>).</p> </div> <div class="pt-3"> <p><a href="https://www.kiwi.fi/x/gIB7Cg">More users</a> of Annif and/or Finto AI</p> </div> </div> </div> </div> </div> <div class="container py-4"> <div class="row"> <div class="col-md-6 pr-4"> <h2>Publications</h2> <p><a href="https://www.emerald.com/insight/content/doi/10.1108/JD-01-2022-0026"> An article</a> that investigates the use of Annif for Dewey Decimal Classification was published in 2024 in the <a href="https://www.emerald.com/insight/publication/issn/0022-0418">Journal of Documentation</a>. </p> <p><a href="https://www.jlis.it/index.php/jlis/article/view/437"> An article about Annif and Finto AI</a> was published in 2022 in the peer-reviewed Open Access journal <a href="https://www.jlis.it">JLIS.it</a>. </p> <p> <a href="https://journal.code4lib.org/articles/16719"> The Annif Analyzer Shootout paper</a> was also published in 2022 in the <a href="https://journal.code4lib.org/">Code4Lib Journal</a>. </p> <p> <a href="https://doi.org/10.18352/lq.10285">The first article about Annif</a> was published in 2019 in LIBER Quarterly. </p> <p> The software itself is also archived on Zenodo and has a <a href="https://doi.org/10.5281/zenodo.2578948">citable DOI</a>. See the <a href="https://github.com/NatLibFi/Annif#publications--how-to-cite">README</a> on the Annif GitHub project site for more details, including BibTeX snippets.</p> </div> <div class="col-md-6 pl-4"> <h2>Watch the videos</h2> <a href="https://www.youtube.com/watch?v=nzK97hzPMNE" target="_blank"> <img src="static/img/youtube-annif-presentation-swib20.png" alt="Annif and Finto AI Presentation" style="width: 400px; height: auto;"> </a> <p>Above is the presentation of Annif and Finto AI at <a href="http://swib.org/swib20/">SWIB20</a>. 
</p> <p> See also <a href="https://www.youtube.com/watch?v=lSrFP3D-uTg">the SWIB18 presentation</a>, <a href="https://tech.ebu.ch/contents/publications/events/presentations/mdn2020/yle-meets-annif--an-open-source-tool-for-automated-subject-indexing"> the MDN Workshop 2020 presentation</a>, and <a href="https://player.vimeo.com/video/212577974"> the video of the first prototype of Annif</a>. </p> </div> </div> </div> <footer> <div class="container"> <div class="row justify-content-center"> <div class="col-5 text-center"> <a href="https://www.kansalliskirjasto.fi/en/"><img src="static/img/natlibfi-logo.svg" alt="National Library of Finland" class="p-4" id="natlibfi"></a> <p>2020-2025 National Library of Finland</p> <p>See the <a href="accessibility.html">accessibility statement</a> of this website.</p> </div> </div> </div> </footer> <!-- Matomo --> <script> var _paq = window._paq = window._paq || []; /* tracker methods like "setCustomDimension" should be called before "trackPageView" */ _paq.push(['trackPageView']); _paq.push(['enableLinkTracking']); (function() { var u="https://tilasto.lib.helsinki.fi/"; _paq.push(['setTrackerUrl', u+'piwik.php']); _paq.push(['setSiteId', '26']); var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0]; g.async=true; g.src=u+'piwik.js'; s.parentNode.insertBefore(g,s); })(); </script> <!-- End Matomo Code --> </body> </html>