Interactive, Topic-Oriented Search Support by a Centroid-Based Text Categorisation

<!DOCTYPE html> <html lang="en" dir="ltr"> <head> <title>Interactive, Topic-Oriented Search Support by a Centroid-Based Text Categorisation</title> <meta name="description" content="Interactive, Topic-Oriented Search Support by a Centroid-Based Text Categorisation"> <meta name="keywords" content="Search algorithm, centroid, query, keyword, cooccurrence, categorisation."> <meta name="viewport" content="width=device-width, initial-scale=1, minimum-scale=1, maximum-scale=1, user-scalable=no"> <meta charset="utf-8"> <meta name="citation_title" content="Interactive, Topic-Oriented Search Support by a Centroid-Based Text Categorisation"> <meta name="citation_author" content="Mario Kubek"> <meta name="citation_author" content="Herwig Unger"> <meta name="citation_publication_date" content="2019/03/01"> <meta name="citation_journal_title" content="International Journal of Computer and Information Engineering"> <meta name="citation_volume" content="13"> 
<meta name="citation_issue" content="4"> <meta name="citation_firstpage" content="178"> <meta name="citation_lastpage" content="184"> <meta name="citation_pdf_url" content="https://publications.waset.org/10010202/pdf"> <link href="https://cdn.waset.org/favicon.ico" type="image/x-icon" rel="shortcut icon"> <link href="https://cdn.waset.org/static/plugins/bootstrap-4.2.1/css/bootstrap.min.css" rel="stylesheet"> <link href="https://cdn.waset.org/static/plugins/fontawesome/css/all.min.css" rel="stylesheet"> <link href="https://cdn.waset.org/static/css/site.css?v=150220211555" rel="stylesheet"> </head> <body> <header> <div class="container"> <nav class="navbar navbar-expand-lg navbar-light"> <a class="navbar-brand" href="https://waset.org"> <img src="https://cdn.waset.org/static/images/wasetc.png" alt="Open Science Research Excellence" title="Open Science Research Excellence" /> </a> <button class="d-block d-lg-none navbar-toggler ml-auto" type="button" data-toggle="collapse" data-target="#navbarMenu" aria-controls="navbarMenu" aria-expanded="false" aria-label="Toggle navigation"> <span class="navbar-toggler-icon"></span> </button> <div class="w-100"> <div class="d-none d-lg-flex flex-row-reverse"> <form method="get" action="https://waset.org/search" class="form-inline my-2 my-lg-0"> <input class="form-control mr-sm-2" type="search" placeholder="Search Conferences" value="" name="q" aria-label="Search"> <button class="btn btn-light my-2 my-sm-0" type="submit"><i class="fas fa-search"></i></button> </form> </div> <div class="collapse navbar-collapse mt-1" id="navbarMenu"> <ul class="navbar-nav ml-auto align-items-center" id="mainNavMenu"> <li class="nav-item"> <a class="nav-link" href="https://waset.org/conferences" title="Conferences in 2024/2025/2026">Conferences</a> </li> <li class="nav-item"> <a class="nav-link" href="https://waset.org/disciplines" title="Disciplines">Disciplines</a> </li> <li class="nav-item"> <a class="nav-link" href="https://waset.org/committees" 
rel="nofollow">Committees</a> </li> <li class="nav-item dropdown"> <a class="nav-link dropdown-toggle" href="#" id="navbarDropdownPublications" role="button" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false"> Publications </a> <div class="dropdown-menu" aria-labelledby="navbarDropdownPublications"> <a class="dropdown-item" href="https://publications.waset.org/abstracts">Abstracts</a> <a class="dropdown-item" href="https://publications.waset.org">Periodicals</a> <a class="dropdown-item" href="https://publications.waset.org/archive">Archive</a> </div> </li> <li class="nav-item"> <a class="nav-link" href="https://waset.org/page/support" title="Support">Support</a> </li> </ul> </div> </div> </nav> </div> </header> <main> <div class="container mt-4"> <div class="row"> <div class="col-md-9 mx-auto"> <form method="get" action="https://publications.waset.org/search"> <div id="custom-search-input"> <div class="input-group"> <i class="fas fa-search"></i> <input type="text" class="search-query" name="q" placeholder="Author, Title, Abstract, Keywords" value=""> <input type="submit" class="btn_search" value="Search"> </div> </div> </form> </div> </div> <div class="row mt-3"> <div class="col-sm-3"> <div class="card"> <div class="card-body"><strong>Commenced</strong> in January 2007</div> </div> </div> <div class="col-sm-3"> <div class="card"> <div class="card-body"><strong>Frequency:</strong> Monthly</div> </div> </div> <div class="col-sm-3"> <div class="card"> <div class="card-body"><strong>Edition:</strong> International</div> </div> </div> <div class="col-sm-3"> <div class="card"> <div class="card-body"><strong>Paper Count:</strong> 33093</div> </div> </div> </div> <div class="card publication-listing mt-3 mb-3"> <h5 class="card-header" style="font-size:.9rem">Interactive, Topic-Oriented Search Support by a Centroid-Based Text Categorisation</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a 
href="https://publications.waset.org/search?q=Mario%20Kubek">Mario Kubek</a>, <a href="https://publications.waset.org/search?q=Herwig%20Unger"> Herwig Unger</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Centroid terms are single words that semantically and topically characterise text documents and can therefore serve as a very compact representation of them in automatic text processing. In the present paper, centroids are used to measure the relevance of text documents with respect to a given search query. On this basis, a new graph-based paradigm for searching texts in large corpora is proposed and evaluated against keyword-based methods. The first experimental results are promising and demonstrate the usefulness of the centroid-based search procedure. In particular, it is shown that the routing of search queries in interactive and decentralised search systems can be greatly improved by applying this approach. A detailed discussion of further fields of application completes this contribution. <iframe src="https://publications.waset.org/10010202.pdf" style="width:100%; height:400px;" frameborder="0"></iframe> <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/search?q=Search%20algorithm" title="Search algorithm">Search algorithm</a>, <a href="https://publications.waset.org/search?q=centroid" title=" centroid"> centroid</a>, <a href="https://publications.waset.org/search?q=query" title=" query"> query</a>, <a href="https://publications.waset.org/search?q=keyword" title=" keyword"> keyword</a>, <a href="https://publications.waset.org/search?q=cooccurrence" title=" cooccurrence"> cooccurrence</a>, <a href="https://publications.waset.org/search?q=categorisation." 
title=" categorisation."> categorisation.</a> </p> <p class="card-text"><strong>Digital Object Identifier (DOI):</strong> <a href="https://doi.org/10.5281/zenodo.2643818" target="_blank">doi.org/10.5281/zenodo.2643818</a> </p> <a href="https://publications.waset.org/10010202/interactive-topic-oriented-search-support-by-a-centroid-based-text-categorisation" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/10010202/apa" target="_blank" rel="nofollow" class="btn btn-primary btn-sm">APA</a> <a href="https://publications.waset.org/10010202/bibtex" target="_blank" rel="nofollow" class="btn btn-primary btn-sm">BibTeX</a> <a href="https://publications.waset.org/10010202/chicago" target="_blank" rel="nofollow" class="btn btn-primary btn-sm">Chicago</a> <a href="https://publications.waset.org/10010202/endnote" target="_blank" rel="nofollow" class="btn btn-primary btn-sm">EndNote</a> <a href="https://publications.waset.org/10010202/harvard" target="_blank" rel="nofollow" class="btn btn-primary btn-sm">Harvard</a> <a href="https://publications.waset.org/10010202/json" target="_blank" rel="nofollow" class="btn btn-primary btn-sm">JSON</a> <a href="https://publications.waset.org/10010202/mla" target="_blank" rel="nofollow" class="btn btn-primary btn-sm">MLA</a> <a href="https://publications.waset.org/10010202/ris" target="_blank" rel="nofollow" class="btn btn-primary btn-sm">RIS</a> <a href="https://publications.waset.org/10010202/xml" target="_blank" rel="nofollow" class="btn btn-primary btn-sm">XML</a> <a href="https://publications.waset.org/10010202/iso690" target="_blank" rel="nofollow" class="btn btn-primary btn-sm">ISO 690</a> <a href="https://publications.waset.org/10010202.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">623</span> </span> <p class="card-text"><strong>References:</strong></p> <br>[1] B. Sparrow, J. 
Liu and D. M. Wegner, Google effects on memory: Cognitive consequences of having information at our fingertips, In Science, Vol. 333, pp. 776–778, 2011. <br>[2] C. Cleverdon, The Cranfield Tests on Index Language Devices, In Readings in Information Retrieval, pp. 47–59, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1997. <br>[3] C. D. Manning, P. Raghavan and H. Schütze, Introduction to Information Retrieval, Cambridge University Press, New York, NY, USA, 2008. <br>[4] J. B. Miller, Internet Technologies and Information Services, 2nd Edition, Libraries Unlimited, Santa Barbara, California, USA, 2014. <br>[5] A. van den Bosch, T. Bogers and M. de Kunder, Estimating search engine index size variability: a 9-year longitudinal study, In Scientometrics, Volume 107, Issue 2, pp. 839–856, 2016. <br>[6] M. Kleppmann, Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems, O’Reilly Media, 2017. <br>[7] E. Pariser, The Filter Bubble: What the Internet Is Hiding from You, Penguin Group, 2011. <br>[8] G. Heyer, U. Quasthoff and T. Wittig, Text Mining: Wissensrohstoff Text – Konzepte, Algorithmen, Ergebnisse, W3L-Verlag, 2008. <br>[9] M. M. Kubek and H. Unger, Centroid Terms as Text Representatives, In Proceedings of the 2016 ACM Symposium on Document Engineering, DocEng ’16, pp. 99–102, ACM, New York, NY, USA, 2016. <br>[10] M. M. Kubek and H. Unger, Centroid Terms and their Use in Natural Language Processing, In Autonomous Systems 2016, Fortschritt-Berichte VDI, Reihe 10 Nr. 848, pp. 167–185, VDI-Verlag Düsseldorf, 2016. <br>[11] M. Kubek, T. Böhme, and H. Unger, Empiric Experiments with Text Representing Centroids, In Lecture Notes on Information Theory, Vol. 5, No. 1, pp. 23–28, 2017. <br>[12] M. M. Kubek and H. Unger, Towards a Librarian of the Web, In Proceedings of the 2nd International Conference on Communication and Information Processing (ICCIP 2016), pp. 70–78, ACM, New York, NY, USA, 2016. <br>[13] M. 
M. Kubek and H. Unger, A Concept Supporting Resilient, Fault-tolerant and Decentralised Search, In Autonomous Systems 2017, Fortschritt-Berichte VDI, Reihe 10 Nr. 857, pp. 20–31, VDI-Verlag Düsseldorf, 2017. <br>[14] M. M. Kubek and H. Unger, Datasets and Analysis Results, http://www.docanalyser.de/search-corpora.zip, 2017. <br>[15] L. R. Dice, Measures of the Amount of Ecologic Association Between Species, In Ecology, Vol. 26, No. 3, pp. 297–302, 1945. <br>[16] Neo4j, Inc., Website of the Neo4j Graph Platform, https://neo4j.com, 2017. <br>[17] C. Biemann, S. Bordag and U. Quasthoff, Automatic Acquisition of Paradigmatic Relations using Iterated Co-occurrences, In Proceedings of LREC2004, pp. 967–970, Lisboa, Portugal, 2004. <br>[18] M. M. Kubek, DocAnalyser – Searching with Web Documents, In Autonomous Systems 2014, Fortschritt-Berichte VDI, Reihe 10 Nr. 835, pp. 221–234, VDI-Verlag Düsseldorf, 2014. <br>[19] B. H. Bloom, Space/Time Trade-offs in Hash Coding with Allowable Errors, In Commun. ACM, Vol. 13, No. 7, pp. 422–426, ACM, New York, NY, USA, 1970. 
</div> </div> </div> </main> <footer> <div id="infolinks" class="pt-3 pb-2"> <div class="container"> <div style="background-color:#f5f5f5;" class="p-3"> <div class="row"> <div class="col-md-2"> <ul class="list-unstyled"> About <li><a href="https://waset.org/page/support">About Us</a></li> <li><a href="https://waset.org/page/support#legal-information">Legal</a></li> <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/WASET-16th-foundational-anniversary.pdf">WASET celebrates its 16th foundational anniversary</a></li> </ul> </div> <div class="col-md-2"> <ul class="list-unstyled"> Account <li><a href="https://waset.org/profile">My Account</a></li> </ul> </div> <div class="col-md-2"> <ul class="list-unstyled"> Explore <li><a href="https://waset.org/disciplines">Disciplines</a></li> <li><a href="https://waset.org/conferences">Conferences</a></li> <li><a href="https://waset.org/conference-programs">Conference Program</a></li> <li><a href="https://waset.org/committees">Committees</a></li> <li><a href="https://publications.waset.org">Publications</a></li> </ul> </div> <div class="col-md-2"> <ul class="list-unstyled"> Research <li><a href="https://publications.waset.org/abstracts">Abstracts</a></li> <li><a href="https://publications.waset.org">Periodicals</a></li> <li><a href="https://publications.waset.org/archive">Archive</a></li> </ul> </div> <div class="col-md-2"> <ul class="list-unstyled"> Open Science <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/Open-Science-Philosophy.pdf">Open Science Philosophy</a></li> <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/Open-Science-Award.pdf">Open Science Award</a></li> <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/Open-Society-Open-Science-and-Open-Innovation.pdf">Open Innovation</a></li> <li><a target="_blank" rel="nofollow" 
href="https://publications.waset.org/static/files/Postdoctoral-Fellowship-Award.pdf">Postdoctoral Fellowship Award</a></li> <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/Scholarly-Research-Review.pdf">Scholarly Research Review</a></li> </ul> </div> <div class="col-md-2"> <ul class="list-unstyled"> Support <li><a href="https://waset.org/page/support">Support</a></li> <li><a href="https://waset.org/profile/messages/create">Contact Us</a></li> <li><a href="https://waset.org/profile/messages/create">Report Abuse</a></li> </ul> </div> </div> </div> </div> </div> <div class="container text-center"> <hr style="margin-top:0;margin-bottom:.3rem;"> <a href="https://creativecommons.org/licenses/by/4.0/" target="_blank" class="text-muted small">Creative Commons Attribution 4.0 International License</a> <div id="copy" class="mt-2">&copy; 2024 World Academy of Science, Engineering and Technology</div> </div> </footer> <a href="javascript:" id="return-to-top"><i class="fas fa-arrow-up"></i></a> <div class="modal" id="modal-template"> <div class="modal-dialog"> <div class="modal-content"> <div class="row m-0 mt-1"> <div class="col-md-12"> <button type="button" class="close" data-dismiss="modal" aria-label="Close"><span aria-hidden="true">&times;</span></button> </div> </div> <div class="modal-body"></div> </div> </div> </div> <script src="https://cdn.waset.org/static/plugins/jquery-3.3.1.min.js"></script> <script src="https://cdn.waset.org/static/plugins/bootstrap-4.2.1/js/bootstrap.bundle.min.js"></script> <script src="https://cdn.waset.org/static/js/site.js?v=150220211556"></script> <script> jQuery(document).ready(function() { /*jQuery.get("https://publications.waset.org/xhr/user-menu", function (response) { jQuery('#mainNavMenu').append(response); });*/ jQuery.get({ url: "https://publications.waset.org/xhr/user-menu", cache: false }).then(function(response){ jQuery('#mainNavMenu').append(response); }); }); </script> </body> </html>
