CINXE.COM

Search results for: small text extraction

<!DOCTYPE html> <html lang="en" dir="ltr"> <head> <!-- Google tag (gtag.js) --> <script async src="https://www.googletagmanager.com/gtag/js?id=G-P63WKM1TM1"></script> <script> window.dataLayer = window.dataLayer || []; function gtag(){dataLayer.push(arguments);} gtag('js', new Date()); gtag('config', 'G-P63WKM1TM1'); </script> <!-- Yandex.Metrika counter --> <script type="text/javascript" > (function(m,e,t,r,i,k,a){m[i]=m[i]||function(){(m[i].a=m[i].a||[]).push(arguments)}; m[i].l=1*new Date(); for (var j = 0; j < document.scripts.length; j++) {if (document.scripts[j].src === r) { return; }} k=e.createElement(t),a=e.getElementsByTagName(t)[0],k.async=1,k.src=r,a.parentNode.insertBefore(k,a)}) (window, document, "script", "https://mc.yandex.ru/metrika/tag.js", "ym"); ym(55165297, "init", { clickmap:false, trackLinks:true, accurateTrackBounce:true, webvisor:false }); </script> <noscript><div><img src="https://mc.yandex.ru/watch/55165297" style="position:absolute; left:-9999px;" alt="" /></div></noscript> <!-- /Yandex.Metrika counter --> <!-- Matomo --> <!-- End Matomo Code --> <title>Search results for: small text extraction</title> <meta name="description" content="Search results for: small text extraction"> <meta name="keywords" content="small text extraction"> <meta name="viewport" content="width=device-width, initial-scale=1, minimum-scale=1, maximum-scale=1, user-scalable=no"> <meta charset="utf-8"> <link href="https://cdn.waset.org/favicon.ico" type="image/x-icon" rel="shortcut icon"> <link href="https://cdn.waset.org/static/plugins/bootstrap-4.2.1/css/bootstrap.min.css" rel="stylesheet"> <link href="https://cdn.waset.org/static/plugins/fontawesome/css/all.min.css" rel="stylesheet"> <link href="https://cdn.waset.org/static/css/site.css?v=150220211555" rel="stylesheet"> </head> <body> <header> <div class="container"> <nav class="navbar navbar-expand-lg navbar-light"> <a class="navbar-brand" href="https://waset.org"> <img src="https://cdn.waset.org/static/images/wasetc.png" alt="Open Science Research Excellence" title="Open Science Research Excellence" /> </a> <button class="d-block d-lg-none navbar-toggler ml-auto" type="button" data-toggle="collapse" data-target="#navbarMenu" aria-controls="navbarMenu" aria-expanded="false" aria-label="Toggle navigation"> <span class="navbar-toggler-icon"></span> </button> <div class="w-100"> <div class="d-none d-lg-flex flex-row-reverse"> <form method="get" action="https://waset.org/search" class="form-inline my-2 my-lg-0"> <input class="form-control mr-sm-2" type="search" placeholder="Search Conferences" value="small text extraction" name="q" aria-label="Search"> <button class="btn btn-light my-2 my-sm-0" type="submit"><i class="fas fa-search"></i></button> </form> </div> <div class="collapse navbar-collapse mt-1" id="navbarMenu"> <ul class="navbar-nav ml-auto align-items-center" id="mainNavMenu"> <li class="nav-item"> <a class="nav-link" href="https://waset.org/conferences" title="Conferences in 2024/2025/2026">Conferences</a> </li> <li class="nav-item"> <a class="nav-link" href="https://waset.org/disciplines" title="Disciplines">Disciplines</a> </li> <li class="nav-item"> <a class="nav-link" href="https://waset.org/committees" rel="nofollow">Committees</a> </li> <li class="nav-item dropdown"> <a class="nav-link dropdown-toggle" href="#" id="navbarDropdownPublications" role="button" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false"> Publications </a> <div class="dropdown-menu" aria-labelledby="navbarDropdownPublications"> <a class="dropdown-item" href="https://publications.waset.org/abstracts">Abstracts</a> <a class="dropdown-item" href="https://publications.waset.org">Periodicals</a> <a class="dropdown-item" href="https://publications.waset.org/archive">Archive</a> </div> </li> <li class="nav-item"> <a class="nav-link" href="https://waset.org/page/support" title="Support">Support</a> </li> </ul> </div> </div> </nav> </div> </header> <main> <div class="container mt-4"> <div class="row"> <div class="col-md-9 mx-auto"> <form method="get" action="https://publications.waset.org/abstracts/search"> <div id="custom-search-input"> <div class="input-group"> <i class="fas fa-search"></i> <input type="text" class="search-query" name="q" placeholder="Author, Title, Abstract, Keywords" value="small text extraction"> <input type="submit" class="btn_search" value="Search"> </div> </div> </form> </div> </div> <div class="row mt-3"> <div class="col-sm-3"> <div class="card"> <div class="card-body"><strong>Commenced</strong> in January 2007</div> </div> </div> <div class="col-sm-3"> <div class="card"> <div class="card-body"><strong>Frequency:</strong> Monthly</div> </div> </div> <div class="col-sm-3"> <div class="card"> <div class="card-body"><strong>Edition:</strong> International</div> </div> </div> <div class="col-sm-3"> <div class="card"> <div class="card-body"><strong>Paper Count:</strong> 7837</div> </div> </div> </div> <h1 class="mt-3 mb-3 text-center" style="font-size:1.6rem;">Search results for: small text extraction</h1> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7837</span> Small Text Extraction from Documents and Chart Images</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Rominkumar%20Busa">Rominkumar Busa</a>, <a href="https://publications.waset.org/abstracts/search?q=Shahira%20K.%20C."> Shahira K. C.</a>, <a href="https://publications.waset.org/abstracts/search?q=Lijiya%20A."> Lijiya A.</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Text recognition is an important area in computer vision which deals with detecting and recognising text from an image. The Optical Character Recognition (OCR) is a saturated area these days and with very good text recognition accuracy. However the same OCR methods when applied on text with small font sizes like the text data of chart images, the recognition rate is less than 30%. In this work, aims to extract small text in images using the deep learning model, CRNN with CTC loss. The text recognition accuracy is found to improve by applying image enhancement by super resolution prior to CRNN model. We also observe the text recognition rate further increases by 18% by applying the proposed method, which involves super resolution and character segmentation followed by CRNN with CTC loss. The efficiency of the proposed method shows that further pre-processing on chart image text and other small text images will improve the accuracy further, thereby helping text extraction from chart images. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=small%20text%20extraction" title="small text extraction">small text extraction</a>, <a href="https://publications.waset.org/abstracts/search?q=OCR" title=" OCR"> OCR</a>, <a href="https://publications.waset.org/abstracts/search?q=scene%20text%20recognition" title=" scene text recognition"> scene text recognition</a>, <a href="https://publications.waset.org/abstracts/search?q=CRNN" title=" CRNN"> CRNN</a> </p> <a href="https://publications.waset.org/abstracts/150310/small-text-extraction-from-documents-and-chart-images" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/150310.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">125</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7836</span> Urdu Text Extraction Method from Images</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Samabia%20Tehsin">Samabia Tehsin</a>, <a href="https://publications.waset.org/abstracts/search?q=Sumaira%20Kausar"> Sumaira Kausar</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Due to the vast increase in the multimedia data in recent years, efficient and robust retrieval techniques are needed to retrieve and index images/ videos. Text embedded in the images can serve as the strong retrieval tool for images. This is the reason that text extraction is an area of research with increasing attention. English text extraction is the focus of many researchers but very less work has been done on other languages like Urdu. This paper is focusing on Urdu text extraction from video frames. This paper presents a text detection feature set, which has the ability to deal up with most of the problems connected with the text extraction process. To test the validity of the method, it is tested on Urdu news dataset, which gives promising results. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=caption%20text" title="caption text">caption text</a>, <a href="https://publications.waset.org/abstracts/search?q=content-based%20image%20retrieval" title=" content-based image retrieval"> content-based image retrieval</a>, <a href="https://publications.waset.org/abstracts/search?q=document%20analysis" title=" document analysis"> document analysis</a>, <a href="https://publications.waset.org/abstracts/search?q=text%20extraction" title=" text extraction"> text extraction</a> </p> <a href="https://publications.waset.org/abstracts/9566/urdu-text-extraction-method-from-images" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/9566.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">516</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7835</span> Extraction of Text Subtitles in Multimedia Systems</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Amarjit%20Singh">Amarjit Singh</a> </p> <p class="card-text"><strong>Abstract:</strong></p> In this paper, a method for extraction of text subtitles in large video is proposed. The video data needs to be annotated for many multimedia applications. Text is incorporated in digital video for the motive of providing useful information about that video. So need arises to detect text present in video to understanding and video indexing. This is achieved in two steps. First step is text localization and the second step is text verification. The method of text detection can be extended to text recognition which finds applications in automatic video indexing; video annotation and content based video retrieval. The method has been tested on various types of videos. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=video" title="video">video</a>, <a href="https://publications.waset.org/abstracts/search?q=subtitles" title=" subtitles"> subtitles</a>, <a href="https://publications.waset.org/abstracts/search?q=extraction" title=" extraction"> extraction</a>, <a href="https://publications.waset.org/abstracts/search?q=annotation" title=" annotation"> annotation</a>, <a href="https://publications.waset.org/abstracts/search?q=frames" title=" frames"> frames</a> </p> <a href="https://publications.waset.org/abstracts/24441/extraction-of-text-subtitles-in-multimedia-systems" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/24441.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">601</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7834</span> Optimal Classifying and Extracting Fuzzy Relationship from Query Using Text Mining Techniques</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Faisal%20Alshuwaier">Faisal Alshuwaier</a>, <a href="https://publications.waset.org/abstracts/search?q=Ali%20Areshey"> Ali Areshey</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Text mining techniques are generally applied for classifying the text, finding fuzzy relations and structures in data sets. This research provides plenty text mining capabilities. One common application is text classification and event extraction, which encompass deducing specific knowledge concerning incidents referred to in texts. The main contribution of this paper is the clarification of a concept graph generation mechanism, which is based on a text classification and optimal fuzzy relationship extraction. Furthermore, the work presented in this paper explains the application of fuzzy relationship extraction and branch and bound method to simplify the texts. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=extraction" title="extraction">extraction</a>, <a href="https://publications.waset.org/abstracts/search?q=max-prod" title=" max-prod"> max-prod</a>, <a href="https://publications.waset.org/abstracts/search?q=fuzzy%20relations" title=" fuzzy relations"> fuzzy relations</a>, <a href="https://publications.waset.org/abstracts/search?q=text%20mining" title=" text mining"> text mining</a>, <a href="https://publications.waset.org/abstracts/search?q=memberships" title=" memberships"> memberships</a>, <a href="https://publications.waset.org/abstracts/search?q=classification" title=" classification"> classification</a>, <a href="https://publications.waset.org/abstracts/search?q=memberships" title=" memberships"> memberships</a>, <a href="https://publications.waset.org/abstracts/search?q=classification" title=" classification"> classification</a> </p> <a href="https://publications.waset.org/abstracts/23970/optimal-classifying-and-extracting-fuzzy-relationship-from-query-using-text-mining-techniques" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/23970.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">582</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7833</span> Graph-Based Semantical Extractive Text Analysis</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Mina%20Samizadeh">Mina Samizadeh</a> </p> <p class="card-text"><strong>Abstract:</strong></p> In the past few decades, there has been an explosion in the amount of available data produced from various sources with different topics. The availability of this enormous data necessitates us to adopt effective computational tools to explore the data. This leads to an intense growing interest in the research community to develop computational methods focused on processing this text data. A line of study focused on condensing the text so that we are able to get a higher level of understanding in a shorter time. The two important tasks to do this are keyword extraction and text summarization. In keyword extraction, we are interested in finding the key important words from a text. This makes us familiar with the general topic of a text. In text summarization, we are interested in producing a short-length text which includes important information about the document. The TextRank algorithm, an unsupervised learning method that is an extension of the PageRank (algorithm which is the base algorithm of Google search engine for searching pages and ranking them), has shown its efficacy in large-scale text mining, especially for text summarization and keyword extraction. This algorithm can automatically extract the important parts of a text (keywords or sentences) and declare them as a result. However, this algorithm neglects the semantic similarity between the different parts. In this work, we improved the results of the TextRank algorithm by incorporating the semantic similarity between parts of the text. Aside from keyword extraction and text summarization, we develop a topic clustering algorithm based on our framework, which can be used individually or as a part of generating the summary to overcome coverage problems. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=keyword%20extraction" title="keyword extraction">keyword extraction</a>, <a href="https://publications.waset.org/abstracts/search?q=n-gram%20extraction" title=" n-gram extraction"> n-gram extraction</a>, <a href="https://publications.waset.org/abstracts/search?q=text%20summarization" title=" text summarization"> text summarization</a>, <a href="https://publications.waset.org/abstracts/search?q=topic%20clustering" title=" topic clustering"> topic clustering</a>, <a href="https://publications.waset.org/abstracts/search?q=semantic%20analysis" title=" semantic analysis"> semantic analysis</a> </p> <a href="https://publications.waset.org/abstracts/160526/graph-based-semantical-extractive-text-analysis" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/160526.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">70</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7832</span> Video Text Information Detection and Localization in Lecture Videos Using Moments </h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Belkacem%20Soundes">Belkacem Soundes</a>, <a href="https://publications.waset.org/abstracts/search?q=Guezouli%20Larbi"> Guezouli Larbi</a> </p> <p class="card-text"><strong>Abstract:</strong></p> This paper presents a robust and accurate method for text detection and localization over lecture videos. Frame regions are classified into text or background based on visual feature analysis. However, lecture video shows significant degradation mainly related to acquisition conditions, camera motion and environmental changes resulting in low quality videos. Hence, affecting feature extraction and description efficiency. Moreover, traditional text detection methods cannot be directly applied to lecture videos. Therefore, robust feature extraction methods dedicated to this specific video genre are required for robust and accurate text detection and extraction. Method consists of a three-step process: Slide region detection and segmentation; Feature extraction and non-text filtering. For robust and effective features extraction moment functions are used. Two distinct types of moments are used: orthogonal and non-orthogonal. For orthogonal Zernike Moments, both Pseudo Zernike moments are used, whereas for non-orthogonal ones Hu moments are used. Expressivity and description efficiency are given and discussed. Proposed approach shows that in general, orthogonal moments show high accuracy in comparison to the non-orthogonal one. Pseudo Zernike moments are more effective than Zernike with better computation time. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=text%20detection" title="text detection">text detection</a>, <a href="https://publications.waset.org/abstracts/search?q=text%20localization" title=" text localization"> text localization</a>, <a href="https://publications.waset.org/abstracts/search?q=lecture%20videos" title=" lecture videos"> lecture videos</a>, <a href="https://publications.waset.org/abstracts/search?q=pseudo%20zernike%20moments" title=" pseudo zernike moments"> pseudo zernike moments</a> </p> <a href="https://publications.waset.org/abstracts/109549/video-text-information-detection-and-localization-in-lecture-videos-using-moments" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/109549.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">151</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7831</span> Literature Review on Text Comparison Techniques: Analysis of Text Extraction, Main Comparison and Visual Representation Tools</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Andriana%20Mkrtchyan">Andriana Mkrtchyan</a>, <a href="https://publications.waset.org/abstracts/search?q=Vahe%20Khlghatyan"> Vahe Khlghatyan</a> </p> <p class="card-text"><strong>Abstract:</strong></p> The choice of a profession is one of the most important decisions people make throughout their life. With the development of modern science, technologies, and all the spheres existing in the modern world, more and more professions are being arisen that complicate even more the process of choosing. Hence, there is a need for a guiding platform to help people to choose a profession and the right career path based on their interests, skills, and personality. This review aims at analyzing existing methods of comparing PDF format documents and suggests that a 3-stage approach is implemented for the comparison, that is – 1. text extraction from PDF format documents, 2. comparison of the extracted text via NLP algorithms, 3. comparison representation using special shape and color psychology methodology. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=color%20psychology" title="color psychology">color psychology</a>, <a href="https://publications.waset.org/abstracts/search?q=data%20acquisition%2Fextraction" title=" data acquisition/extraction"> data acquisition/extraction</a>, <a href="https://publications.waset.org/abstracts/search?q=data%20augmentation" title=" data augmentation"> data augmentation</a>, <a href="https://publications.waset.org/abstracts/search?q=disambiguation" title=" disambiguation"> disambiguation</a>, <a href="https://publications.waset.org/abstracts/search?q=natural%20language%20processing" title=" natural language processing"> natural language processing</a>, <a href="https://publications.waset.org/abstracts/search?q=outlier%20detection" title=" outlier detection"> outlier detection</a>, <a href="https://publications.waset.org/abstracts/search?q=semantic%20similarity" title=" semantic similarity"> semantic similarity</a>, <a href="https://publications.waset.org/abstracts/search?q=text-mining" title=" text-mining"> text-mining</a>, <a href="https://publications.waset.org/abstracts/search?q=user%20evaluation" title=" user evaluation"> user evaluation</a>, <a href="https://publications.waset.org/abstracts/search?q=visual%20search" title=" visual search"> visual search</a> </p> <a href="https://publications.waset.org/abstracts/161588/literature-review-on-text-comparison-techniques-analysis-of-text-extraction-main-comparison-and-visual-representation-tools" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/161588.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">76</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7830</span> A Summary-Based Text Classification Model for Graph Attention Networks</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Shuo%20Liu">Shuo Liu</a> </p> <p class="card-text"><strong>Abstract:</strong></p> In Chinese text classification tasks, redundant words and phrases can interfere with the formation of extracted and analyzed text information, leading to a decrease in the accuracy of the classification model. To reduce irrelevant elements, extract and utilize text content information more efficiently and improve the accuracy of text classification models. In this paper, the text in the corpus is first extracted using the TextRank algorithm for abstraction, the words in the abstract are used as nodes to construct a text graph, and then the graph attention network (GAT) is used to complete the task of classifying the text. Testing on a Chinese dataset from the network, the classification accuracy was improved over the direct method of generating graph structures using text. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=Chinese%20natural%20language%20processing" title="Chinese natural language processing">Chinese natural language processing</a>, <a href="https://publications.waset.org/abstracts/search?q=text%20classification" title=" text classification"> text classification</a>, <a href="https://publications.waset.org/abstracts/search?q=abstract%20extraction" title=" abstract extraction"> abstract extraction</a>, <a href="https://publications.waset.org/abstracts/search?q=graph%20attention%20network" title=" graph attention network"> graph attention network</a> </p> <a href="https://publications.waset.org/abstracts/158060/a-summary-based-text-classification-model-for-graph-attention-networks" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/158060.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">100</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7829</span> Text Data Preprocessing Library: Bilingual Approach</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Kabil%20Boukhari">Kabil Boukhari</a> </p> <p class="card-text"><strong>Abstract:</strong></p> In the context of information retrieval, the selection of the most relevant words is a very important step. In fact, the text cleaning allows keeping only the most representative words for a better use. In this paper, we propose a library for the purpose text preprocessing within an implemented application to facilitate this task. This study has two purposes. The first, is to present the related work of the various steps involved in text preprocessing, presenting the segmentation, stemming and lemmatization algorithms that could be efficient in the rest of study. The second, is to implement a developed tool for text preprocessing in French and English. This library accepts unstructured text as input and provides the preprocessed text as output, based on a set of rules and on a base of stop words for both languages. The proposed library has been made on different corpora and gave an interesting result. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=text%20preprocessing" title="text preprocessing">text preprocessing</a>, <a href="https://publications.waset.org/abstracts/search?q=segmentation" title=" segmentation"> segmentation</a>, <a href="https://publications.waset.org/abstracts/search?q=knowledge%20extraction" title=" knowledge extraction"> knowledge extraction</a>, <a href="https://publications.waset.org/abstracts/search?q=normalization" title=" normalization"> normalization</a>, <a href="https://publications.waset.org/abstracts/search?q=text%20generation" title=" text generation"> text generation</a>, <a href="https://publications.waset.org/abstracts/search?q=information%20retrieval" title=" information retrieval"> information retrieval</a> </p> <a href="https://publications.waset.org/abstracts/150846/text-data-preprocessing-library-bilingual-approach" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/150846.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">94</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7828</span> A Conglomerate of Multiple Optical Character Recognition Table Detection and Extraction</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Smita%20Pallavi">Smita Pallavi</a>, <a href="https://publications.waset.org/abstracts/search?q=Raj%20Ratn%20Pranesh"> Raj Ratn Pranesh</a>, <a href="https://publications.waset.org/abstracts/search?q=Sumit%20Kumar"> Sumit Kumar</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Information representation as tables is compact and concise method that eases searching, indexing, and storage requirements. Extracting and cloning tables from parsable documents is easier and widely used; however, industry still faces challenges in detecting and extracting tables from OCR (Optical Character Recognition) documents or images. This paper proposes an algorithm that detects and extracts multiple tables from OCR document. The algorithm uses a combination of image processing techniques, text recognition, and procedural coding to identify distinct tables in the same image and map the text to appropriate the corresponding cell in dataframe, which can be stored as comma-separated values, database, excel, and multiple other usable formats. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=table%20extraction" title="table extraction">table extraction</a>, <a href="https://publications.waset.org/abstracts/search?q=optical%20character%20recognition" title=" optical character recognition"> optical character recognition</a>, <a href="https://publications.waset.org/abstracts/search?q=image%20processing" title=" image processing"> image processing</a>, <a href="https://publications.waset.org/abstracts/search?q=text%20extraction" title=" text extraction"> text extraction</a>, <a href="https://publications.waset.org/abstracts/search?q=morphological%20transformation" title=" morphological transformation"> morphological transformation</a> </p> <a href="https://publications.waset.org/abstracts/127575/a-conglomerate-of-multiple-optical-character-recognition-table-detection-and-extraction" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/127575.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">143</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7827</span> Towards Logical Inference for the Arabic Question-Answering</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Wided%20Bakari">Wided Bakari</a>, <a href="https://publications.waset.org/abstracts/search?q=Patrice%20Bellot"> Patrice Bellot</a>, <a href="https://publications.waset.org/abstracts/search?q=Omar%20Trigui"> Omar Trigui</a>, <a href="https://publications.waset.org/abstracts/search?q=Mahmoud%20Neji"> Mahmoud Neji</a> </p> <p class="card-text"><strong>Abstract:</strong></p> This article constitutes an opening to think of the modeling and analysis of Arabic texts in the context of a question-answer system. It is a question of exceeding the traditional approaches focused on morphosyntactic approaches. Furthermore, we present a new approach that analyze a text in order to extract correct answers then transform it to logical predicates. In addition, we would like to represent different levels of information within a text to answer a question and choose an answer among several proposed. To do so, we transform both the question and the text into logical forms. Then, we try to recognize all entailment between them. The results of recognizing the entailment are a set of text sentences that can implicate the user’s question. Our work is now concentrated on an implementation step in order to develop a system of question-answering in Arabic using techniques to recognize textual implications. In this context, the extraction of text features (keywords, named entities, and relationships that link them) is actually considered the first step in our process of text modeling. The second one is the use of techniques of textual implication that relies on the notion of inference and logic representation to extract candidate answers. The last step is the extraction and selection of the desired answer. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=NLP" title="NLP">NLP</a>, <a href="https://publications.waset.org/abstracts/search?q=Arabic%20language" title=" Arabic language"> Arabic language</a>, <a href="https://publications.waset.org/abstracts/search?q=question-answering" title=" question-answering"> question-answering</a>, <a href="https://publications.waset.org/abstracts/search?q=recognition%20text%20entailment" title=" recognition text entailment"> recognition text entailment</a>, <a href="https://publications.waset.org/abstracts/search?q=logic%20forms" title=" logic forms"> logic forms</a> </p> <a href="https://publications.waset.org/abstracts/27265/towards-logical-inference-for-the-arabic-question-answering" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/27265.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">342</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7826</span> Anatomical Survey for Text Pattern Detection</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=S.%20Tehsin">S. Tehsin</a>, <a href="https://publications.waset.org/abstracts/search?q=S.%20Kausar"> S. Kausar</a> </p> <p class="card-text"><strong>Abstract:</strong></p> The ultimate aim of machine intelligence is to explore and materialize the human capabilities, one of which is the ability to detect various text objects within one or more images displayed on any canvas including prints, videos or electronic displays. Multimedia data has increased rapidly in past years. Textual information present in multimedia contains important information about the image/video content. However, it needs to technologically testify the commonly used human intelligence of detecting and differentiating the text within an image, for computers. Hence in this paper feature set based on anatomical study of human text detection system is proposed. Subsequent examination bears testimony to the fact that the features extracted proved instrumental to text detection. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=biologically%20inspired%20vision" title="biologically inspired vision">biologically inspired vision</a>, <a href="https://publications.waset.org/abstracts/search?q=content%20based%20retrieval" title=" content based retrieval"> content based retrieval</a>, <a href="https://publications.waset.org/abstracts/search?q=document%20analysis" title=" document analysis"> document analysis</a>, <a href="https://publications.waset.org/abstracts/search?q=text%20extraction" title=" text extraction"> text extraction</a> </p> <a href="https://publications.waset.org/abstracts/9629/anatomical-survey-for-text-pattern-detection" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/9629.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">444</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7825</span> Exploratory Analysis of A Review of Nonexistence Polarity in Native Speech</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Deawan%20Rakin%20Ahamed%20Remal">Deawan Rakin Ahamed Remal</a>, <a href="https://publications.waset.org/abstracts/search?q=Sinthia%20Chowdhury"> Sinthia Chowdhury</a>, <a href="https://publications.waset.org/abstracts/search?q=Sharun%20Akter%20Khushbu"> Sharun Akter Khushbu</a>, <a href="https://publications.waset.org/abstracts/search?q=Sheak%20Rashed%20Haider%20Noori"> Sheak Rashed Haider Noori</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Native Speech to text synthesis has its own leverage for the purpose of mankind. The extensive nature of art to speaking different accents is common but the purpose of communication between two different accent types of people is quite difficult. This problem will be motivated by the extraction of the wrong perception of language meaning. Thus, many existing automatic speech recognition has been placed to detect text. Overall study of this paper mentions a review of NSTTR (Native Speech Text to Text Recognition) synthesis compared with Text to Text recognition. Review has exposed many text to text recognition systems that are at a very early stage to comply with the system by native speech recognition. Many discussions started about the progression of chatbots, linguistic theory another is rule based approach. In the Recent years Deep learning is an overwhelming chapter for text to text learning to detect language nature. To the best of our knowledge, In the sub continent a huge number of people speak in Bangla language but they have different accents in different regions therefore study has been elaborate contradictory discussion achievement of existing works and findings of future needs in Bangla language acoustic accent. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=TTR" title="TTR">TTR</a>, <a href="https://publications.waset.org/abstracts/search?q=NSTTR" title=" NSTTR"> NSTTR</a>, <a href="https://publications.waset.org/abstracts/search?q=text%20to%20text%20recognition" title=" text to text recognition"> text to text recognition</a>, <a href="https://publications.waset.org/abstracts/search?q=deep%20learning" title=" deep learning"> deep learning</a>, <a href="https://publications.waset.org/abstracts/search?q=natural%20language%20processing" title=" natural language processing"> natural language processing</a> </p> <a href="https://publications.waset.org/abstracts/149060/exploratory-analysis-of-a-review-of-nonexistence-polarity-in-native-speech" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/149060.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">132</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7824</span> Mask-Prompt-Rerank: An Unsupervised Method for Text Sentiment Transfer</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Yufen%20Qin">Yufen Qin</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Text sentiment transfer is an important branch of text style transfer. The goal is to generate text with another sentiment attribute based on a text with a specific sentiment attribute while maintaining the content and semantic information unrelated to sentiment unchanged in the process. There are currently two main challenges in this field: no parallel corpus and text attribute entanglement. In response to the above problems, this paper proposed a novel solution: Mask-Prompt-Rerank. Use the method of masking the sentiment words and then using prompt regeneration to transfer the sentence sentiment. Experiments on two sentiment benchmark datasets and one formality transfer benchmark dataset show that this approach makes the performance of small pre-trained language models comparable to that of the most advanced large models, while consuming two orders of magnitude less computing and memory. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=language%20model" title="language model">language model</a>, <a href="https://publications.waset.org/abstracts/search?q=natural%20language%20processing" title=" natural language processing"> natural language processing</a>, <a href="https://publications.waset.org/abstracts/search?q=prompt" title=" prompt"> prompt</a>, <a href="https://publications.waset.org/abstracts/search?q=text%20sentiment%20transfer" title=" text sentiment transfer"> text sentiment transfer</a> </p> <a href="https://publications.waset.org/abstracts/173904/mask-prompt-rerank-an-unsupervised-method-for-text-sentiment-transfer" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/173904.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">81</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7823</span> A U-Net Based Architecture for Fast and Accurate Diagram Extraction</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Revoti%20Prasad%20Bora">Revoti Prasad Bora</a>, <a href="https://publications.waset.org/abstracts/search?q=Saurabh%20Yadav"> Saurabh Yadav</a>, <a href="https://publications.waset.org/abstracts/search?q=Nikita%20Katyal"> Nikita Katyal</a> </p> <p class="card-text"><strong>Abstract:</strong></p> In the context of educational data mining, the use case of extracting information from images containing both text and diagrams is of high importance. Hence, document analysis requires the extraction of diagrams from such images and processes the text and diagrams separately. To the author’s best knowledge, none among plenty of approaches for extracting tables, figures, etc., suffice the need for real-time processing with high accuracy as needed in multiple applications. In the education domain, diagrams can be of varied characteristics viz. line-based i.e. geometric diagrams, chemical bonds, mathematical formulas, etc. There are two broad categories of approaches that try to solve similar problems viz. traditional computer vision based approaches and deep learning approaches. The traditional computer vision based approaches mainly leverage connected components and distance transform based processing and hence perform well in very limited scenarios. The existing deep learning approaches either leverage YOLO or faster-RCNN architectures. These approaches suffer from a performance-accuracy tradeoff. This paper proposes a U-Net based architecture that formulates the diagram extraction as a segmentation problem. The proposed method provides similar accuracy with a much faster extraction time as compared to the mentioned state-of-the-art approaches. Further, the segmentation mask in this approach allows the extraction of diagrams of irregular shapes. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=computer%20vision" title="computer vision">computer vision</a>, <a href="https://publications.waset.org/abstracts/search?q=deep-learning" title=" deep-learning"> deep-learning</a>, <a href="https://publications.waset.org/abstracts/search?q=educational%20data%20mining" title=" educational data mining"> educational data mining</a>, <a href="https://publications.waset.org/abstracts/search?q=faster-RCNN" title=" faster-RCNN"> faster-RCNN</a>, <a href="https://publications.waset.org/abstracts/search?q=figure%20extraction" title=" figure extraction"> figure extraction</a>, <a href="https://publications.waset.org/abstracts/search?q=image%20segmentation" title=" image segmentation"> image segmentation</a>, <a href="https://publications.waset.org/abstracts/search?q=real-time%20document%20analysis" title=" real-time document analysis"> real-time document analysis</a>, <a href="https://publications.waset.org/abstracts/search?q=text%20extraction" title=" text extraction"> text extraction</a>, <a href="https://publications.waset.org/abstracts/search?q=U-Net" title=" U-Net"> U-Net</a>, <a href="https://publications.waset.org/abstracts/search?q=YOLO" title=" YOLO"> YOLO</a> </p> <a href="https://publications.waset.org/abstracts/148396/a-u-net-based-architecture-for-fast-and-accurate-diagram-extraction" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/148396.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">137</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7822</span> A Deep Learning Approach to Subsection Identification in Electronic Health Records</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Nitin%20Shravan">Nitin Shravan</a>, <a href="https://publications.waset.org/abstracts/search?q=Sudarsun%20Santhiappan"> Sudarsun Santhiappan</a>, <a href="https://publications.waset.org/abstracts/search?q=B.%20Sivaselvan"> B. Sivaselvan </a> </p> <p class="card-text"><strong>Abstract:</strong></p> Subsection identification, in the context of Electronic Health Records (EHRs), is identifying the important sections for down-stream tasks like auto-coding. In this work, we classify the text present in EHRs according to their information, using machine learning and deep learning techniques. We initially describe briefly about the problem and formulate it as a text classification problem. Then, we discuss upon the methods from the literature. We try two approaches - traditional feature extraction based machine learning methods and deep learning methods. Through experiments on a private dataset, we establish that the deep learning methods perform better than the feature extraction based Machine Learning Models. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=deep%20learning" title="deep learning">deep learning</a>, <a href="https://publications.waset.org/abstracts/search?q=machine%20learning" title=" machine learning"> machine learning</a>, <a href="https://publications.waset.org/abstracts/search?q=semantic%20clinical%20classification" title=" semantic clinical classification"> semantic clinical classification</a>, <a href="https://publications.waset.org/abstracts/search?q=subsection%20identification" title=" subsection identification"> subsection identification</a>, <a href="https://publications.waset.org/abstracts/search?q=text%20classification" title=" text classification"> text classification</a> </p> <a href="https://publications.waset.org/abstracts/109176/a-deep-learning-approach-to-subsection-identification-in-electronic-health-records" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/109176.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">217</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7821</span> Extracting Actions with Improved Part of Speech Tagging for Social Networking Texts</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Yassine%20Jamoussi">Yassine Jamoussi</a>, <a href="https://publications.waset.org/abstracts/search?q=Ameni%20Youssfi"> Ameni Youssfi</a>, <a href="https://publications.waset.org/abstracts/search?q=Henda%20Ben%20Ghezala"> Henda Ben Ghezala</a> </p> <p class="card-text"><strong>Abstract:</strong></p> With the growing interest in social networking, the interaction of social actors evolved to a source of knowledge in which it becomes possible to perform context aware-reasoning. The information extraction from social networking especially Twitter and Facebook is one of the problems in this area. To extract text from social networking, we need several lexical features and large scale word clustering. We attempt to expand existing tokenizer and to develop our own tagger in order to support the incorrect words currently in existence in Facebook and Twitter. Our goal in this work is to benefit from the lexical features developed for Twitter and online conversational text in previous works, and to develop an extraction model for constructing a huge knowledge based on actions <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=social%20networking" title="social networking">social networking</a>, <a href="https://publications.waset.org/abstracts/search?q=information%20extraction" title=" information extraction"> information extraction</a>, <a href="https://publications.waset.org/abstracts/search?q=part-of-speech%20tagging" title=" part-of-speech tagging"> part-of-speech tagging</a>, <a href="https://publications.waset.org/abstracts/search?q=natural%20language%20processing" title=" natural language processing"> natural language processing</a> </p> <a href="https://publications.waset.org/abstracts/51464/extracting-actions-with-improved-part-of-speech-tagging-for-social-networking-texts" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/51464.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">305</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7820</span> Myanmar Character Recognition Using Eight Direction Chain Code Frequency Features </h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Kyi%20Pyar%20Zaw">Kyi Pyar Zaw</a>, <a href="https://publications.waset.org/abstracts/search?q=Zin%20Mar%20Kyu"> Zin Mar Kyu </a> </p> <p class="card-text"><strong>Abstract:</strong></p> Character recognition is the process of converting a text image file into editable and searchable text file. Feature Extraction is the heart of any character recognition system. The character recognition rate may be low or high depending on the extracted features. In the proposed paper, 25 features for one character are used in character recognition. Basically, there are three steps of character recognition such as character segmentation, feature extraction and classification. In segmentation step, horizontal cropping method is used for line segmentation and vertical cropping method is used for character segmentation. In the Feature extraction step, features are extracted in two ways. The first way is that the 8 features are extracted from the entire input character using eight direction chain code frequency extraction. The second way is that the input character is divided into 16 blocks. For each block, although 8 feature values are obtained through eight-direction chain code frequency extraction method, we define the sum of these 8 feature values as a feature for one block. Therefore, 16 features are extracted from that 16 blocks in the second way. We use the number of holes feature to cluster the similar characters. We can recognize the almost Myanmar common characters with various font sizes by using these features. All these 25 features are used in both training part and testing part. In the classification step, the characters are classified by matching the all features of input character with already trained features of characters. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=chain%20code%20frequency" title="chain code frequency">chain code frequency</a>, <a href="https://publications.waset.org/abstracts/search?q=character%20recognition" title=" character recognition"> character recognition</a>, <a href="https://publications.waset.org/abstracts/search?q=feature%20extraction" title=" feature extraction"> feature extraction</a>, <a href="https://publications.waset.org/abstracts/search?q=features%20matching" title=" features matching"> features matching</a>, <a href="https://publications.waset.org/abstracts/search?q=segmentation" title=" segmentation"> segmentation</a> </p> <a href="https://publications.waset.org/abstracts/77278/myanmar-character-recognition-using-eight-direction-chain-code-frequency-features" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/77278.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">320</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7819</span> Event Extraction, Analysis, and Event Linking</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Anam%20Alam">Anam Alam</a>, <a href="https://publications.waset.org/abstracts/search?q=Rahim%20Jamaluddin%20Kanji"> Rahim Jamaluddin Kanji</a> </p> <p class="card-text"><strong>Abstract:</strong></p> With the rapid growth of event in everywhere, event extraction has now become an important matter to retrieve the information from the unstructured data. One of the challenging problems is to extract the event from it. An event is an observable occurrence of interaction among entities. The paper investigates the effectiveness of event extraction capabilities of three software tools that are Wandora, Nitro and SPSS. We performed standard text mining techniques of these tools on the data sets of (i) Afghan War Diaries (AWD collection), (ii) MUC4 and (iii) WebKB. Information retrieval measures such as precision and recall which are computed under extensive set of experiments for Event Extraction. The experimental study analyzes the difference between events extracted by the software and human. This approach helps to construct an algorithm that will be applied for different machine learning methods. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=event%20extraction" title="event extraction">event extraction</a>, <a href="https://publications.waset.org/abstracts/search?q=Wandora" title=" Wandora"> Wandora</a>, <a href="https://publications.waset.org/abstracts/search?q=nitro" title=" nitro"> nitro</a>, <a href="https://publications.waset.org/abstracts/search?q=SPSS" title=" SPSS"> SPSS</a>, <a href="https://publications.waset.org/abstracts/search?q=event%20analysis" title=" event analysis"> event analysis</a>, <a href="https://publications.waset.org/abstracts/search?q=extraction%20method" title=" extraction method"> extraction method</a>, <a href="https://publications.waset.org/abstracts/search?q=AFG" title=" AFG"> AFG</a>, <a href="https://publications.waset.org/abstracts/search?q=Afghan%20War%20Diaries" title=" Afghan War Diaries"> Afghan War Diaries</a>, <a href="https://publications.waset.org/abstracts/search?q=MUC4" title=" MUC4"> MUC4</a>, <a href="https://publications.waset.org/abstracts/search?q=4%20universities" title=" 4 universities"> 4 universities</a>, <a href="https://publications.waset.org/abstracts/search?q=dataset" title=" dataset"> dataset</a>, <a href="https://publications.waset.org/abstracts/search?q=algorithm" title=" algorithm"> algorithm</a>, <a href="https://publications.waset.org/abstracts/search?q=precision" title=" precision"> precision</a>, <a href="https://publications.waset.org/abstracts/search?q=recall" title=" recall"> recall</a>, <a href="https://publications.waset.org/abstracts/search?q=evaluation" title=" evaluation"> evaluation</a> </p> <a href="https://publications.waset.org/abstracts/17112/event-extraction-analysis-and-event-linking" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/17112.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">596</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7818</span> Morphological Processing of Punjabi Text for Sentiment Analysis of Farmer Suicides</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Jaspreet%20Singh">Jaspreet Singh</a>, <a href="https://publications.waset.org/abstracts/search?q=Gurvinder%20Singh"> Gurvinder Singh</a>, <a href="https://publications.waset.org/abstracts/search?q=Prabhsimran%20Singh"> Prabhsimran Singh</a>, <a href="https://publications.waset.org/abstracts/search?q=Rajinder%20Singh"> Rajinder Singh</a>, <a href="https://publications.waset.org/abstracts/search?q=Prithvipal%20Singh"> Prithvipal Singh</a>, <a href="https://publications.waset.org/abstracts/search?q=Karanjeet%20Singh%20Kahlon"> Karanjeet Singh Kahlon</a>, <a href="https://publications.waset.org/abstracts/search?q=Ravinder%20Singh%20Sawhney"> Ravinder Singh Sawhney</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Morphological evaluation of Indian languages is one of the burgeoning fields in the area of Natural Language Processing (NLP). The evaluation of a language is an eminent task in the era of information retrieval and text mining. The extraction and classification of knowledge from text can be exploited for sentiment analysis and morphological evaluation. This study coalesce morphological evaluation and sentiment analysis for the task of classification of farmer suicide cases reported in Punjab state of India. The pre-processing of Punjabi text involves morphological evaluation and normalization of Punjabi word tokens followed by the training of proposed model using deep learning classification on Punjabi language text extracted from online Punjabi news reports. The class-wise accuracies of sentiment prediction for four negatively oriented classes of farmer suicide cases are 93.85%, 88.53%, 83.3%, and 95.45% respectively. The overall accuracy of sentiment classification obtained using proposed framework on 275 Punjabi text documents is found to be 90.29%. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=deep%20neural%20network" title="deep neural network">deep neural network</a>, <a href="https://publications.waset.org/abstracts/search?q=farmer%20suicides" title=" farmer suicides"> farmer suicides</a>, <a href="https://publications.waset.org/abstracts/search?q=morphological%20processing" title=" morphological processing"> morphological processing</a>, <a href="https://publications.waset.org/abstracts/search?q=punjabi%20text" title=" punjabi text"> punjabi text</a>, <a href="https://publications.waset.org/abstracts/search?q=sentiment%20analysis" title=" sentiment analysis"> sentiment analysis</a> </p> <a href="https://publications.waset.org/abstracts/88605/morphological-processing-of-punjabi-text-for-sentiment-analysis-of-farmer-suicides" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/88605.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">326</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7817</span> Visual Template Detection and Compositional Automatic Regular Expression Generation for Business Invoice Extraction</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Anthony%20Proschka">Anthony Proschka</a>, <a href="https://publications.waset.org/abstracts/search?q=Deepak%20Mishra"> Deepak Mishra</a>, <a href="https://publications.waset.org/abstracts/search?q=Merlyn%20Ramanan"> Merlyn Ramanan</a>, <a href="https://publications.waset.org/abstracts/search?q=Zurab%20Baratashvili"> Zurab Baratashvili</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Small and medium-sized businesses receive over 160 billion invoices every year. Since these documents exhibit many subtle differences in layout and text, extracting structured fields such as sender name, amount, and VAT rate from them automatically is an open research question. In this paper, existing work in template-based document extraction is extended, and a system is devised that is able to reliably extract all required fields for up to 70% of all documents in the data set, more than any other previously reported method. The approaches are described for 1) detecting through visual features which template a given document belongs to, 2) automatically generating extraction rules for a given new template by composing regular expressions from multiple components, and 3) computing confidence scores that indicate the accuracy of the automatic extractions. The system can generate templates with as little as one training sample and only requires the ground truth field values instead of detailed annotations such as bounding boxes that are hard to obtain. The system is deployed and used inside a commercial accounting software. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=data%20mining" title="data mining">data mining</a>, <a href="https://publications.waset.org/abstracts/search?q=information%20retrieval" title=" information retrieval"> information retrieval</a>, <a href="https://publications.waset.org/abstracts/search?q=business" title=" business"> business</a>, <a href="https://publications.waset.org/abstracts/search?q=feature%20extraction" title=" feature extraction"> feature extraction</a>, <a href="https://publications.waset.org/abstracts/search?q=layout" title=" layout"> layout</a>, <a href="https://publications.waset.org/abstracts/search?q=business%20data%20processing" title=" business data processing"> business data processing</a>, <a href="https://publications.waset.org/abstracts/search?q=document%20handling" title=" document handling"> document handling</a>, <a href="https://publications.waset.org/abstracts/search?q=end-user%20trained%20information%20extraction" title=" end-user trained information extraction"> end-user trained information extraction</a>, <a href="https://publications.waset.org/abstracts/search?q=document%20archiving" title=" document archiving"> document archiving</a>, <a href="https://publications.waset.org/abstracts/search?q=scanned%20business%20documents" title=" scanned business documents"> scanned business documents</a>, <a href="https://publications.waset.org/abstracts/search?q=automated%20document%20processing" title=" automated document processing"> automated document processing</a>, <a href="https://publications.waset.org/abstracts/search?q=F1-measure" title=" F1-measure"> F1-measure</a>, <a href="https://publications.waset.org/abstracts/search?q=commercial%20accounting%20software" title=" commercial accounting software"> commercial accounting software</a> </p> <a href="https://publications.waset.org/abstracts/128370/visual-template-detection-and-compositional-automatic-regular-expression-generation-for-business-invoice-extraction" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/128370.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">130</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7816</span> A Clustering Algorithm for Massive Texts</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Ming%20Liu">Ming Liu</a>, <a href="https://publications.waset.org/abstracts/search?q=Chong%20Wu"> Chong Wu</a>, <a href="https://publications.waset.org/abstracts/search?q=Bingquan%20Liu"> Bingquan Liu</a>, <a href="https://publications.waset.org/abstracts/search?q=Lei%20Chen"> Lei Chen</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Internet users have to face the massive amount of textual data every day. Organizing texts into categories can help users dig the useful information from large-scale text collection. Clustering, in fact, is one of the most promising tools for categorizing texts due to its unsupervised characteristic. Unfortunately, most of traditional clustering algorithms lose their high qualities on large-scale text collection. This situation mainly attributes to the high- dimensional vectors generated from texts. To effectively and efficiently cluster large-scale text collection, this paper proposes a vector reconstruction based clustering algorithm. Only the features that can represent the cluster are preserved in cluster’s representative vector. This algorithm alternately repeats two sub-processes until it converges. One process is partial tuning sub-process, where feature’s weight is fine-tuned by iterative process. To accelerate clustering velocity, an intersection based similarity measurement and its corresponding neuron adjustment function are proposed and implemented in this sub-process. The other process is overall tuning sub-process, where the features are reallocated among different clusters. In this sub-process, the features useless to represent the cluster are removed from cluster’s representative vector. Experimental results on the three text collections (including two small-scale and one large-scale text collections) demonstrate that our algorithm obtains high quality on both small-scale and large-scale text collections. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=vector%20reconstruction" title="vector reconstruction">vector reconstruction</a>, <a href="https://publications.waset.org/abstracts/search?q=large-scale%20text%20clustering" title=" large-scale text clustering"> large-scale text clustering</a>, <a href="https://publications.waset.org/abstracts/search?q=partial%20tuning%20sub-process" title=" partial tuning sub-process"> partial tuning sub-process</a>, <a href="https://publications.waset.org/abstracts/search?q=overall%20tuning%20sub-process" title=" overall tuning sub-process"> overall tuning sub-process</a> </p> <a href="https://publications.waset.org/abstracts/22681/a-clustering-algorithm-for-massive-texts" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/22681.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">435</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7815</span> Recognition of Cursive Arabic Handwritten Text Using Embedded Training Based on Hidden Markov Models (HMMs)</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Rabi%20Mouhcine">Rabi Mouhcine</a>, <a href="https://publications.waset.org/abstracts/search?q=Amrouch%20Mustapha"> Amrouch Mustapha</a>, <a href="https://publications.waset.org/abstracts/search?q=Mahani%20Zouhir"> Mahani Zouhir</a>, <a href="https://publications.waset.org/abstracts/search?q=Mammass%20Driss"> Mammass Driss</a> </p> <p class="card-text"><strong>Abstract:</strong></p> In this paper, we present a system for offline recognition cursive Arabic handwritten text based on Hidden Markov Models (HMMs). The system is analytical without explicit segmentation used embedded training to perform and enhance the character models. Extraction features preceded by baseline estimation are statistical and geometric to integrate both the peculiarities of the text and the pixel distribution characteristics in the word image. These features are modelled using hidden Markov models and trained by embedded training. The experiments on images of the benchmark IFN/ENIT database show that the proposed system improves recognition. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=recognition" title="recognition">recognition</a>, <a href="https://publications.waset.org/abstracts/search?q=handwriting" title=" handwriting"> handwriting</a>, <a href="https://publications.waset.org/abstracts/search?q=Arabic%20text" title=" Arabic text"> Arabic text</a>, <a href="https://publications.waset.org/abstracts/search?q=HMMs" title=" HMMs"> HMMs</a>, <a href="https://publications.waset.org/abstracts/search?q=embedded%20training" title=" embedded training"> embedded training</a> </p> <a href="https://publications.waset.org/abstracts/54405/recognition-of-cursive-arabic-handwritten-text-using-embedded-training-based-on-hidden-markov-models-hmms" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/54405.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">354</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7814</span> Mechanisms of Ginger Bioactive Compounds Extract Using Soxhlet and Accelerated Water Extraction</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=M.%20N.%20Azian">M. N. Azian</a>, <a href="https://publications.waset.org/abstracts/search?q=A.%20N.%20Ilia%20Anisa"> A. N. Ilia Anisa</a>, <a href="https://publications.waset.org/abstracts/search?q=Y.%20Iwai"> Y. Iwai</a> </p> <p class="card-text"><strong>Abstract:</strong></p> The mechanism for extraction bioactive compounds from plant matrix is essential for optimizing the extraction process. As a benchmark technique, a soxhlet extraction has been utilized for discussing the mechanism and compared with an accelerated water extraction. The trends of both techniques show that the process involves extraction and degradation. The highest yields of 6-, 8-, 10-gingerols and 6-shogaol in soxhlet extraction were 13.948, 7.12, 10.312 and 2.306 mg/g, respectively. The optimum 6-, 8-, 10-gingerols and 6-shogaol extracted by the accelerated water extraction at 140oC were 68.97±3.95 mg/g at 3min, 18.98±3.04 mg/g at 5min, 5.167±2.35 mg/g at 3min and 14.57±6.27 mg/g at 3min, respectively. The effect of temperature at 3mins shows that the concentration of 6-shogaol increased rapidly as decreasing the recovery of 6-gingerol. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=mechanism" title="mechanism">mechanism</a>, <a href="https://publications.waset.org/abstracts/search?q=ginger%20bioactive%20compounds" title=" ginger bioactive compounds"> ginger bioactive compounds</a>, <a href="https://publications.waset.org/abstracts/search?q=soxhlet%20extraction" title=" soxhlet extraction"> soxhlet extraction</a>, <a href="https://publications.waset.org/abstracts/search?q=accelerated%20water%20extraction" title=" accelerated water extraction"> accelerated water extraction</a> </p> <a href="https://publications.waset.org/abstracts/9278/mechanisms-of-ginger-bioactive-compounds-extract-using-soxhlet-and-accelerated-water-extraction" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/9278.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">434</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7813</span> Information Extraction for Short-Answer Question for the University of the Cordilleras</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Thelma%20Palaoag">Thelma Palaoag</a>, <a href="https://publications.waset.org/abstracts/search?q=Melanie%20Basa"> Melanie Basa</a>, <a href="https://publications.waset.org/abstracts/search?q=Jezreel%20Mark%20Panilo"> Jezreel Mark Panilo</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Checking short-answer questions and essays, whether it may be paper or electronic in form, is a tiring and tedious task for teachers. Evaluating a student’s output require wide array of domains. Scoring the work is often a critical task. Several attempts in the past few years to create an automated writing assessment software but only have received negative results from teachers and students alike due to unreliability in scoring, does not provide feedback and others. The study aims to create an application that will be able to check short-answer questions which incorporate information extraction. Information extraction is a subfield of Natural Language Processing (NLP) where a chunk of text (technically known as unstructured text) is being broken down to gather necessary bits of data and/or keywords (structured text) to be further analyzed or rather be utilized by query tools. The proposed system shall be able to extract keywords or phrases from the individual’s answers to match it into a corpora of words (as defined by the instructor), which shall be the basis of evaluation of the individual’s answer. The proposed system shall also enable the teacher to provide feedback and re-evaluate the output of the student for some writing elements in which the computer cannot fully evaluate such as creativity and logic. Teachers can formulate, design, and check short answer questions efficiently by defining keywords or phrases as parameters by assigning weights for checking answers. With the proposed system, teacher’s time in checking and evaluating students output shall be lessened, thus, making the teacher more productive and easier. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=information%20extraction" title="information extraction">information extraction</a>, <a href="https://publications.waset.org/abstracts/search?q=short-answer%20question" title=" short-answer question"> short-answer question</a>, <a href="https://publications.waset.org/abstracts/search?q=natural%20language%20processing" title=" natural language processing"> natural language processing</a>, <a href="https://publications.waset.org/abstracts/search?q=application" title=" application"> application</a> </p> <a href="https://publications.waset.org/abstracts/62355/information-extraction-for-short-answer-question-for-the-university-of-the-cordilleras" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/62355.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">428</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7812</span> OCR/ICR Text Recognition Using ABBYY FineReader as an Example Text</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=A.%20R.%20Bagirzade">A. R. Bagirzade</a>, <a href="https://publications.waset.org/abstracts/search?q=A.%20Sh.%20Najafova"> A. Sh. Najafova</a>, <a href="https://publications.waset.org/abstracts/search?q=S.%20M.%20Yessirkepova"> S. M. Yessirkepova</a>, <a href="https://publications.waset.org/abstracts/search?q=E.%20S.%20Albert"> E. S. Albert</a> </p> <p class="card-text"><strong>Abstract:</strong></p> This article describes a text recognition method based on Optical Character Recognition (OCR). The features of the OCR method were examined using the ABBYY FineReader program. It describes automatic text recognition in images. OCR is necessary because optical input devices can only transmit raster graphics as a result. Text recognition describes the task of recognizing letters shown as such, to identify and assign them an assigned numerical value in accordance with the usual text encoding (ASCII, Unicode). The peculiarity of this study conducted by the authors using the example of the ABBYY FineReader, was confirmed and shown in practice, the improvement of digital text recognition platforms developed by Electronic Publication. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=ABBYY%20FineReader%20system" title="ABBYY FineReader system">ABBYY FineReader system</a>, <a href="https://publications.waset.org/abstracts/search?q=algorithm%20symbol%20recognition" title=" algorithm symbol recognition"> algorithm symbol recognition</a>, <a href="https://publications.waset.org/abstracts/search?q=OCR%2FICR%20techniques" title=" OCR/ICR techniques"> OCR/ICR techniques</a>, <a href="https://publications.waset.org/abstracts/search?q=recognition%20technologies" title=" recognition technologies"> recognition technologies</a> </p> <a href="https://publications.waset.org/abstracts/130255/ocricr-text-recognition-using-abbyy-finereader-as-an-example-text" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/130255.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">168</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7811</span> An Event Relationship Extraction Method Incorporating Deep Feedback Recurrent Neural Network and Bidirectional Long Short-Term Memory</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Yin%20Yuanling">Yin Yuanling</a> </p> <p class="card-text"><strong>Abstract:</strong></p> A Deep Feedback Recurrent Neural Network (DFRNN) and Bidirectional Long Short-Term Memory (BiLSTM) are designed to address the problem of low accuracy of traditional relationship extraction models. This method combines a deep feedback-based recurrent neural network (DFRNN) with a bi-directional long short-term memory (BiLSTM) approach. The method combines DFRNN, which extracts local features of text based on deep feedback recurrent mechanism, BiLSTM, which better extracts global features of text, and Self-Attention, which extracts semantic information. Experiments show that the method achieves an F1 value of 76.69% on the CEC dataset, which is 0.0652 better than the BiLSTM+Self-ATT model, thus optimizing the performance of the deep learning method in the event relationship extraction task. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=event%20relations" title="event relations">event relations</a>, <a href="https://publications.waset.org/abstracts/search?q=deep%20learning" title=" deep learning"> deep learning</a>, <a href="https://publications.waset.org/abstracts/search?q=DFRNN%20models" title=" DFRNN models"> DFRNN models</a>, <a href="https://publications.waset.org/abstracts/search?q=bi-directional%20long%20and%20short-term%20memory%20networks" title=" bi-directional long and short-term memory networks"> bi-directional long and short-term memory networks</a> </p> <a href="https://publications.waset.org/abstracts/156673/an-event-relationship-extraction-method-incorporating-deep-feedback-recurrent-neural-network-and-bidirectional-long-short-term-memory" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/156673.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">144</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7810</span> Recognition of Grocery Products in Images Captured by Cellular Phones</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Farshideh%20Einsele">Farshideh Einsele</a>, <a href="https://publications.waset.org/abstracts/search?q=Hassan%20Foroosh"> Hassan Foroosh</a> </p> <p class="card-text"><strong>Abstract:</strong></p> In this paper, we present a robust algorithm to recognize extracted text from grocery product images captured by mobile phone cameras. Recognition of such text is challenging since text in grocery product images varies in its size, orientation, style, illumination, and can suffer from perspective distortion. Pre-processing is performed to make the characters scale and rotation invariant. Since text degradations can not be appropriately defined using wellknown geometric transformations such as translation, rotation, affine transformation and shearing, we use the whole character black pixels as our feature vector. Classification is performed with minimum distance classifier using the maximum likelihood criterion, which delivers very promising Character Recognition Rate (CRR) of 89%. We achieve considerably higher Word Recognition Rate (WRR) of 99% when using lower level linguistic knowledge about product words during the recognition process. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=camera-based%20OCR" title="camera-based OCR">camera-based OCR</a>, <a href="https://publications.waset.org/abstracts/search?q=feature%20extraction" title=" feature extraction"> feature extraction</a>, <a href="https://publications.waset.org/abstracts/search?q=document" title=" document"> document</a>, <a href="https://publications.waset.org/abstracts/search?q=image%20processing" title=" image processing"> image processing</a>, <a href="https://publications.waset.org/abstracts/search?q=grocery%20products" title=" grocery products"> grocery products</a> </p> <a href="https://publications.waset.org/abstracts/15852/recognition-of-grocery-products-in-images-captured-by-cellular-phones" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/15852.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">406</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7809</span> Programmed Speech to Text Summarization Using Graph-Based Algorithm</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Hamsini%20Pulugurtha">Hamsini Pulugurtha</a>, <a href="https://publications.waset.org/abstracts/search?q=P.%20V.%20S.%20L.%20Jagadamba"> P. V. S. L. Jagadamba</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Programmed Speech to Text and Text Summarization Using Graph-based Algorithms can be utilized in gatherings to get the short depiction of the gathering for future reference. This gives signature check utilizing Siamese neural organization to confirm the personality of the client and convert the client gave sound record which is in English into English text utilizing the discourse acknowledgment bundle given in python. At times just the outline of the gathering is required, the answer for this text rundown. Thus, the record is then summed up utilizing the regular language preparing approaches, for example, solo extractive text outline calculations <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=Siamese%20neural%20network" title="Siamese neural network">Siamese neural network</a>, <a href="https://publications.waset.org/abstracts/search?q=English%20speech" title=" English speech"> English speech</a>, <a href="https://publications.waset.org/abstracts/search?q=English%20text" title=" English text"> English text</a>, <a href="https://publications.waset.org/abstracts/search?q=natural%20language%20processing" title=" natural language processing"> natural language processing</a>, <a href="https://publications.waset.org/abstracts/search?q=unsupervised%20extractive%20text%20summarization" title=" unsupervised extractive text summarization"> unsupervised extractive text summarization</a> </p> <a href="https://publications.waset.org/abstracts/143079/programmed-speech-to-text-summarization-using-graph-based-algorithm" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/143079.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">218</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7808</span> A Relationship Extraction Method from Literary Fiction Considering Korean Linguistic Features</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Hee-Jeong%20Ahn">Hee-Jeong Ahn</a>, <a href="https://publications.waset.org/abstracts/search?q=Kee-Won%20Kim"> Kee-Won Kim</a>, <a href="https://publications.waset.org/abstracts/search?q=Seung-Hoon%20Kim"> Seung-Hoon Kim</a> </p> <p class="card-text"><strong>Abstract:</strong></p> The knowledge of the relationship between characters can help readers to understand the overall story or plot of the literary fiction. In this paper, we present a method for extracting the specific relationship between characters from a Korean literary fiction. Generally, methods for extracting relationships between characters in text are statistical or computational methods based on the sentence distance between characters without considering Korean linguistic features. Furthermore, it is difficult to extract the relationship with direction from text, such as one-sided love, because they consider only the weight of relationship, without considering the direction of the relationship. Therefore, in order to identify specific relationships between characters, we propose a statistical method considering linguistic features, such as syntactic patterns and speech verbs in Korean. The result of our method is represented by a weighted directed graph of the relationship between the characters. Furthermore, we expect that proposed method could be applied to the relationship analysis between characters of other content like movie or TV drama. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=data%20mining" title="data mining">data mining</a>, <a href="https://publications.waset.org/abstracts/search?q=Korean%20linguistic%20feature" title=" Korean linguistic feature"> Korean linguistic feature</a>, <a href="https://publications.waset.org/abstracts/search?q=literary%20fiction" title=" literary fiction"> literary fiction</a>, <a href="https://publications.waset.org/abstracts/search?q=relationship%20extraction" title=" relationship extraction"> relationship extraction</a> </p> <a href="https://publications.waset.org/abstracts/47126/a-relationship-extraction-method-from-literary-fiction-considering-korean-linguistic-features" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/47126.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">380</span> </span> </div> </div> <ul class="pagination"> <li class="page-item disabled"><span class="page-link">&lsaquo;</span></li> <li class="page-item active"><span class="page-link">1</span></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=small%20text%20extraction&amp;page=2">2</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=small%20text%20extraction&amp;page=3">3</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=small%20text%20extraction&amp;page=4">4</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=small%20text%20extraction&amp;page=5">5</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=small%20text%20extraction&amp;page=6">6</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=small%20text%20extraction&amp;page=7">7</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=small%20text%20extraction&amp;page=8">8</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=small%20text%20extraction&amp;page=9">9</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=small%20text%20extraction&amp;page=10">10</a></li> <li class="page-item disabled"><span class="page-link">...</span></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=small%20text%20extraction&amp;page=261">261</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=small%20text%20extraction&amp;page=262">262</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=small%20text%20extraction&amp;page=2" rel="next">&rsaquo;</a></li> </ul> </div> </main> <footer> <div id="infolinks" class="pt-3 pb-2"> <div class="container"> <div style="background-color:#f5f5f5;" class="p-3"> <div class="row"> <div class="col-md-2"> <ul class="list-unstyled"> About <li><a href="https://waset.org/page/support">About Us</a></li> <li><a href="https://waset.org/page/support#legal-information">Legal</a></li> <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/WASET-16th-foundational-anniversary.pdf">WASET celebrates its 16th foundational anniversary</a></li> </ul> </div> <div class="col-md-2"> <ul class="list-unstyled"> Account <li><a href="https://waset.org/profile">My Account</a></li> </ul> </div> <div class="col-md-2"> <ul class="list-unstyled"> Explore <li><a href="https://waset.org/disciplines">Disciplines</a></li> <li><a href="https://waset.org/conferences">Conferences</a></li> <li><a href="https://waset.org/conference-programs">Conference Program</a></li> <li><a href="https://waset.org/committees">Committees</a></li> <li><a href="https://publications.waset.org">Publications</a></li> </ul> </div> <div class="col-md-2"> <ul class="list-unstyled"> Research <li><a href="https://publications.waset.org/abstracts">Abstracts</a></li> <li><a href="https://publications.waset.org">Periodicals</a></li> <li><a href="https://publications.waset.org/archive">Archive</a></li> </ul> </div> <div class="col-md-2"> <ul class="list-unstyled"> Open Science <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/Open-Science-Philosophy.pdf">Open Science Philosophy</a></li> <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/Open-Science-Award.pdf">Open Science Award</a></li> <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/Open-Society-Open-Science-and-Open-Innovation.pdf">Open Innovation</a></li> <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/Postdoctoral-Fellowship-Award.pdf">Postdoctoral Fellowship Award</a></li> <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/Scholarly-Research-Review.pdf">Scholarly Research Review</a></li> </ul> </div> <div class="col-md-2"> <ul class="list-unstyled"> Support <li><a href="https://waset.org/page/support">Support</a></li> <li><a href="https://waset.org/profile/messages/create">Contact Us</a></li> <li><a href="https://waset.org/profile/messages/create">Report Abuse</a></li> </ul> </div> </div> </div> </div> </div> <div class="container text-center"> <hr style="margin-top:0;margin-bottom:.3rem;"> <a href="https://creativecommons.org/licenses/by/4.0/" target="_blank" class="text-muted small">Creative Commons Attribution 4.0 International License</a> <div id="copy" class="mt-2">&copy; 2024 World Academy of Science, Engineering and Technology</div> </div> </footer> <a href="javascript:" id="return-to-top"><i class="fas fa-arrow-up"></i></a> <div class="modal" id="modal-template"> <div class="modal-dialog"> <div class="modal-content"> <div class="row m-0 mt-1"> <div class="col-md-12"> <button type="button" class="close" data-dismiss="modal" aria-label="Close"><span aria-hidden="true">&times;</span></button> </div> </div> <div class="modal-body"></div> </div> </div> </div> <script src="https://cdn.waset.org/static/plugins/jquery-3.3.1.min.js"></script> <script src="https://cdn.waset.org/static/plugins/bootstrap-4.2.1/js/bootstrap.bundle.min.js"></script> <script src="https://cdn.waset.org/static/js/site.js?v=150220211556"></script> <script> jQuery(document).ready(function() { /*jQuery.get("https://publications.waset.org/xhr/user-menu", function (response) { jQuery('#mainNavMenu').append(response); });*/ jQuery.get({ url: "https://publications.waset.org/xhr/user-menu", cache: false }).then(function(response){ jQuery('#mainNavMenu').append(response); }); }); </script> </body> </html>

Pages: 1 2 3 4 5 6 7 8 9 10