<!DOCTYPE html> <html lang="en" dir="ltr"> <head> <!-- Google tag (gtag.js) --> <script async src="https://www.googletagmanager.com/gtag/js?id=G-P63WKM1TM1"></script> <script> window.dataLayer = window.dataLayer || []; function gtag(){dataLayer.push(arguments);} gtag('js', new Date()); gtag('config', 'G-P63WKM1TM1'); </script> <!-- Yandex.Metrika counter --> <script type="text/javascript" > (function(m,e,t,r,i,k,a){m[i]=m[i]||function(){(m[i].a=m[i].a||[]).push(arguments)}; m[i].l=1*new Date(); for (var j = 0; j < document.scripts.length; j++) {if (document.scripts[j].src === r) { return; }} k=e.createElement(t),a=e.getElementsByTagName(t)[0],k.async=1,k.src=r,a.parentNode.insertBefore(k,a)}) (window, document, "script", "https://mc.yandex.ru/metrika/tag.js", "ym"); ym(55165297, "init", { clickmap:false, trackLinks:true, accurateTrackBounce:true, webvisor:false }); </script> <noscript><div><img src="https://mc.yandex.ru/watch/55165297" style="position:absolute; left:-9999px;" alt="" /></div></noscript> <!-- /Yandex.Metrika counter --> <!-- Matomo --> <!-- End Matomo Code --> <title>Search results for: text classification</title> <meta name="description" content="Search results for: text classification"> <meta name="keywords" content="text classification"> <meta name="viewport" content="width=device-width, initial-scale=1, minimum-scale=1, maximum-scale=1, user-scalable=no"> <meta charset="utf-8"> <link href="https://cdn.waset.org/favicon.ico" type="image/x-icon" rel="shortcut icon"> <link href="https://cdn.waset.org/static/plugins/bootstrap-4.2.1/css/bootstrap.min.css" rel="stylesheet"> <link href="https://cdn.waset.org/static/plugins/fontawesome/css/all.min.css" rel="stylesheet"> <link href="https://cdn.waset.org/static/css/site.css?v=150220211555" rel="stylesheet"> </head> <body> <header> <div class="container"> <nav class="navbar navbar-expand-lg navbar-light"> <a class="navbar-brand" href="https://waset.org"> <img 
src="https://cdn.waset.org/static/images/wasetc.png" alt="Open Science Research Excellence" title="Open Science Research Excellence" /> </a> <button class="d-block d-lg-none navbar-toggler ml-auto" type="button" data-toggle="collapse" data-target="#navbarMenu" aria-controls="navbarMenu" aria-expanded="false" aria-label="Toggle navigation"> <span class="navbar-toggler-icon"></span> </button> <div class="w-100"> <div class="d-none d-lg-flex flex-row-reverse"> <form method="get" action="https://waset.org/search" class="form-inline my-2 my-lg-0"> <input class="form-control mr-sm-2" type="search" placeholder="Search Conferences" value="text classification" name="q" aria-label="Search"> <button class="btn btn-light my-2 my-sm-0" type="submit"><i class="fas fa-search"></i></button> </form> </div> <div class="collapse navbar-collapse mt-1" id="navbarMenu"> <ul class="navbar-nav ml-auto align-items-center" id="mainNavMenu"> <li class="nav-item"> <a class="nav-link" href="https://waset.org/conferences" title="Conferences in 2024/2025/2026">Conferences</a> </li> <li class="nav-item"> <a class="nav-link" href="https://waset.org/disciplines" title="Disciplines">Disciplines</a> </li> <li class="nav-item"> <a class="nav-link" href="https://waset.org/committees" rel="nofollow">Committees</a> </li> <li class="nav-item dropdown"> <a class="nav-link dropdown-toggle" href="#" id="navbarDropdownPublications" role="button" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false"> Publications </a> <div class="dropdown-menu" aria-labelledby="navbarDropdownPublications"> <a class="dropdown-item" href="https://publications.waset.org/abstracts">Abstracts</a> <a class="dropdown-item" href="https://publications.waset.org">Periodicals</a> <a class="dropdown-item" href="https://publications.waset.org/archive">Archive</a> </div> </li> <li class="nav-item"> <a class="nav-link" href="https://waset.org/page/support" title="Support">Support</a> </li> </ul> </div> </div> </nav> </div> 
</header> <main> <div class="container mt-4"> <div class="row"> <div class="col-md-9 mx-auto"> <form method="get" action="https://publications.waset.org/abstracts/search"> <div id="custom-search-input"> <div class="input-group"> <i class="fas fa-search"></i> <input type="text" class="search-query" name="q" placeholder="Author, Title, Abstract, Keywords" value="text classification"> <input type="submit" class="btn_search" value="Search"> </div> </div> </form> </div> </div> <div class="row mt-3"> <div class="col-sm-3"> <div class="card"> <div class="card-body"><strong>Commenced</strong> in January 2007</div> </div> </div> <div class="col-sm-3"> <div class="card"> <div class="card-body"><strong>Frequency:</strong> Monthly</div> </div> </div> <div class="col-sm-3"> <div class="card"> <div class="card-body"><strong>Edition:</strong> International</div> </div> </div> <div class="col-sm-3"> <div class="card"> <div class="card-body"><strong>Paper Count:</strong> 3352</div> </div> </div> </div> <h1 class="mt-3 mb-3 text-center" style="font-size:1.6rem;">Search results for: text classification</h1> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">3352</span> A Summary-Based Text Classification Model for Graph Attention Networks</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Shuo%20Liu">Shuo Liu</a> </p> <p class="card-text"><strong>Abstract:</strong></p> In Chinese text classification tasks, redundant words and phrases can interfere with the extraction and analysis of text information, reducing the accuracy of the classification model. The aim of this work is to reduce irrelevant elements, extract and use text content more efficiently, and improve the accuracy of text classification models. 
In this paper, a summary of each text in the corpus is first extracted using the TextRank algorithm, the words in the summary are used as nodes to construct a text graph, and a graph attention network (GAT) is then used to classify the text. In tests on a Chinese dataset collected from the Internet, classification accuracy improved over the direct method of generating graph structures from the full text. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=Chinese%20natural%20language%20processing" title="Chinese natural language processing">Chinese natural language processing</a>, <a href="https://publications.waset.org/abstracts/search?q=text%20classification" title=" text classification"> text classification</a>, <a href="https://publications.waset.org/abstracts/search?q=abstract%20extraction" title=" abstract extraction"> abstract extraction</a>, <a href="https://publications.waset.org/abstracts/search?q=graph%20attention%20network" title=" graph attention network"> graph attention network</a> </p> <a href="https://publications.waset.org/abstracts/158060/a-summary-based-text-classification-model-for-graph-attention-networks" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/158060.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">100</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">3351</span> Arabic Text Representation and Classification Methods: Current State of the Art</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Rami%20Ayadi">Rami Ayadi</a>, <a href="https://publications.waset.org/abstracts/search?q=Mohsen%20Maraoui"> Mohsen Maraoui</a>, <a 
href="https://publications.waset.org/abstracts/search?q=Mounir%20Zrigui"> Mounir Zrigui</a> </p> <p class="card-text"><strong>Abstract:</strong></p> In this paper, we present a brief overview of the current state of the art in Arabic text representation and classification methods. We divide work on Arabic text classification (TC) into four categories. First, we describe some algorithms applied to the classification of Arabic text. Second, we cite the major works that compare classification algorithms applied to Arabic text. Third, we mention some authors who propose new classification methods. Finally, we investigate the impact of preprocessing on Arabic TC. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=text%20classification" title="text classification">text classification</a>, <a href="https://publications.waset.org/abstracts/search?q=Arabic" title=" Arabic"> Arabic</a>, <a href="https://publications.waset.org/abstracts/search?q=impact%20of%20preprocessing" title=" impact of preprocessing"> impact of preprocessing</a>, <a href="https://publications.waset.org/abstracts/search?q=classification%20algorithms" title=" classification algorithms"> classification algorithms</a> </p> <a href="https://publications.waset.org/abstracts/10277/arabic-text-representation-and-classification-methods-current-state-of-the-art" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/10277.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">469</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">3350</span> Arabic Text Classification: Review Study</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a 
href="https://publications.waset.org/abstracts/search?q=M.%20Hijazi">M. Hijazi</a>, <a href="https://publications.waset.org/abstracts/search?q=A.%20Zeki"> A. Zeki</a>, <a href="https://publications.waset.org/abstracts/search?q=A.%20Ismail"> A. Ismail</a> </p> <p class="card-text"><strong>Abstract:</strong></p> An enormous amount of valuable human knowledge is preserved in documents. The rapid growth in the number of machine-readable documents for public or private access requires the use of automatic text classification. Text classification can be defined as assigning or structuring documents into a defined set of classes known in advance. Arabic text classification methods have emerged as a natural result of the existence of a massive amount of varied textual information written in the Arabic language on the web. This paper reviews published research on Arabic text classification using the classical bag-of-words (BoW) data representation and using conceptual data representations based on semantic resources such as Arabic WordNet and Wikipedia. 
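As a rough illustration of the classical bag-of-words (BoW) representation mentioned in this abstract, the sketch below maps each document to a vector of term counts over a shared vocabulary. The whitespace tokenizer, the function names and the English sample sentences are assumptions made for illustration and are not taken from the reviewed work; real Arabic text would normally need dedicated tokenization and normalization first.

```python
from collections import Counter


def build_vocabulary(documents):
    """Collect the sorted set of distinct lowercase tokens across all documents."""
    return sorted({token for doc in documents for token in doc.lower().split()})


def bag_of_words(document, vocabulary):
    """Return one term count per vocabulary entry; word order is discarded."""
    counts = Counter(document.lower().split())
    return [counts[term] for term in vocabulary]


# Hypothetical two-document corpus.
docs = ["the match ended in a draw", "the market ended higher"]
vocab = build_vocabulary(docs)
vectors = [bag_of_words(doc, vocab) for doc in docs]
```

Each vector has one slot per vocabulary term, so documents that share many terms end up with similar vectors, which is what a downstream classifier works on.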
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=Arabic%20text%20classification" title="Arabic text classification">Arabic text classification</a>, <a href="https://publications.waset.org/abstracts/search?q=Arabic%20WordNet" title=" Arabic WordNet"> Arabic WordNet</a>, <a href="https://publications.waset.org/abstracts/search?q=bag%20of%20words" title=" bag of words"> bag of words</a>, <a href="https://publications.waset.org/abstracts/search?q=conceptual%20representation" title=" conceptual representation"> conceptual representation</a>, <a href="https://publications.waset.org/abstracts/search?q=semantic%20relations" title=" semantic relations"> semantic relations</a> </p> <a href="https://publications.waset.org/abstracts/42905/arabic-text-classification-review-study" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/42905.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">426</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">3349</span> Optimal Classifying and Extracting Fuzzy Relationship from Query Using Text Mining Techniques</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Faisal%20Alshuwaier">Faisal Alshuwaier</a>, <a href="https://publications.waset.org/abstracts/search?q=Ali%20Areshey"> Ali Areshey</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Text mining techniques are generally applied for classifying text and finding fuzzy relations and structures in data sets. This research provides a range of text mining capabilities. 
One common application is text classification and event extraction, which involves deducing specific knowledge concerning incidents referred to in texts. The main contribution of this paper is the clarification of a concept graph generation mechanism, which is based on text classification and optimal fuzzy relationship extraction. Furthermore, the work presented in this paper explains the application of fuzzy relationship extraction and the branch-and-bound method to simplify the texts. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=extraction" title="extraction">extraction</a>, <a href="https://publications.waset.org/abstracts/search?q=max-prod" title=" max-prod"> max-prod</a>, <a href="https://publications.waset.org/abstracts/search?q=fuzzy%20relations" title=" fuzzy relations"> fuzzy relations</a>, <a href="https://publications.waset.org/abstracts/search?q=text%20mining" title=" text mining"> text mining</a>, <a href="https://publications.waset.org/abstracts/search?q=memberships" title=" memberships"> memberships</a>, <a href="https://publications.waset.org/abstracts/search?q=classification" title=" classification"> classification</a> </p> <a href="https://publications.waset.org/abstracts/23970/optimal-classifying-and-extracting-fuzzy-relationship-from-query-using-text-mining-techniques" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/23970.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">582</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span 
class="badge badge-info">3348</span> A Quantitative Evaluation of Text Feature Selection Methods</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=B.%20S.%20Harish">B. S. Harish</a>, <a href="https://publications.waset.org/abstracts/search?q=M.%20B.%20Revanasiddappa"> M. B. Revanasiddappa</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Due to the rapid growth of text documents in digital form, automated text classification has become an important research area over the last two decades. The major challenges of text document representation are high dimensionality, sparsity, volume and semantics. Since terms are the only features that can be found in documents, the selection of good terms (features) plays a very important role. In text classification, feature selection is a strategy that can be used to improve classification effectiveness, computational efficiency and accuracy. In this paper, we present a quantitative analysis of the most widely used feature selection (FS) methods, viz. Term Frequency-Inverse Document Frequency (tf-idf), Mutual Information (MI), Information Gain (IG), Chi-Square (χ²), Term Frequency-Relevance Frequency (tf-rf), Term Strength (TS), Ambiguity Measure (AM) and Symbolic Feature Selection (SFS), to classify text documents. We evaluated all the feature selection methods on standard datasets such as 20 Newsgroups, the 4 University dataset and Reuters-21578. 
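To make one of the feature selection scores listed above concrete, the sketch below computes the chi-square statistic of a term with respect to a class from a 2x2 contingency table of document counts. It uses the standard textbook formula; the function name, the set-of-tokens document format and the toy corpus are assumptions for illustration and do not reproduce the experimental setup of the paper.

```python
def chi_square(docs, labels, term, target):
    """Chi-square score of `term` for class `target`.

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class that lack the term
    d: documents outside the class that lack the term
    """
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        in_class = label == target
        if has_term and in_class:
            a += 1
        elif has_term:
            b += 1
        elif in_class:
            c += 1
        else:
            d += 1
    n = a + b + c + d
    denominator = (a + c) * (b + d) * (a + b) * (c + d)
    if denominator == 0:
        return 0.0  # term or class is constant, so it carries no signal
    return n * (a * d - c * b) ** 2 / denominator


# Hypothetical toy corpus: each document is a set of tokens.
docs = [{"goal", "match"}, {"goal", "team"}, {"stock", "market"}, {"market", "goal"}]
labels = ["sport", "sport", "finance", "finance"]
score_goal = chi_square(docs, labels, "goal", "sport")
score_market = chi_square(docs, labels, "market", "sport")
```

In this toy corpus "market" separates the two classes perfectly and therefore scores higher than "goal", which appears in both classes; a selector would keep the highest-scoring terms.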
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=classifiers" title="classifiers">classifiers</a>, <a href="https://publications.waset.org/abstracts/search?q=feature%20selection" title=" feature selection"> feature selection</a>, <a href="https://publications.waset.org/abstracts/search?q=text%20classification" title=" text classification "> text classification </a> </p> <a href="https://publications.waset.org/abstracts/28926/a-quantitative-evaluation-of-text-feature-selection-methods" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/28926.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">458</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">3347</span> Experimental Study of Hyperparameter Tuning a Deep Learning Convolutional Recurrent Network for Text Classification</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Bharatendra%20Rai">Bharatendra Rai</a> </p> <p class="card-text"><strong>Abstract:</strong></p> The sequence of words in text data has long-term dependencies and is known to suffer from vanishing gradient problems when developing deep learning models. Although recurrent networks such as long short-term memory networks help to overcome this problem, achieving high text classification performance is a challenging problem. Convolutional recurrent networks that combine the advantages of long short-term memory networks and convolutional neural networks can be useful for text classification performance improvements. 
However, arriving at suitable hyperparameter values for convolutional recurrent networks is still a challenging task where fitting a model requires significant computing resources. This paper illustrates the advantages of using convolutional recurrent networks for text classification with the help of statistically planned computer experiments for hyperparameter tuning. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=long%20short-term%20memory%20networks" title="long short-term memory networks">long short-term memory networks</a>, <a href="https://publications.waset.org/abstracts/search?q=convolutional%20recurrent%20networks" title=" convolutional recurrent networks"> convolutional recurrent networks</a>, <a href="https://publications.waset.org/abstracts/search?q=text%20classification" title=" text classification"> text classification</a>, <a href="https://publications.waset.org/abstracts/search?q=hyperparameter%20tuning" title=" hyperparameter tuning"> hyperparameter tuning</a>, <a href="https://publications.waset.org/abstracts/search?q=Tukey%20honest%20significant%20differences" title=" Tukey honest significant differences"> Tukey honest significant differences</a> </p> <a href="https://publications.waset.org/abstracts/169795/experimental-study-of-hyperparameter-tuning-a-deep-learning-convolutional-recurrent-network-for-text-classification" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/169795.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">129</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">3346</span> Morphological Processing of Punjabi Text for Sentiment Analysis of Farmer Suicides</h5> <div class="card-body"> <p 
class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Jaspreet%20Singh">Jaspreet Singh</a>, <a href="https://publications.waset.org/abstracts/search?q=Gurvinder%20Singh"> Gurvinder Singh</a>, <a href="https://publications.waset.org/abstracts/search?q=Prabhsimran%20Singh"> Prabhsimran Singh</a>, <a href="https://publications.waset.org/abstracts/search?q=Rajinder%20Singh"> Rajinder Singh</a>, <a href="https://publications.waset.org/abstracts/search?q=Prithvipal%20Singh"> Prithvipal Singh</a>, <a href="https://publications.waset.org/abstracts/search?q=Karanjeet%20Singh%20Kahlon"> Karanjeet Singh Kahlon</a>, <a href="https://publications.waset.org/abstracts/search?q=Ravinder%20Singh%20Sawhney"> Ravinder Singh Sawhney</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Morphological evaluation of Indian languages is one of the burgeoning fields in the area of Natural Language Processing (NLP). The evaluation of a language is an important task in the era of information retrieval and text mining. The extraction and classification of knowledge from text can be exploited for sentiment analysis and morphological evaluation. This study combines morphological evaluation and sentiment analysis for the task of classifying farmer suicide cases reported in the Punjab state of India. The pre-processing of Punjabi text involves morphological evaluation and normalization of Punjabi word tokens, followed by training of the proposed model using deep learning classification on Punjabi language text extracted from online Punjabi news reports. The class-wise accuracies of sentiment prediction for four negatively oriented classes of farmer suicide cases are 93.85%, 88.53%, 83.3%, and 95.45%, respectively. The overall accuracy of sentiment classification obtained using the proposed framework on 275 Punjabi text documents is 90.29%. 
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=deep%20neural%20network" title="deep neural network">deep neural network</a>, <a href="https://publications.waset.org/abstracts/search?q=farmer%20suicides" title=" farmer suicides"> farmer suicides</a>, <a href="https://publications.waset.org/abstracts/search?q=morphological%20processing" title=" morphological processing"> morphological processing</a>, <a href="https://publications.waset.org/abstracts/search?q=punjabi%20text" title=" punjabi text"> punjabi text</a>, <a href="https://publications.waset.org/abstracts/search?q=sentiment%20analysis" title=" sentiment analysis"> sentiment analysis</a> </p> <a href="https://publications.waset.org/abstracts/88605/morphological-processing-of-punjabi-text-for-sentiment-analysis-of-farmer-suicides" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/88605.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">326</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">3345</span> Multi-Class Text Classification Using Ensembles of Classifiers </h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Syed%20Basit%20Ali%20Shah%20Bukhari">Syed Basit Ali Shah Bukhari</a>, <a href="https://publications.waset.org/abstracts/search?q=Yan%20%20Qiang"> Yan Qiang</a>, <a href="https://publications.waset.org/abstracts/search?q=Saad%20Abdul%20Rauf"> Saad Abdul Rauf</a>, <a href="https://publications.waset.org/abstracts/search?q=Syed%20Saqlaina%20Bukhari"> Syed Saqlaina Bukhari</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Text Classification is the methodology to classify any given text into the 
respective category from a given set of categories. It is vital to use a proper set of pre-processing, feature selection and classification techniques to achieve this purpose. In this paper, we use different ensemble techniques, along with varying feature selection parameters, to observe the change in the overall accuracy of the results and in some class-level measures, including the precision for each individual category of text. After subjecting our data to pre-processing and feature selection techniques, different individual classifiers were tested first, and the classifiers were then combined to form ensembles to increase their accuracy. Later, we also studied the impact of decreasing the number of classification categories on overall accuracy. Text classification is widely used in sentiment analysis on social media sites such as Twitter to understand people’s opinions about a cause, and it is also used to analyze customers’ reviews of products or services. Opinion mining is a vital task in data mining, and text categorization is the backbone of opinion mining. 
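The idea of combining individual classifiers into an ensemble can be sketched with simple majority voting, as below. This is a minimal illustration only: the three keyword-rule classifiers are hypothetical stand-ins for trained models, and majority voting is just one combination scheme, swapped in here for the bagging and AdaBoost ensembles named in the keywords.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(predictions).most_common(1)[0][0]


def ensemble_predict(classifiers, text):
    """Ask every base classifier for a label and combine them by majority vote."""
    return majority_vote([classify(text) for classify in classifiers])


# Hypothetical base classifiers: crude substring rules standing in for trained models.
def rule_good(text):
    return "positive" if "good" in text else "negative"


def rule_good_or_great(text):
    return "positive" if "good" in text or "great" in text else "negative"


def rule_negation(text):
    return "negative" if "not" in text else "positive"


classifiers = [rule_good, rule_good_or_great, rule_negation]
label_mixed = ensemble_predict(classifiers, "good but not great")
label_bad = ensemble_predict(classifiers, "bad service")
```

A single weak rule can be wrong on a given input, but the vote follows whatever at least two of the three rules agree on, which is the intuition behind testing classifiers individually and then combining them.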
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=Natural%20Language%20Processing" title="Natural Language Processing">Natural Language Processing</a>, <a href="https://publications.waset.org/abstracts/search?q=Ensemble%20Classifier" title=" Ensemble Classifier"> Ensemble Classifier</a>, <a href="https://publications.waset.org/abstracts/search?q=Bagging%20Classifier" title=" Bagging Classifier"> Bagging Classifier</a>, <a href="https://publications.waset.org/abstracts/search?q=AdaBoost" title=" AdaBoost"> AdaBoost</a> </p> <a href="https://publications.waset.org/abstracts/123394/multi-class-text-classification-using-ensembles-of-classifiers" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/123394.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">232</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">3344</span> Deep Learning Based-Object-classes Semantic Classification of Arabic Texts</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Imen%20Elleuch">Imen Elleuch</a>, <a href="https://publications.waset.org/abstracts/search?q=Wael%20Ouarda"> Wael Ouarda</a>, <a href="https://publications.waset.org/abstracts/search?q=Gargouri%20Bilel"> Gargouri Bilel</a> </p> <p class="card-text"><strong>Abstract:</strong></p> In this paper, we propose a deep learning-based approach to classify text in order to enrich an Arabic ontology based on the object classes of Gaston Gross. These object classes are defined by taking into account the syntactic and semantic features of the language being treated. Thus, our proposed approach is a hybrid one. 
On the one hand, it relies on the object classes, which represent a knowledge-based approach to text classification; on the other hand, it uses a deep learning approach based on word embeddings to classify text. We applied our proposed approach to a corpus constructed from an Arabic dictionary. The resulting semantic classification of text will enrich the Arabic object-classes ontology: new classes can be added to the ontology, or the features that characterize each object class can be expanded. The results are compared with a similar work that treats the same subject with a classical linguistic approach to the semantic classification of text. This comparison highlights our hybrid approach, which can be improved by broadening the dataset used in the deep learning process. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=deep-learning%20approach" title="deep-learning approach">deep-learning approach</a>, <a href="https://publications.waset.org/abstracts/search?q=object-classes" title=" object-classes"> object-classes</a>, <a href="https://publications.waset.org/abstracts/search?q=semantic%20classification" title=" semantic classification"> semantic classification</a>, <a href="https://publications.waset.org/abstracts/search?q=Arabic" title=" Arabic"> Arabic</a> </p> <a href="https://publications.waset.org/abstracts/176532/deep-learning-based-object-classes-semantic-classification-of-arabic-texts" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/176532.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">88</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">3343</span> A Text 
Classification Approach Based on Natural Language Processing and Machine Learning Techniques</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Rim%20Messaoudi">Rim Messaoudi</a>, <a href="https://publications.waset.org/abstracts/search?q=Nogaye-Gueye%20Gning"> Nogaye-Gueye Gning</a>, <a href="https://publications.waset.org/abstracts/search?q=Fran%C3%A7ois%20Azelart"> François Azelart</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Automatic text classification mostly applies natural language processing (NLP) and other AI-guided techniques to classify text faster and more accurately. This paper discusses the use of predictive maintenance to manage incident tickets inside the company. It focuses on proposing a tool that processes and analyses the comments and notes written by administrators after resolving an incident ticket. The goal is to increase the quality of these comments. The tool uses NLP and machine learning techniques to perform textual analytics on the extracted data. The approach was tested on real data from the French National Railways (SNCF) company and gave high-quality results. 
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=machine%20learning" title="machine learning">machine learning</a>, <a href="https://publications.waset.org/abstracts/search?q=text%20classification" title=" text classification"> text classification</a>, <a href="https://publications.waset.org/abstracts/search?q=NLP%20techniques" title=" NLP techniques"> NLP techniques</a>, <a href="https://publications.waset.org/abstracts/search?q=semantic%20representation" title=" semantic representation"> semantic representation</a> </p> <a href="https://publications.waset.org/abstracts/170820/a-text-classification-approach-based-on-natural-language-processing-and-machine-learning-techniques" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/170820.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">100</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">3342</span> Incorporating Information Gain in Regular Expressions Based Classifiers</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Rosa%20L.%20Figueroa">Rosa L. Figueroa</a>, <a href="https://publications.waset.org/abstracts/search?q=Christopher%20A.%20Flores"> Christopher A. Flores</a>, <a href="https://publications.waset.org/abstracts/search?q=Qing%20Zeng-Treitler"> Qing Zeng-Treitler</a> </p> <p class="card-text"><strong>Abstract:</strong></p> A regular expression is a sequence of characters that describes a text pattern. In clinical research, regular expressions are usually created manually by programmers together with domain experts. 
Lately, there have been several efforts to investigate how to generate them automatically. This article presents a text classification algorithm based on regexes. The algorithm, named REX, was designed and implemented as a simplified method for creating regexes that classify Spanish text automatically. To classify ambiguous cases, such as when multiple labels are assigned to a testing example, REX includes an information gain method. Two sets of data were used to evaluate the algorithm’s effectiveness in clinical text classification tasks. The results indicate that the regular expression based classifier proposed in this work performs statistically better in accuracy and F-measure than Support Vector Machine and Naïve Bayes on both datasets. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=information%20gain" title="information gain">information gain</a>, <a href="https://publications.waset.org/abstracts/search?q=regular%20expressions" title=" regular expressions"> regular expressions</a>, <a href="https://publications.waset.org/abstracts/search?q=smith-waterman%20algorithm" title=" smith-waterman algorithm"> smith-waterman algorithm</a>, <a href="https://publications.waset.org/abstracts/search?q=text%20classification" title=" text classification"> text classification</a> </p> <a href="https://publications.waset.org/abstracts/71695/incorporating-information-gain-in-regular-expressions-based-classifiers" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/71695.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">320</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">3341</span> A Similarity Measure for Classification and Clustering in Image 
Based Medical and Text Based Banking Applications</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=K.%20P.%20Sandesh">K. P. Sandesh</a>, <a href="https://publications.waset.org/abstracts/search?q=M.%20H.%20Suman"> M. H. Suman</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Text processing plays an important role in information retrieval, data mining, and web search. Measuring the similarity between documents is an important operation in the text processing field. In this project, a new similarity measure is proposed. To compute the similarity between two documents with respect to a feature, the proposed measure takes the following three cases into account: (1) the feature appears in both documents; (2) the feature appears in only one document; and (3) the feature appears in neither document. The proposed measure is extended to gauge the similarity between two sets of documents. The effectiveness of the measure is evaluated on several real-world data sets for text classification and clustering problems, especially in the banking and health sectors. The results show that the proposed measure performs better than the other measures. 
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=document%20classification" title="document classification">document classification</a>, <a href="https://publications.waset.org/abstracts/search?q=document%20clustering" title=" document clustering"> document clustering</a>, <a href="https://publications.waset.org/abstracts/search?q=entropy" title=" entropy"> entropy</a>, <a href="https://publications.waset.org/abstracts/search?q=accuracy" title=" accuracy"> accuracy</a>, <a href="https://publications.waset.org/abstracts/search?q=classifiers" title=" classifiers"> classifiers</a>, <a href="https://publications.waset.org/abstracts/search?q=clustering%20algorithms" title=" clustering algorithms"> clustering algorithms</a> </p> <a href="https://publications.waset.org/abstracts/22708/a-similarity-measure-for-classification-and-clustering-in-image-based-medical-and-text-based-banking-applications" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/22708.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">518</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">3340</span> On-Road Text Detection Platform for Driver Assistance Systems</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Guezouli%20Larbi">Guezouli Larbi</a>, <a href="https://publications.waset.org/abstracts/search?q=Belkacem%20Soundes"> Belkacem Soundes</a> </p> <p class="card-text"><strong>Abstract:</strong></p> The automation of the text detection process can help the human in his driving task. 
Its application can give drivers more information about their environment by facilitating the reading of road signs such as directional signs, events, stores, etc. This paper proposes a two-stage system. In the first stage, we used pseudo-Zernike moments to pinpoint areas of the image that may contain text. The architecture of this part is based on three main steps: region of interest (ROI) detection, text localization, and non-text region filtering. In the second stage, we present a convolutional neural network architecture (On-Road Text Detection Network - ORTDN), which serves as the classification phase. The results show that the proposed framework achieved ≈ 35 fps and an mAP of ≈ 90%, thus a low computational time with competitive accuracy. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=text%20detection" title="text detection">text detection</a>, <a href="https://publications.waset.org/abstracts/search?q=CNN" title=" CNN"> CNN</a>, <a href="https://publications.waset.org/abstracts/search?q=PZM" title=" PZM"> PZM</a>, <a href="https://publications.waset.org/abstracts/search?q=deep%20learning" title=" deep learning"> deep learning</a> </p> <a href="https://publications.waset.org/abstracts/161507/on-road-text-detection-platform-for-driver-assistance-systems" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/161507.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">83</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">3339</span> Development of Fake News Model Using Machine Learning through Natural Language Processing</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> 
<a href="https://publications.waset.org/abstracts/search?q=Sajjad%20Ahmed">Sajjad Ahmed</a>, <a href="https://publications.waset.org/abstracts/search?q=Knut%20Hinkelmann"> Knut Hinkelmann</a>, <a href="https://publications.waset.org/abstracts/search?q=Flavio%20Corradini"> Flavio Corradini</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Fake news detection research is still at an early stage, as this is a relatively new phenomenon in terms of the interest it has raised in society. Machine learning helps to solve complex problems and to build AI systems, especially in cases where the knowledge is tacit or not known. We used machine learning algorithms to identify fake news, applying three classifiers: Passive Aggressive, Na&iuml;ve Bayes, and Support Vector Machine. Simple classification is not completely correct in fake news detection because classification methods are not specialized for fake news. By integrating machine learning and text-based processing, we can detect fake news and build classifiers that can classify the news data. Text classification mainly focuses on extracting various features of text and then incorporating those features into classification. The big challenge in this area is the lack of an efficient way to differentiate between fake and non-fake news, owing to the unavailability of corpora. We applied three different machine learning classifiers to two publicly available datasets. Experimental analysis based on the existing datasets indicates very encouraging and improved performance. 
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=fake%20news%20detection" title="fake news detection">fake news detection</a>, <a href="https://publications.waset.org/abstracts/search?q=natural%20language%20processing" title=" natural language processing"> natural language processing</a>, <a href="https://publications.waset.org/abstracts/search?q=machine%20learning" title=" machine learning"> machine learning</a>, <a href="https://publications.waset.org/abstracts/search?q=classification%20techniques." title=" classification techniques. "> classification techniques. </a> </p> <a href="https://publications.waset.org/abstracts/127894/development-of-fake-news-model-using-machine-learning-through-natural-language-processing" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/127894.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">167</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">3338</span> A Deep Learning Approach to Subsection Identification in Electronic Health Records</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Nitin%20Shravan">Nitin Shravan</a>, <a href="https://publications.waset.org/abstracts/search?q=Sudarsun%20Santhiappan"> Sudarsun Santhiappan</a>, <a href="https://publications.waset.org/abstracts/search?q=B.%20Sivaselvan"> B. Sivaselvan </a> </p> <p class="card-text"><strong>Abstract:</strong></p> Subsection identification, in the context of Electronic Health Records (EHRs), is identifying the important sections for down-stream tasks like auto-coding. 
In this work, we classify the text present in EHRs according to the information it contains, using machine learning and deep learning techniques. We first briefly describe the problem and formulate it as a text classification problem. Then, we discuss methods from the literature. We try two approaches: traditional feature extraction based machine learning methods and deep learning methods. Through experiments on a private dataset, we establish that the deep learning methods perform better than the feature extraction based machine learning models. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=deep%20learning" title="deep learning">deep learning</a>, <a href="https://publications.waset.org/abstracts/search?q=machine%20learning" title=" machine learning"> machine learning</a>, <a href="https://publications.waset.org/abstracts/search?q=semantic%20clinical%20classification" title=" semantic clinical classification"> semantic clinical classification</a>, <a href="https://publications.waset.org/abstracts/search?q=subsection%20identification" title=" subsection identification"> subsection identification</a>, <a href="https://publications.waset.org/abstracts/search?q=text%20classification" title=" text classification"> text classification</a> </p> <a href="https://publications.waset.org/abstracts/109176/a-deep-learning-approach-to-subsection-identification-in-electronic-health-records" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/109176.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">217</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">3337</span> A New Approach for Improving Accuracy of Multi Label Stream Data</h5> <div class="card-body"> 
<p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Kunal%20Shah">Kunal Shah</a>, <a href="https://publications.waset.org/abstracts/search?q=Swati%20Patel"> Swati Patel</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Many real-world problems involve data that can be considered multi-label data streams. Efficient methods exist for multi-label classification in non-streaming scenarios. However, learning in evolving streaming scenarios is more challenging, as the learners must be able to adapt to change using limited time and memory. Classification is used to predict the class of an unseen instance as accurately as possible. Multi-label classification is a variant of single-label classification in which a set of labels is associated with a single instance. Multi-label classification is used by modern applications such as text classification, functional genomics, image classification, and music categorization. This paper introduces the task of multi-label classification, methods for multi-label classification, and evaluation measures for multi-label classification. In addition, a comparative analysis of multi-label classification methods was carried out, first on the basis of a theoretical study and then on the basis of simulation on various data sets. 
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=binary%20relevance" title="binary relevance">binary relevance</a>, <a href="https://publications.waset.org/abstracts/search?q=concept%20drift" title=" concept drift"> concept drift</a>, <a href="https://publications.waset.org/abstracts/search?q=data%20stream%20mining" title=" data stream mining"> data stream mining</a>, <a href="https://publications.waset.org/abstracts/search?q=MLSC" title=" MLSC"> MLSC</a>, <a href="https://publications.waset.org/abstracts/search?q=multiple%20window%20with%20buffer" title=" multiple window with buffer"> multiple window with buffer</a> </p> <a href="https://publications.waset.org/abstracts/33035/a-new-approach-for-improving-accuracy-of-multi-label-stream-data" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/33035.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">584</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">3336</span> Enhanced Arabic Semantic Information Retrieval System Based on Arabic Text Classification</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=A.%20Elsehemy">A. Elsehemy</a>, <a href="https://publications.waset.org/abstracts/search?q=M.%20Abdeen"> M. Abdeen </a>, <a href="https://publications.waset.org/abstracts/search?q=T.%20Nazmy"> T. Nazmy</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Since the appearance of the Semantic web, many semantic search techniques and models were proposed to exploit the information in ontology to enhance the traditional keyword-based search. Many advances were made in languages such as English, German, French and Spanish. 
However, other languages such as Arabic are not fully supported yet. In this paper, we present a framework for ontology-based information retrieval for the Arabic language. Our system consists of four main modules, namely a query parser, an indexer, a search module, and a ranking module. Our approach includes building a semantic index by linking ontology concepts to documents, with an annotation weight for each link that is used in ranking the results. We also augmented the framework with an automatic document categorizer, which enhances the overall document ranking. We built three Arabic domain ontologies, Sports, Economic, and Politics, as examples for the Arabic language. We built a knowledge base that consists of 79 classes and more than 1456 instances. The system is evaluated using the precision and recall metrics. We ran many retrieval operations on a sample of 40,316 documents, amounting to 320 MB of pure text. The results show that the semantic search enhanced with text classification performs better than the system without classification. 
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=Arabic%20text%20classification" title="Arabic text classification">Arabic text classification</a>, <a href="https://publications.waset.org/abstracts/search?q=ontology%20based%20retrieval" title=" ontology based retrieval"> ontology based retrieval</a>, <a href="https://publications.waset.org/abstracts/search?q=Arabic%20semantic%20web" title=" Arabic semantic web"> Arabic semantic web</a>, <a href="https://publications.waset.org/abstracts/search?q=information%20retrieval" title=" information retrieval"> information retrieval</a>, <a href="https://publications.waset.org/abstracts/search?q=Arabic%20ontology" title=" Arabic ontology"> Arabic ontology</a> </p> <a href="https://publications.waset.org/abstracts/34945/enhanced-arabic-semantic-information-retrieval-system-based-on-arabic-text-classification" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/34945.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">525</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">3335</span> Short Text Classification for Saudi Tweets</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Asma%20A.%20Alsufyani">Asma A. Alsufyani</a>, <a href="https://publications.waset.org/abstracts/search?q=Maram%20A.%20Alharthi"> Maram A. Alharthi</a>, <a href="https://publications.waset.org/abstracts/search?q=Maha%20J.%20Althobaiti"> Maha J. Althobaiti</a>, <a href="https://publications.waset.org/abstracts/search?q=Manal%20S.%20Alharthi"> Manal S. 
Alharthi</a>, <a href="https://publications.waset.org/abstracts/search?q=Huda%20Rizq"> Huda Rizq</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Twitter is one of the most popular microblogging sites that allows users to publish short text messages called 'tweets'. Following more accounts (followings) increases the number of tweets on different topics that are displayed, unclassified, in the user’s timeline. Many Twitter users would therefore benefit from having the tweets in their timeline classified into general categories, which saves the user’s time and provides easy and quick access to tweets by topic. In this paper, we developed a classifier for timeline tweets trained on a dataset of 3600 tweets in total, which were collected from Saudi Twitter and annotated manually. We experimented with the well-known Bag-of-Words approach to text classification, and we used support vector machines (SVM) in the training process. The trained classifier performed well on a test dataset, with an average F1-measure of 92.3%. The classifier has been integrated into an application, which demonstrated in practice the classifier’s ability to classify the user’s timeline tweets. 
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=corpus%20creation" title="corpus creation">corpus creation</a>, <a href="https://publications.waset.org/abstracts/search?q=feature%20extraction" title=" feature extraction"> feature extraction</a>, <a href="https://publications.waset.org/abstracts/search?q=machine%20learning" title=" machine learning"> machine learning</a>, <a href="https://publications.waset.org/abstracts/search?q=short%20text%20classification" title=" short text classification"> short text classification</a>, <a href="https://publications.waset.org/abstracts/search?q=social%20media" title=" social media"> social media</a>, <a href="https://publications.waset.org/abstracts/search?q=support%20vector%20machine" title=" support vector machine"> support vector machine</a>, <a href="https://publications.waset.org/abstracts/search?q=Twitter" title=" Twitter"> Twitter</a> </p> <a href="https://publications.waset.org/abstracts/130952/short-text-classification-for-saudi-tweets" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/130952.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">155</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">3334</span> Radical Web Text Classification Using a Composite-Based Approach</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Kolade%20Olawande%20Owoeye">Kolade Olawande Owoeye</a>, <a href="https://publications.waset.org/abstracts/search?q=George%20R.%20S.%20Weir"> George R. S. 
Weir</a> </p> <p class="card-text"><strong>Abstract:</strong></p> The spread of terrorist and extremist activities on the internet has become a major threat to governments and national security. Their potential dangers have created a need for intelligence gathering via the web and for real-time monitoring of websites for extremist activities. However, manual classification of such content is difficult and time-consuming. In response to this challenge, an automated classification system called the composite technique was developed. It is a computational framework that combines the semantic and syntactic features of the textual content of a web page. We implemented the framework on a dataset of extremist webpages that had been subjected to manual classification. We then developed a classification model on the data using the J48 decision tree algorithm, in order to measure how well each page can be classified into its appropriate class. The classification results obtained with our method, when compared with other state-of-the-art approaches, indicated a 96% success rate in classifying webpages overall when matched against the manual classification. 
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=extremist" title="extremist">extremist</a>, <a href="https://publications.waset.org/abstracts/search?q=web%20pages" title=" web pages"> web pages</a>, <a href="https://publications.waset.org/abstracts/search?q=classification" title=" classification"> classification</a>, <a href="https://publications.waset.org/abstracts/search?q=semantics" title=" semantics"> semantics</a>, <a href="https://publications.waset.org/abstracts/search?q=posit" title=" posit"> posit</a> </p> <a href="https://publications.waset.org/abstracts/98432/radical-web-text-classification-using-a-composite-based-approach" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/98432.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">145</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">3333</span> Amharic Text News Classification Using Supervised Learning </h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Misrak%20Assefa">Misrak Assefa </a> </p> <p class="card-text"><strong>Abstract:</strong></p> The Amharic language is the second most widely spoken Semitic language in the world. There is an overload of news on the web. Searching the web for useful documents on a specific topic written in the Amharic language is a challenging task. Hence, document categorization is required for managing and filtering important information. In the classification of Amharic text news, there is still a gap in the domain of information that needs to be addressed. 
This study attempts to design an automatic Amharic news classifier using a supervised learning mechanism on four untouched classes. For this research, 4,182 news articles were used. Naive Bayes (NB) and Decision tree (j48) algorithms were used to classify the given Amharic dataset. In this paper, k-fold cross-validation is used to estimate the accuracy of the classifier. The results show that these algorithms are applicable to Amharic news categorization. The best average accuracies achieved by the j48 decision tree and naïve Bayes are 95.2345% and 94.6245%, respectively, using three categories. This research indicates that a typical decision tree algorithm is better suited to Amharic news categorization. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=text%20categorization" title="text categorization">text categorization</a>, <a href="https://publications.waset.org/abstracts/search?q=supervised%20machine%20learning" title=" supervised machine learning"> supervised machine learning</a>, <a href="https://publications.waset.org/abstracts/search?q=naive%20Bayes" title=" naive Bayes"> naive Bayes</a>, <a href="https://publications.waset.org/abstracts/search?q=decision%20tree" title=" decision tree"> decision tree</a> </p> <a href="https://publications.waset.org/abstracts/124249/amharic-text-news-classification-using-supervised-learning" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/124249.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">210</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">3332</span> One-Shot Text Classification with Multilingual-BERT</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a 
href="https://publications.waset.org/abstracts/search?q=Hsin-Yang%20Wang">Hsin-Yang Wang</a>, <a href="https://publications.waset.org/abstracts/search?q=K.%20M.%20A.%20Salam"> K. M. A. Salam</a>, <a href="https://publications.waset.org/abstracts/search?q=Ying-Jia%20Lin"> Ying-Jia Lin</a>, <a href="https://publications.waset.org/abstracts/search?q=Daniel%20Tan"> Daniel Tan</a>, <a href="https://publications.waset.org/abstracts/search?q=Tzu-Hsuan%20Chou"> Tzu-Hsuan Chou</a>, <a href="https://publications.waset.org/abstracts/search?q=Hung-Yu%20Kao"> Hung-Yu Kao</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Detecting user intent from natural language expressions has a wide variety of use cases in different natural language processing applications. Recently, the use of few-shot training has spiked in commercial domains. Because of the lack of significant sample features, downstream task performance has been limited or unstable across different domains. As a state-of-the-art method, the pre-trained BERT model, which gathers sentence-level information from a large text corpus, shows improvement on several NLP benchmarks. In this research, we propose a method that converts multi-class classification tasks into binary classification tasks and then uses the confidence score to rank the results. As a language model, BERT performs well on sequence data. In our experiment, we change the objective from predicting labels to finding the relations between words in sequence data. Our proposed method achieved 71.0% accuracy on the internal intent detection dataset and 63.9% accuracy on the HuffPost dataset. Acknowledgment: This work was supported by NCKU-B109-K003, which is the collaboration between National Cheng Kung University, Taiwan, and SoftBank Corp., Tokyo. 
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=OSML" title="OSML">OSML</a>, <a href="https://publications.waset.org/abstracts/search?q=BERT" title=" BERT"> BERT</a>, <a href="https://publications.waset.org/abstracts/search?q=text%20classification" title=" text classification"> text classification</a>, <a href="https://publications.waset.org/abstracts/search?q=one%20shot" title=" one shot"> one shot</a> </p> <a href="https://publications.waset.org/abstracts/135007/one-shot-text-classification-with-multilingual-bert" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/135007.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">101</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">3331</span> Developing an Advanced Algorithm Capable of Classifying News, Articles and Other Textual Documents Using Text Mining Techniques</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=R.%20B.%20Knudsen">R. B. Knudsen</a>, <a href="https://publications.waset.org/abstracts/search?q=O.%20T.%20Rasmussen"> O. T. Rasmussen</a>, <a href="https://publications.waset.org/abstracts/search?q=R.%20A.%20Alphinas"> R. A. Alphinas</a> </p> <p class="card-text"><strong>Abstract:</strong></p> The aim of this research is to develop an algorithm capable of classifying news articles from the automobile industry according to the competitive actions they entail, using Text Mining (TM) methods. This requires testing how to preprocess the data properly by preparing the pipeline that best fits each algorithm. 
The pipelines are tested with nine different classification algorithms from the families of regression, support vector machines, and neural networks. Preliminary testing to identify the optimal pipelines and algorithms resulted in the selection of two algorithms with two different pipelines. The two algorithms are Logistic Regression (LR) and Artificial Neural Network (ANN). These algorithms are then optimized further by testing several parameters of each. The best result is achieved with the ANN. The final model yields an accuracy of 0.79, a precision of 0.80, a recall of 0.78, and an F1 score of 0.76. By removing three of the classes that created noise, the final algorithm is capable of reaching an accuracy of 94%. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=Artificial%20Neural%20network" title="Artificial Neural network">Artificial Neural network</a>, <a href="https://publications.waset.org/abstracts/search?q=Competitive%20dynamics" title=" Competitive dynamics"> Competitive dynamics</a>, <a href="https://publications.waset.org/abstracts/search?q=Logistic%20Regression" title=" Logistic Regression"> Logistic Regression</a>, <a href="https://publications.waset.org/abstracts/search?q=Text%20classification" title=" Text classification"> Text classification</a>, <a href="https://publications.waset.org/abstracts/search?q=Text%20mining" title=" Text mining"> Text mining</a> </p> <a href="https://publications.waset.org/abstracts/127954/developing-an-advanced-algorithm-capable-of-classifying-news-articles-and-other-textual-documents-using-text-mining-techniques" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/127954.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">121</span> </span> </div> </div> <div class="card paper-listing mb-3 
mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">3330</span> Recurrent Neural Networks with Deep Hierarchical Mixed Structures for Chinese Document Classification</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Zhaoxin%20Luo">Zhaoxin Luo</a>, <a href="https://publications.waset.org/abstracts/search?q=Michael%20Zhu"> Michael Zhu</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Natural languages always contain complex semantic hierarchies. Obtaining a feature representation based on these hierarchies is key to the success of a model. Several RNN models have recently been proposed that use latent indicators to capture the hierarchical structure of documents. However, a model that uses only a single layer of latent indicators cannot capture the true hierarchical structure of a language, especially a complex language like Chinese. In this paper, we propose a deep layered model that stacks arbitrarily many RNN layers equipped with latent indicators. By using EM and training hierarchically, our model solves the computational problem of stacking RNN layers and makes it possible to stack arbitrarily many of them. Our deep hierarchical model not only achieves results comparable to large pre-trained models on the Chinese short text classification problem but also achieves state-of-the-art results on the Chinese long text classification problem. 
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=nature%20language%20processing" title="nature language processing">nature language processing</a>, <a href="https://publications.waset.org/abstracts/search?q=recurrent%20neural%20network" title=" recurrent neural network"> recurrent neural network</a>, <a href="https://publications.waset.org/abstracts/search?q=hierarchical%20structure" title=" hierarchical structure"> hierarchical structure</a>, <a href="https://publications.waset.org/abstracts/search?q=document%20classification" title=" document classification"> document classification</a>, <a href="https://publications.waset.org/abstracts/search?q=Chinese" title=" Chinese"> Chinese</a> </p> <a href="https://publications.waset.org/abstracts/171867/recurrent-neural-networks-with-deep-hierarchical-mixed-structures-for-chinese-document-classification" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/171867.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">68</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">3329</span> Multilabel Classification with Neural Network Ensemble Method</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Sezin%20Ek%C5%9Fio%C4%9Flu">Sezin Ekşioğlu</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Multilabel classification is highly important for several applications and is also a challenging research topic. It is a kind of supervised learning with binary targets. The difference between multilabel and binary classification is that multilabel classification problems involve more than one class. 
Features can belong to one class or to many classes. Multilabel prediction has a wide range of applications, such as image labeling, text categorization, and gene functionality. Even though features are assigned to many classes, they may not always be classified properly. There are many ensemble methods for classification. However, most researchers have concentrated on better multilabel methods, and few focus on both the efficiency of classifiers and pairwise relationships at the same time in order to implement better multilabel classification. In this paper, we modify ensemble methods by combining k-Nearest Neighbors with a neural network structure to address these issues and to obtain better results from multilabel classification. Publicly available datasets (yeast, emotion, scene, and birds) are used to demonstrate the efficiency of the developed algorithm, and the technique is evaluated with the accuracy, F1 score, and Hamming loss metrics. Our algorithm improves on the benchmarks for each dataset across the different metrics. 
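<p class="card-text">The abstract above names accuracy, F1 score, and Hamming loss as evaluation metrics but does not say how they were computed. The following is a minimal sketch of the standard definitions for multilabel predictions, assuming each sample is a list of 0/1 label indicators; the function names and the choice of subset accuracy and micro-averaged F1 are illustrative assumptions, not the author's implementation.</p>

```python
from typing import List

Labels = List[List[int]]  # one row per sample, one 0/1 entry per label


def hamming_loss(y_true: Labels, y_pred: Labels) -> float:
    """Fraction of individual label slots that are predicted incorrectly."""
    wrong = 0
    total = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            wrong += int(t != p)
            total += 1
    if total == 0:
        raise ValueError("no labels to score")
    return wrong / total


def subset_accuracy(y_true: Labels, y_pred: Labels) -> float:
    """Fraction of samples whose whole label vector is predicted exactly."""
    if not y_true:
        raise ValueError("no samples to score")
    exact = sum(int(t == p) for t, p in zip(y_true, y_pred))
    return exact / len(y_true)


def micro_f1(y_true: Labels, y_pred: Labels) -> float:
    """F1 computed from true/false positives pooled over all labels."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            tp += int(t == 1 and p == 1)
            fp += int(t == 0 and p == 1)
            fn += int(t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy data: 2 samples, 3 labels each.
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0]]
print(hamming_loss(y_true, y_pred))
print(subset_accuracy(y_true, y_pred))
print(micro_f1(y_true, y_pred))
```

<p class="card-text">Hamming loss counts each label slot separately, so a prediction that misses one label out of three is penalized less than under subset accuracy, which requires the whole label vector to match.</p>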
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=multilabel" title="multilabel">multilabel</a>, <a href="https://publications.waset.org/abstracts/search?q=classification" title=" classification"> classification</a>, <a href="https://publications.waset.org/abstracts/search?q=neural%20network" title=" neural network"> neural network</a>, <a href="https://publications.waset.org/abstracts/search?q=KNN" title=" KNN"> KNN</a> </p> <a href="https://publications.waset.org/abstracts/148169/multilabel-classification-with-neural-network-ensemble-method" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/148169.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">155</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">3328</span> Spontaneous Message Detection of Annoying Situation in Community Networks Using Mining Algorithm</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=P.%20Senthil%20Kumari">P. Senthil Kumari</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Main concerns in data mining investigation are social controls of data mining for handling ambiguity, noise, or incompleteness on text data. We describe an innovative approach for unplanned text data detection of community networks achieved by classification mechanism. In a tangible domain claim with humble secrecy backgrounds provided by community network for evading annoying content is presented on consumer message partition. To avoid this, mining methodology provides the capability to unswervingly switch the messages and similarly recover the superiority of ordering. 
Here we designated learning-centered mining approaches with pre-processing technique to complete this effort. Our involvement of work compact with rule-based personalization for automatic text categorization which was appropriate in many dissimilar frameworks and offers tolerance value for permits the background of comments conferring to a variety of conditions associated with the policy or rule arrangements processed by learning algorithm. Remarkably, we find that the choice of classifier has predicted the class labels for control of the inadequate documents on community network with great value of effect. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=text%20mining" title="text mining">text mining</a>, <a href="https://publications.waset.org/abstracts/search?q=data%20classification" title=" data classification"> data classification</a>, <a href="https://publications.waset.org/abstracts/search?q=community%20network" title=" community network"> community network</a>, <a href="https://publications.waset.org/abstracts/search?q=learning%20algorithm" title=" learning algorithm"> learning algorithm</a> </p> <a href="https://publications.waset.org/abstracts/27184/spontaneous-message-detection-of-annoying-situation-in-community-networks-using-mining-algorithm" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/27184.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">508</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">3327</span> Extraction of Text Subtitles in Multimedia Systems</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Amarjit%20Singh">Amarjit Singh</a> </p> <p 
class="card-text"><strong>Abstract:</strong></p> In this paper, a method for extracting text subtitles from large videos is proposed. Video data needs to be annotated for many multimedia applications. Text is embedded in digital video to provide useful information about that video. The need therefore arises to detect the text present in a video for video understanding and indexing. This is achieved in two steps: the first is text localization and the second is text verification. The text detection method can be extended to text recognition, which finds applications in automatic video indexing, video annotation, and content-based video retrieval. The method has been tested on various types of videos. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=video" title="video">video</a>, <a href="https://publications.waset.org/abstracts/search?q=subtitles" title=" subtitles"> subtitles</a>, <a href="https://publications.waset.org/abstracts/search?q=extraction" title=" extraction"> extraction</a>, <a href="https://publications.waset.org/abstracts/search?q=annotation" title=" annotation"> annotation</a>, <a href="https://publications.waset.org/abstracts/search?q=frames" title=" frames"> frames</a> </p> <a href="https://publications.waset.org/abstracts/24441/extraction-of-text-subtitles-in-multimedia-systems" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/24441.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">601</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">3326</span> Urdu Text Extraction Method from Images</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a 
href="https://publications.waset.org/abstracts/search?q=Samabia%20Tehsin">Samabia Tehsin</a>, <a href="https://publications.waset.org/abstracts/search?q=Sumaira%20Kausar"> Sumaira Kausar</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Due to the vast increase in multimedia data in recent years, efficient and robust techniques are needed to retrieve and index images and videos. Text embedded in images can serve as a strong retrieval tool. For this reason, text extraction is a research area attracting increasing attention. English text extraction is the focus of many researchers, but very little work has been done on other languages such as Urdu. This paper focuses on Urdu text extraction from video frames. It presents a text detection feature set that can deal with most of the problems connected with the text extraction process. To test its validity, the method is evaluated on an Urdu news dataset, where it gives promising results. 
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=caption%20text" title="caption text">caption text</a>, <a href="https://publications.waset.org/abstracts/search?q=content-based%20image%20retrieval" title=" content-based image retrieval"> content-based image retrieval</a>, <a href="https://publications.waset.org/abstracts/search?q=document%20analysis" title=" document analysis"> document analysis</a>, <a href="https://publications.waset.org/abstracts/search?q=text%20extraction" title=" text extraction"> text extraction</a> </p> <a href="https://publications.waset.org/abstracts/9566/urdu-text-extraction-method-from-images" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/9566.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">516</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">3325</span> Kannada HandWritten Character Recognition by Edge Hinge and Edge Distribution Techniques Using Manhatan and Minimum Distance Classifiers</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=C.%20V.%20Aravinda">C. V. Aravinda</a>, <a href="https://publications.waset.org/abstracts/search?q=H.%20N.%20Prakash"> H. N. Prakash</a> </p> <p class="card-text"><strong>Abstract:</strong></p> In this paper, we tried to convey fusion and state of art pertaining to SIL character recognition systems. In the first step, the text is preprocessed and normalized to perform the text identification correctly. The second step involves extracting relevant and informative features. The third step implements the classification decision. 
The three stages involved are data acquisition and preprocessing, feature extraction, and classification. Here we concentrate on two techniques for obtaining features: feature extraction and feature selection. The edge-hinge distribution is a feature that characterizes the changes in direction of a script stroke in handwritten text. It is extracted by means of a window that is slid over an edge-detected binary handwriting image. Whenever the central pixel of the window is on, the two edge fragments (i.e., connected sequences of pixels) emerging from this central pixel are considered. Their directions are measured and stored as pairs. A joint probability distribution is obtained from a large sample of such pairs. Despite continuous effort, handwriting identification remains a challenging issue because different approaches use different varieties of features. Therefore, our study focuses on handwriting recognition based on feature selection to simplify the feature extraction task, optimize classification system complexity, reduce running time, and improve classification accuracy. 
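<p class="card-text">The edge-hinge extraction described above can be illustrated with a small sketch. It is a simplified version under stated assumptions and not the authors' implementation: the image is a list of rows of 0/1 values that has already been edge-detected, only the 8 compass directions are used, and an edge fragment counts as present only if it runs straight for <code>length</code> pixels. The names <code>DIRECTIONS</code>, <code>has_fragment</code>, and <code>edge_hinge_distribution</code> are hypothetical.</p>

```python
from itertools import combinations

# The 8 compass directions as (row step, column step),
# listed counter-clockwise starting from east.
DIRECTIONS = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
              (0, -1), (1, -1), (1, 0), (1, 1)]


def has_fragment(image, row, col, step, length):
    """True if `length` consecutive 'on' pixels leave (row, col) along `step`."""
    d_row, d_col = step
    for k in range(1, length + 1):
        r, c = row + k * d_row, col + k * d_col
        if not (0 <= r < len(image) and 0 <= c < len(image[0])):
            return False  # fragment would leave the image
        if not image[r][c]:
            return False  # fragment is interrupted
    return True


def edge_hinge_distribution(image, length=2):
    """Joint distribution over pairs of fragment directions at each 'on' pixel.

    Entry [i][j] with i < j is the share of observed direction pairs made of
    DIRECTIONS[i] and DIRECTIONS[j]; all other entries stay at 0.0.
    """
    n = len(DIRECTIONS)
    counts = [[0] * n for _ in range(n)]
    total = 0
    for row in range(len(image)):
        for col in range(len(image[0])):
            if not image[row][col]:
                continue  # the central pixel of the window must be on
            present = [i for i, step in enumerate(DIRECTIONS)
                       if has_fragment(image, row, col, step, length)]
            for i, j in combinations(present, 2):
                counts[i][j] += 1
                total += 1
    if total == 0:
        return [[0.0] * n for _ in range(n)]
    return [[count / total for count in row_counts] for row_counts in counts]


# Hypothetical toy input: an L-shaped stroke with one corner ("hinge").
corner = [
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 1],
]
distribution = edge_hinge_distribution(corner, length=2)
print(distribution[0][2])  # share of (east, north) hinges
```

<p class="card-text">In this toy image the only pixel with two emerging fragments is the corner, so all of the probability mass falls on the (east, north) pair. A practical extractor would use more directions and accumulate the pairs over many handwriting samples.</p>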
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=word%20segmentation%20and%20recognition" title="word segmentation and recognition">word segmentation and recognition</a>, <a href="https://publications.waset.org/abstracts/search?q=character%20recognition" title=" character recognition"> character recognition</a>, <a href="https://publications.waset.org/abstracts/search?q=optical%20character%20recognition" title=" optical character recognition"> optical character recognition</a>, <a href="https://publications.waset.org/abstracts/search?q=hand%20written%20character%20recognition" title=" hand written character recognition"> hand written character recognition</a>, <a href="https://publications.waset.org/abstracts/search?q=South%20Indian%20languages" title=" South Indian languages"> South Indian languages</a> </p> <a href="https://publications.waset.org/abstracts/41271/kannada-handwritten-character-recognition-by-edge-hinge-and-edge-distribution-techniques-using-manhatan-and-minimum-distance-classifiers" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/41271.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">494</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">3324</span> Evaluating Classification with Efficacy Metrics</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Guofan%20Shao">Guofan Shao</a>, <a href="https://publications.waset.org/abstracts/search?q=Lina%20Tang"> Lina Tang</a>, <a href="https://publications.waset.org/abstracts/search?q=Hao%20Zhang"> Hao Zhang</a> </p> <p class="card-text"><strong>Abstract:</strong></p> The values of image classification accuracy 
are affected by class size distributions and classification schemes, making it difficult to compare the performance of classification algorithms across different remote sensing data sources and classification systems. Based on the term efficacy from medicine and pharmacology, we have developed the metrics of image classification efficacy at the map and class levels. The novelty of this approach is that a baseline classification is involved in computing image classification efficacies so that the effects of class statistics are reduced. Furthermore, the image classification efficacies are interpretable and comparable, and thus, strengthen the assessment of image data classification methods. We use real-world and hypothetical examples to explain the use of image classification efficacies. The metrics of image classification efficacy meet the critical need to rectify the strategy for the assessment of image classification performance as image classification methods are becoming more diversified. 
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=accuracy%20assessment" title="accuracy assessment">accuracy assessment</a>, <a href="https://publications.waset.org/abstracts/search?q=efficacy" title=" efficacy"> efficacy</a>, <a href="https://publications.waset.org/abstracts/search?q=image%20classification" title=" image classification"> image classification</a>, <a href="https://publications.waset.org/abstracts/search?q=machine%20learning" title=" machine learning"> machine learning</a>, <a href="https://publications.waset.org/abstracts/search?q=uncertainty" title=" uncertainty"> uncertainty</a> </p> <a href="https://publications.waset.org/abstracts/142555/evaluating-classification-with-efficacy-metrics" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/142555.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">211</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">3323</span> Small Text Extraction from Documents and Chart Images</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Rominkumar%20Busa">Rominkumar Busa</a>, <a href="https://publications.waset.org/abstracts/search?q=Shahira%20K.%20C."> Shahira K. C.</a>, <a href="https://publications.waset.org/abstracts/search?q=Lijiya%20A."> Lijiya A.</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Text recognition is an important area of computer vision that deals with detecting and recognising text in an image. Optical Character Recognition (OCR) is a saturated area these days, with very good text recognition accuracy. 
However, when the same OCR methods are applied to text with small font sizes, such as the text in chart images, the recognition rate is less than 30%. This work aims to extract small text in images using a deep learning model, CRNN with CTC loss. The text recognition accuracy is found to improve when image enhancement by super resolution is applied before the CRNN model. We also observe that the text recognition rate increases by a further 18% with the proposed method, which involves super resolution and character segmentation followed by CRNN with CTC loss. The effectiveness of the proposed method suggests that further pre-processing of chart image text and other small-text images will improve accuracy further, thereby helping text extraction from chart images. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=small%20text%20extraction" title="small text extraction">small text extraction</a>, <a href="https://publications.waset.org/abstracts/search?q=OCR" title=" OCR"> OCR</a>, <a href="https://publications.waset.org/abstracts/search?q=scene%20text%20recognition" title=" scene text recognition"> scene text recognition</a>, <a href="https://publications.waset.org/abstracts/search?q=CRNN" title=" CRNN"> CRNN</a> </p> <a href="https://publications.waset.org/abstracts/150310/small-text-extraction-from-documents-and-chart-images" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/150310.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">125</span> </span> </div> </div> <ul class="pagination"> <li class="page-item disabled"><span class="page-link">&lsaquo;</span></li> <li class="page-item active"><span class="page-link">1</span></li> <li class="page-item"><a class="page-link" 
href="https://publications.waset.org/abstracts/search?q=text%20classification&amp;page=2">2</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=text%20classification&amp;page=3">3</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=text%20classification&amp;page=4">4</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=text%20classification&amp;page=5">5</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=text%20classification&amp;page=6">6</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=text%20classification&amp;page=7">7</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=text%20classification&amp;page=8">8</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=text%20classification&amp;page=9">9</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=text%20classification&amp;page=10">10</a></li> <li class="page-item disabled"><span class="page-link">...</span></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=text%20classification&amp;page=111">111</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=text%20classification&amp;page=112">112</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=text%20classification&amp;page=2" rel="next">&rsaquo;</a></li> </ul> </div> </main> <footer> <div id="infolinks" class="pt-3 pb-2"> <div class="container"> <div style="background-color:#f5f5f5;" class="p-3"> <div class="row"> <div class="col-md-2"> <ul class="list-unstyled"> 
About <li><a href="https://waset.org/page/support">About Us</a></li> <li><a href="https://waset.org/page/support#legal-information">Legal</a></li> <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/WASET-16th-foundational-anniversary.pdf">WASET celebrates its 16th foundational anniversary</a></li> </ul> </div> <div class="col-md-2"> <ul class="list-unstyled"> Account <li><a href="https://waset.org/profile">My Account</a></li> </ul> </div> <div class="col-md-2"> <ul class="list-unstyled"> Explore <li><a href="https://waset.org/disciplines">Disciplines</a></li> <li><a href="https://waset.org/conferences">Conferences</a></li> <li><a href="https://waset.org/conference-programs">Conference Program</a></li> <li><a href="https://waset.org/committees">Committees</a></li> <li><a href="https://publications.waset.org">Publications</a></li> </ul> </div> <div class="col-md-2"> <ul class="list-unstyled"> Research <li><a href="https://publications.waset.org/abstracts">Abstracts</a></li> <li><a href="https://publications.waset.org">Periodicals</a></li> <li><a href="https://publications.waset.org/archive">Archive</a></li> </ul> </div> <div class="col-md-2"> <ul class="list-unstyled"> Open Science <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/Open-Science-Philosophy.pdf">Open Science Philosophy</a></li> <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/Open-Science-Award.pdf">Open Science Award</a></li> <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/Open-Society-Open-Science-and-Open-Innovation.pdf">Open Innovation</a></li> <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/Postdoctoral-Fellowship-Award.pdf">Postdoctoral Fellowship Award</a></li> <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/Scholarly-Research-Review.pdf">Scholarly Research 
Review</a></li> </ul> </div> <div class="col-md-2"> <ul class="list-unstyled"> Support <li><a href="https://waset.org/page/support">Support</a></li> <li><a href="https://waset.org/profile/messages/create">Contact Us</a></li> <li><a href="https://waset.org/profile/messages/create">Report Abuse</a></li> </ul> </div> </div> </div> </div> </div> <div class="container text-center"> <hr style="margin-top:0;margin-bottom:.3rem;"> <a href="https://creativecommons.org/licenses/by/4.0/" target="_blank" class="text-muted small">Creative Commons Attribution 4.0 International License</a> <div id="copy" class="mt-2">&copy; 2024 World Academy of Science, Engineering and Technology</div> </div> </footer> <a href="javascript:" id="return-to-top"><i class="fas fa-arrow-up"></i></a> <div class="modal" id="modal-template"> <div class="modal-dialog"> <div class="modal-content"> <div class="row m-0 mt-1"> <div class="col-md-12"> <button type="button" class="close" data-dismiss="modal" aria-label="Close"><span aria-hidden="true">&times;</span></button> </div> </div> <div class="modal-body"></div> </div> </div> </div> <script src="https://cdn.waset.org/static/plugins/jquery-3.3.1.min.js"></script> <script src="https://cdn.waset.org/static/plugins/bootstrap-4.2.1/js/bootstrap.bundle.min.js"></script> <script src="https://cdn.waset.org/static/js/site.js?v=150220211556"></script> <script> jQuery(document).ready(function() { /*jQuery.get("https://publications.waset.org/xhr/user-menu", function (response) { jQuery('#mainNavMenu').append(response); });*/ jQuery.get({ url: "https://publications.waset.org/xhr/user-menu", cache: false }).then(function(response){ jQuery('#mainNavMenu').append(response); }); }); </script> </body> </html>
