Search results for: TF-IDF

<!DOCTYPE html> <html lang="en" dir="ltr"> <head>  <script async src="https://www.googletagmanager.com/gtag/js?id=G-P63WKM1TM1"></script> <script> window.dataLayer = window.dataLayer || []; function gtag(){dataLayer.push(arguments);} gtag('js', new Date()); gtag('config', 'G-P63WKM1TM1'); </script>  <script type="text/javascript" > (function(m,e,t,r,i,k,a){m[i]=m[i]||function(){(m[i].a=m[i].a||[]).push(arguments)}; m[i].l=1*new Date(); for (var j = 0; j < document.scripts.length; j++) {if (document.scripts[j].src === r) { return; }} k=e.createElement(t),a=e.getElementsByTagName(t)[0],k.async=1,k.src=r,a.parentNode.insertBefore(k,a)}) (window, document, "script", "https://mc.yandex.ru/metrika/tag.js", "ym"); ym(55165297, "init", { clickmap:false, trackLinks:true, accurateTrackBounce:true, webvisor:false }); </script> <noscript><div><img src="https://mc.yandex.ru/watch/55165297" style="position:absolute; left:-9999px;" alt="" /></div></noscript>    <title>Search results for: TF-IDF</title> <meta name="description" content="Search results for: TF-IDF"> <meta name="keywords" content="TF-IDF"> <meta name="viewport" content="width=device-width, initial-scale=1, minimum-scale=1, maximum-scale=1, user-scalable=no"> <meta charset="utf-8"> <link href="https://cdn.waset.org/favicon.ico" type="image/x-icon" rel="shortcut icon"> <link href="https://cdn.waset.org/static/plugins/bootstrap-4.2.1/css/bootstrap.min.css" rel="stylesheet"> <link href="https://cdn.waset.org/static/plugins/fontawesome/css/all.min.css" rel="stylesheet"> <link href="https://cdn.waset.org/static/css/site.css?v=150220211555" rel="stylesheet"> </head> <body> <header> <div class="container"> <nav class="navbar navbar-expand-lg navbar-light"> <a class="navbar-brand" href="https://waset.org"> <img src="https://cdn.waset.org/static/images/wasetc.png" alt="Open Science Research Excellence" title="Open Science Research Excellence" /> </a> <button class="d-block d-lg-none navbar-toggler ml-auto" type="button" data-toggle="collapse" data-target="#navbarMenu" aria-controls="navbarMenu" aria-expanded="false" aria-label="Toggle navigation"> <span class="navbar-toggler-icon"></span> </button> <div class="w-100"> <div class="d-none d-lg-flex flex-row-reverse"> <form method="get" action="https://waset.org/search" class="form-inline my-2 my-lg-0"> <input class="form-control mr-sm-2" type="search" placeholder="Search Conferences" value="TF-IDF" name="q" aria-label="Search"> <button class="btn btn-light my-2 my-sm-0" type="submit"><i class="fas fa-search"></i></button> </form> </div> <div class="collapse navbar-collapse mt-1" id="navbarMenu"> <ul class="navbar-nav ml-auto align-items-center" id="mainNavMenu"> <li class="nav-item"> <a class="nav-link" href="https://waset.org/conferences" title="Conferences in 2024/2025/2026">Conferences</a> </li> <li class="nav-item"> <a class="nav-link" href="https://waset.org/disciplines" title="Disciplines">Disciplines</a> </li> <li class="nav-item"> <a class="nav-link" href="https://waset.org/committees" rel="nofollow">Committees</a> </li> <li class="nav-item dropdown"> <a class="nav-link dropdown-toggle" href="#" id="navbarDropdownPublications" role="button" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false"> Publications </a> <div class="dropdown-menu" aria-labelledby="navbarDropdownPublications"> <a class="dropdown-item" href="https://publications.waset.org/abstracts">Abstracts</a> <a class="dropdown-item" href="https://publications.waset.org">Periodicals</a> <a class="dropdown-item" href="https://publications.waset.org/archive">Archive</a> </div> </li> <li class="nav-item"> <a class="nav-link" href="https://waset.org/page/support" title="Support">Support</a> </li> </ul> </div> </div> </nav> </div> </header> <main> <div class="container mt-4"> <div class="row"> <div class="col-md-9 mx-auto"> <form method="get" action="https://publications.waset.org/abstracts/search"> <div id="custom-search-input"> <div class="input-group"> <i class="fas fa-search"></i> <input type="text" class="search-query" name="q" placeholder="Author, Title, Abstract, Keywords" value="TF-IDF"> <input type="submit" class="btn_search" value="Search"> </div> </div> </form> </div> </div> <div class="row mt-3"> <div class="col-sm-3"> <div class="card"> <div class="card-body"><strong>Commenced</strong> in January 2007</div> </div> </div> <div class="col-sm-3"> <div class="card"> <div class="card-body"><strong>Frequency:</strong> Monthly</div> </div> </div> <div class="col-sm-3"> <div class="card"> <div class="card-body"><strong>Edition:</strong> International</div> </div> </div> <div class="col-sm-3"> <div class="card"> <div class="card-body"><strong>Paper Count:</strong> 4</div> </div> </div> </div> <h1 class="mt-3 mb-3 text-center" style="font-size:1.6rem;">Search results for: TF-IDF</h1> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">4</span> Text Similarity in Vector Space Models: A Comparative Study</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Omid%20Shahmirzadi">Omid Shahmirzadi</a>, <a href="https://publications.waset.org/abstracts/search?q=Adam%20Lugowski"> Adam Lugowski</a>, <a href="https://publications.waset.org/abstracts/search?q=Kenneth%20Younge"> Kenneth Younge</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Automatic measurement of semantic text similarity is an important task in natural language processing. In this paper, we evaluate the performance of different vector space models to perform this task. We address the real-world problem of modeling patent-to-patent similarity and compare TFIDF (and related extensions), topic models (e.g., latent semantic indexing), and neural models (e.g., paragraph vectors). Contrary to expectations, the added computational cost of text embedding methods is justified only when: 1) the target text is condensed; and 2) the similarity comparison is trivial. Otherwise, TFIDF performs surprisingly well in other cases: in particular for longer and more technical texts or for making finer-grained distinctions between nearest neighbors. Unexpectedly, extensions to the TFIDF method, such as adding noun phrases or calculating term weights incrementally, were not helpful in our context. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=big%20data" title="big data">big data</a>, <a href="https://publications.waset.org/abstracts/search?q=patent" title=" patent"> patent</a>, <a href="https://publications.waset.org/abstracts/search?q=text%20embedding" title=" text embedding"> text embedding</a>, <a href="https://publications.waset.org/abstracts/search?q=text%20similarity" title=" text similarity"> text similarity</a>, <a href="https://publications.waset.org/abstracts/search?q=vector%20space%20model" title=" vector space model"> vector space model</a> </p> <a href="https://publications.waset.org/abstracts/102930/text-similarity-in-vector-space-models-a-comparative-study" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/102930.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">175</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">3</span> Off-Topic Text Detection System Using a Hybrid Model</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Usama%20Shahid">Usama Shahid</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Be it written documents, news columns, or students' essays, verifying the content can be a time-consuming task. Apart from the spelling and grammar mistakes, the proofreader is also supposed to verify whether the content included in the essay or document is relevant or not. The irrelevant content in any document or essay is referred to as off-topic text and in this paper, we will address the problem of off-topic text detection from a document using machine learning techniques. Our study aims to identify the off-topic content from a document using Echo state network model and we will also compare data with other models. The previous study uses Convolutional Neural Networks and TFIDF to detect off-topic text. We will rearrange the existing datasets and take new classifiers along with new word embeddings and implement them on existing and new datasets in order to compare the results with the previously existing CNN model. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=off%20topic" title="off topic">off topic</a>, <a href="https://publications.waset.org/abstracts/search?q=text%20detection" title=" text detection"> text detection</a>, <a href="https://publications.waset.org/abstracts/search?q=eco%20state%20network" title=" eco state network"> eco state network</a>, <a href="https://publications.waset.org/abstracts/search?q=machine%20learning" title=" machine learning"> machine learning</a> </p> <a href="https://publications.waset.org/abstracts/160685/off-topic-text-detection-system-using-a-hybrid-model" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/160685.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">85</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">2</span> Sentiment Analysis of Fake Health News Using Naive Bayes Classification Models</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Danielle%20Shackley">Danielle Shackley</a>, <a href="https://publications.waset.org/abstracts/search?q=Yetunde%20Folajimi"> Yetunde Folajimi</a> </p> <p class="card-text"><strong>Abstract:</strong></p> As more people turn to the internet seeking health-related information, there is more risk of finding false, inaccurate, or dangerous information. Sentiment analysis is a natural language processing technique that assigns polarity scores to text, ranging from positive, neutral, and negative. In this research, we evaluate the weight of a sentiment analysis feature added to fake health news classification models. The dataset consists of existing reliably labeled health article headlines that were supplemented with health information collected about COVID-19 from social media sources. We started with data preprocessing and tested out various vectorization methods such as Count and TFIDF vectorization. We implemented 3 Naive Bayes classifier models, including Bernoulli, Multinomial, and Complement. To test the weight of the sentiment analysis feature on the dataset, we created benchmark Naive Bayes classification models without sentiment analysis, and those same models were reproduced, and the feature was added. We evaluated using the precision and accuracy scores. The Bernoulli initial model performed with 90% precision and 75.2% accuracy, while the model supplemented with sentiment labels performed with 90.4% precision and stayed constant at 75.2% accuracy. Our results show that the addition of sentiment analysis did not improve model precision by a wide margin; while there was no evidence of improvement in accuracy, we had a 1.9% improvement margin of the precision score with the Complement model. Future expansion of this work could include replicating the experiment process and substituting the Naive Bayes for a deep learning neural network model. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=sentiment%20analysis" title="sentiment analysis">sentiment analysis</a>, <a href="https://publications.waset.org/abstracts/search?q=Naive%20Bayes%20model" title=" Naive Bayes model"> Naive Bayes model</a>, <a href="https://publications.waset.org/abstracts/search?q=natural%20language%20processing" title=" natural language processing"> natural language processing</a>, <a href="https://publications.waset.org/abstracts/search?q=topic%20analysis" title=" topic analysis"> topic analysis</a>, <a href="https://publications.waset.org/abstracts/search?q=fake%20health%20news%20classification%20model" title=" fake health news classification model"> fake health news classification model</a> </p> <a href="https://publications.waset.org/abstracts/153302/sentiment-analysis-of-fake-health-news-using-naive-bayes-classification-models" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/153302.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">97</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">1</span> Web Data Scraping Technology Using Term Frequency Inverse Document Frequency to Enhance the Big Data Quality on Sentiment Analysis</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Sangita%20Pokhrel">Sangita Pokhrel</a>, <a href="https://publications.waset.org/abstracts/search?q=Nalinda%20Somasiri"> Nalinda Somasiri</a>, <a href="https://publications.waset.org/abstracts/search?q=Rebecca%20Jeyavadhanam"> Rebecca Jeyavadhanam</a>, <a href="https://publications.waset.org/abstracts/search?q=Swathi%20Ganesan"> Swathi Ganesan</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Tourism is a booming industry with huge future potential for global wealth and employment. There are countless data generated over social media sites every day, creating numerous opportunities to bring more insights to decision-makers. The integration of Big Data Technology into the tourism industry will allow companies to conclude where their customers have been and what they like. This information can then be used by businesses, such as those in charge of managing visitor centers or hotels, etc., and the tourist can get a clear idea of places before visiting. The technical perspective of natural language is processed by analysing the sentiment features of online reviews from tourists, and we then supply an enhanced long short-term memory (LSTM) framework for sentiment feature extraction of travel reviews. We have constructed a web review database using a crawler and web scraping technique for experimental validation to evaluate the effectiveness of our methodology. The text form of sentences was first classified through Vader and Roberta model to get the polarity of the reviews. In this paper, we have conducted study methods for feature extraction, such as Count Vectorization and TFIDF Vectorization, and implemented Convolutional Neural Network (CNN) classifier algorithm for the sentiment analysis to decide the tourist鈥檚 attitude towards the destinations is positive, negative, or simply neutral based on the review text that they posted online. The results demonstrated that from the CNN algorithm, after pre-processing and cleaning the dataset, we received an accuracy of 96.12% for the positive and negative sentiment analysis. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=counter%20vectorization" title="counter vectorization">counter vectorization</a>, <a href="https://publications.waset.org/abstracts/search?q=convolutional%20neural%20network" title=" convolutional neural network"> convolutional neural network</a>, <a href="https://publications.waset.org/abstracts/search?q=crawler" title=" crawler"> crawler</a>, <a href="https://publications.waset.org/abstracts/search?q=data%20technology" title=" data technology"> data technology</a>, <a href="https://publications.waset.org/abstracts/search?q=long%20short-term%20memory" title=" long short-term memory"> long short-term memory</a>, <a href="https://publications.waset.org/abstracts/search?q=web%20scraping" title=" web scraping"> web scraping</a>, <a href="https://publications.waset.org/abstracts/search?q=sentiment%20analysis" title=" sentiment analysis"> sentiment analysis</a> </p> <a href="https://publications.waset.org/abstracts/160194/web-data-scraping-technology-using-term-frequency-inverse-document-frequency-to-enhance-the-big-data-quality-on-sentiment-analysis" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/160194.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">88</span> </span> </div> </div> </div> </main> <footer> <div id="infolinks" class="pt-3 pb-2"> <div class="container"> <div style="background-color:#f5f5f5;" class="p-3"> <div class="row"> <div class="col-md-2"> <ul class="list-unstyled"> About <li><a href="https://waset.org/page/support">About Us</a></li> <li><a href="https://waset.org/page/support#legal-information">Legal</a></li> <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/WASET-16th-foundational-anniversary.pdf">WASET celebrates its 16th foundational anniversary</a></li> </ul> </div> <div class="col-md-2"> <ul class="list-unstyled"> Account <li><a href="https://waset.org/profile">My Account</a></li> </ul> </div> <div class="col-md-2"> <ul class="list-unstyled"> Explore <li><a href="https://waset.org/disciplines">Disciplines</a></li> <li><a href="https://waset.org/conferences">Conferences</a></li> <li><a href="https://waset.org/conference-programs">Conference Program</a></li> <li><a href="https://waset.org/committees">Committees</a></li> <li><a href="https://publications.waset.org">Publications</a></li> </ul> </div> <div class="col-md-2"> <ul class="list-unstyled"> Research <li><a href="https://publications.waset.org/abstracts">Abstracts</a></li> <li><a href="https://publications.waset.org">Periodicals</a></li> <li><a href="https://publications.waset.org/archive">Archive</a></li> </ul> </div> <div class="col-md-2"> <ul class="list-unstyled"> Open Science <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/Open-Science-Philosophy.pdf">Open Science Philosophy</a></li> <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/Open-Science-Award.pdf">Open Science Award</a></li> <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/Open-Society-Open-Science-and-Open-Innovation.pdf">Open Innovation</a></li> <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/Postdoctoral-Fellowship-Award.pdf">Postdoctoral Fellowship Award</a></li> <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/Scholarly-Research-Review.pdf">Scholarly Research Review</a></li> </ul> </div> <div class="col-md-2"> <ul class="list-unstyled"> Support <li><a href="https://waset.org/page/support">Support</a></li> <li><a href="https://waset.org/profile/messages/create">Contact Us</a></li> <li><a href="https://waset.org/profile/messages/create">Report Abuse</a></li> </ul> </div> </div> </div> </div> </div> <div class="container text-center"> <hr style="margin-top:0;margin-bottom:.3rem;"> <a href="https://creativecommons.org/licenses/by/4.0/" target="_blank" class="text-muted small">Creative Commons Attribution 4.0 International License</a> <div id="copy" class="mt-2">© 2024 World Academy of Science, Engineering and Technology</div> </div> </footer> <a href="javascript:" id="return-to-top"><i class="fas fa-arrow-up"></i></a> <div class="modal" id="modal-template"> <div class="modal-dialog"> <div class="modal-content"> <div class="row m-0 mt-1"> <div class="col-md-12"> <button type="button" class="close" data-dismiss="modal" aria-label="Close"><span aria-hidden="true">×</span></button> </div> </div> <div class="modal-body"></div> </div> </div> </div> <script src="https://cdn.waset.org/static/plugins/jquery-3.3.1.min.js"></script> <script src="https://cdn.waset.org/static/plugins/bootstrap-4.2.1/js/bootstrap.bundle.min.js"></script> <script src="https://cdn.waset.org/static/js/site.js?v=150220211556"></script> <script> jQuery(document).ready(function() { /*jQuery.get("https://publications.waset.org/xhr/user-menu", function (response) { jQuery('#mainNavMenu').append(response); });*/ jQuery.get({ url: "https://publications.waset.org/xhr/user-menu", cache: false }).then(function(response){ jQuery('#mainNavMenu').append(response); }); }); </script> </body> </html>

CINXE.COM

Search results for: TF-IDF