Search results for: speaker recognition

<!DOCTYPE html> <html lang="en" dir="ltr"> <head> <!-- Google tag (gtag.js) --> <script async src="https://www.googletagmanager.com/gtag/js?id=G-P63WKM1TM1"></script> <script> window.dataLayer = window.dataLayer || []; function gtag(){dataLayer.push(arguments);} gtag('js', new Date()); gtag('config', 'G-P63WKM1TM1'); </script> <!-- Yandex.Metrika counter --> <script type="text/javascript" > (function(m,e,t,r,i,k,a){m[i]=m[i]||function(){(m[i].a=m[i].a||[]).push(arguments)}; m[i].l=1*new Date(); for (var j = 0; j < document.scripts.length; j++) {if (document.scripts[j].src === r) { return; }} k=e.createElement(t),a=e.getElementsByTagName(t)[0],k.async=1,k.src=r,a.parentNode.insertBefore(k,a)}) (window, document, "script", "https://mc.yandex.ru/metrika/tag.js", "ym"); ym(55165297, "init", { clickmap:false, trackLinks:true, accurateTrackBounce:true, webvisor:false }); </script> <noscript><div><img src="https://mc.yandex.ru/watch/55165297" style="position:absolute; left:-9999px;" alt="" /></div></noscript> <!-- /Yandex.Metrika counter --> <!-- Matomo --> <!-- End Matomo Code --> <title>Search results for: speaker recognition</title> <meta name="description" content="Search results for: speaker recognition"> <meta name="keywords" content="speaker recognition"> <meta name="viewport" content="width=device-width, initial-scale=1, minimum-scale=1, maximum-scale=1, user-scalable=no"> <meta charset="utf-8"> <link href="https://cdn.waset.org/favicon.ico" type="image/x-icon" rel="shortcut icon"> <link href="https://cdn.waset.org/static/plugins/bootstrap-4.2.1/css/bootstrap.min.css" rel="stylesheet"> <link href="https://cdn.waset.org/static/plugins/fontawesome/css/all.min.css" rel="stylesheet"> <link href="https://cdn.waset.org/static/css/site.css?v=150220211555" rel="stylesheet"> </head> <body> <header> <div class="container"> <nav class="navbar navbar-expand-lg navbar-light"> <a class="navbar-brand" href="https://waset.org"> <img 
src="https://cdn.waset.org/static/images/wasetc.png" alt="Open Science Research Excellence" title="Open Science Research Excellence" /> </a> <button class="d-block d-lg-none navbar-toggler ml-auto" type="button" data-toggle="collapse" data-target="#navbarMenu" aria-controls="navbarMenu" aria-expanded="false" aria-label="Toggle navigation"> <span class="navbar-toggler-icon"></span> </button> <div class="w-100"> <div class="d-none d-lg-flex flex-row-reverse"> <form method="get" action="https://waset.org/search" class="form-inline my-2 my-lg-0"> <input class="form-control mr-sm-2" type="search" placeholder="Search Conferences" value="speaker recognition" name="q" aria-label="Search"> <button class="btn btn-light my-2 my-sm-0" type="submit"><i class="fas fa-search"></i></button> </form> </div> <div class="collapse navbar-collapse mt-1" id="navbarMenu"> <ul class="navbar-nav ml-auto align-items-center" id="mainNavMenu"> <li class="nav-item"> <a class="nav-link" href="https://waset.org/conferences" title="Conferences in 2024/2025/2026">Conferences</a> </li> <li class="nav-item"> <a class="nav-link" href="https://waset.org/disciplines" title="Disciplines">Disciplines</a> </li> <li class="nav-item"> <a class="nav-link" href="https://waset.org/committees" rel="nofollow">Committees</a> </li> <li class="nav-item dropdown"> <a class="nav-link dropdown-toggle" href="#" id="navbarDropdownPublications" role="button" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false"> Publications </a> <div class="dropdown-menu" aria-labelledby="navbarDropdownPublications"> <a class="dropdown-item" href="https://publications.waset.org/abstracts">Abstracts</a> <a class="dropdown-item" href="https://publications.waset.org">Periodicals</a> <a class="dropdown-item" href="https://publications.waset.org/archive">Archive</a> </div> </li> <li class="nav-item"> <a class="nav-link" href="https://waset.org/page/support" title="Support">Support</a> </li> </ul> </div> </div> </nav> </div> 
</header> <main> <div class="container mt-4"> <div class="row"> <div class="col-md-9 mx-auto"> <form method="get" action="https://publications.waset.org/abstracts/search"> <div id="custom-search-input"> <div class="input-group"> <i class="fas fa-search"></i> <input type="text" class="search-query" name="q" placeholder="Author, Title, Abstract, Keywords" value="speaker recognition"> <input type="submit" class="btn_search" value="Search"> </div> </div> </form> </div> </div> <div class="row mt-3"> <div class="col-sm-3"> <div class="card"> <div class="card-body"><strong>Commenced</strong> in January 2007</div> </div> </div> <div class="col-sm-3"> <div class="card"> <div class="card-body"><strong>Frequency:</strong> Monthly</div> </div> </div> <div class="col-sm-3"> <div class="card"> <div class="card-body"><strong>Edition:</strong> International</div> </div> </div> <div class="col-sm-3"> <div class="card"> <div class="card-body"><strong>Paper Count:</strong> 1819</div> </div> </div> </div> <h1 class="mt-3 mb-3 text-center" style="font-size:1.6rem;">Search results for: speaker recognition</h1> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">1819</span> Speaker Recognition Using LIRA Neural Networks</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Nestor%20A.%20Garcia%20Fragoso">Nestor A. Garcia Fragoso</a>, <a href="https://publications.waset.org/abstracts/search?q=Tetyana%20Baydyk"> Tetyana Baydyk</a>, <a href="https://publications.waset.org/abstracts/search?q=Ernst%20Kussul"> Ernst Kussul</a> </p> <p class="card-text"><strong>Abstract:</strong></p> This article contains information from our investigation in the field of voice recognition. For this purpose, we created a voice database that contains different phrases in two languages, English and Spanish, for men and women. 
As a classifier, the LIRA (Limited Receptive Area) grayscale neural classifier was selected. The LIRA grayscale neural classifier was developed for image recognition tasks and demonstrated good results. Therefore, we decided to develop a voice recognition system using this classifier. The system can recognize a speaker&rsquo;s voice from a specific set of speakers. To do so, it takes spectrograms of the voice signals as input, extracts their characteristics and identifies the speaker. The results are described and analyzed in this article. The classifier can be used for speaker identification in security systems or in smart buildings for different types of intelligent devices. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=extreme%20learning" title="extreme learning">extreme learning</a>, <a href="https://publications.waset.org/abstracts/search?q=LIRA%20neural%20classifier" title=" LIRA neural classifier"> LIRA neural classifier</a>, <a href="https://publications.waset.org/abstracts/search?q=speaker%20identification" title=" speaker identification"> speaker identification</a>, <a href="https://publications.waset.org/abstracts/search?q=voice%20recognition" title=" voice recognition"> voice recognition</a> </p> <a href="https://publications.waset.org/abstracts/112384/speaker-recognition-using-lira-neural-networks" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/112384.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">177</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">1818</span> Multi-Modal Feature Fusion Network for Speaker Recognition Task</h5> <div class="card-body"> <p 
class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Xiang%20Shijie">Xiang Shijie</a>, <a href="https://publications.waset.org/abstracts/search?q=Zhou%20Dong"> Zhou Dong</a>, <a href="https://publications.waset.org/abstracts/search?q=Tian%20Dan"> Tian Dan</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Speaker recognition is a crucial task in the field of speech processing, aimed at identifying individuals based on their vocal characteristics. However, existing speaker recognition methods face numerous challenges. Traditional methods primarily rely on audio signals, which often suffer from limitations in noisy environments, variations in speaking style, and insufficient sample sizes. Additionally, relying solely on audio features can sometimes fail to capture the unique identity of the speaker comprehensively, impacting recognition accuracy. To address these issues, we propose a multi-modal network architecture that simultaneously processes both audio and text signals. By gradually integrating audio and text features, we leverage the strengths of both modalities to enhance the robustness and accuracy of speaker recognition. Our experiments demonstrate significant improvements with this multi-modal approach, particularly in complex environments, where recognition performance has been notably enhanced. Our research not only highlights the limitations of current speaker recognition methods but also showcases the effectiveness of multi-modal fusion techniques in overcoming these limitations, providing valuable insights for future research. 
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=feature%20fusion" title="feature fusion">feature fusion</a>, <a href="https://publications.waset.org/abstracts/search?q=memory%20network" title=" memory network"> memory network</a>, <a href="https://publications.waset.org/abstracts/search?q=multimodal%20input" title=" multimodal input"> multimodal input</a>, <a href="https://publications.waset.org/abstracts/search?q=speaker%20recognition" title=" speaker recognition"> speaker recognition</a> </p> <a href="https://publications.waset.org/abstracts/191527/multi-modal-feature-fusion-network-for-speaker-recognition-task" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/191527.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">32</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">1817</span> Developed Text-Independent Speaker Verification System</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Mohammed%20Arif">Mohammed Arif</a>, <a href="https://publications.waset.org/abstracts/search?q=Abdessalam%20Kifouche"> Abdessalam Kifouche</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Speech is a very convenient way of communication between people and machines. It conveys information about the identity of the talker. Since speaker recognition technology is increasingly securing our everyday lives, the objective of this paper is to develop two automatic text-independent speaker verification systems (TI SV) using low-level spectral features and machine learning methods. 
(i) The first system is based on a support vector machine (SVM), which has been widely used in voice signal processing for speaker recognition, that is, verifying the identity of a speaker from their voice characteristics, and (ii) the second is based on a Gaussian Mixture Model (GMM) and a Universal Background Model (UBM), which combine different functions from different resources to implement the SVM-based system. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=speaker%20verification" title="speaker verification">speaker verification</a>, <a href="https://publications.waset.org/abstracts/search?q=text-independent" title=" text-independent"> text-independent</a>, <a href="https://publications.waset.org/abstracts/search?q=support%20vector%20machine" title=" support vector machine"> support vector machine</a>, <a href="https://publications.waset.org/abstracts/search?q=Gaussian%20mixture%20model" title=" Gaussian mixture model"> Gaussian mixture model</a>, <a href="https://publications.waset.org/abstracts/search?q=cepstral%20analysis" title=" cepstral analysis"> cepstral analysis</a> </p> <a href="https://publications.waset.org/abstracts/183493/developed-text-independent-speaker-verification-system" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/183493.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">58</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">1816</span> Modified Form of Margin Based Angular Softmax Loss for Speaker Verification</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Jamshaid%20ul%20Rahman">Jamshaid ul Rahman</a>, <a 
href="https://publications.waset.org/abstracts/search?q=Akhter%20Ali"> Akhter Ali</a>, <a href="https://publications.waset.org/abstracts/search?q=Adnan%20Manzoor"> Adnan Manzoor</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Learning-based systems have received increasing interest in recent years; recognition structures, including end-to-end speaker recognition, are one of the hot topics in this area. A well-known work on end-to-end speaker verification using Angular Softmax Loss gained significant importance and is considered useful because it directly trains a discriminative model instead of using the traditionally adopted i-vector approach. The margin-based strategy in angular softmax is beneficial for learning discriminative speaker embeddings, but the random selection of margin values is a major issue in both additive angular margin and multiplicative angular margin. As a better solution, we present an alternative approach that introduces a somewhat similar form of an additive parameter originally introduced for face recognition; it can adjust automatically to the corresponding margin values and can learn more discriminative features than Softmax. Experiments are conducted on part of the Fisher dataset, where it is observed that using the additive parameter with angular softmax to train the front-end and probabilistic linear discriminant analysis (PLDA) in the back-end boosts the performance of the structure. 
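Editorial sketch, not taken from the paper: the snippet below illustrates the standard additive angular margin that the abstract refers to, with a fixed margin chosen by hand. It does not implement the authors' self-adjusting additive parameter. The function names and the default values of margin and scale are assumptions made for illustration only, and the sketch ignores the case where the angle plus the margin exceeds pi.

```python
import math

def additive_angular_margin_logits(cosines, target, margin=0.2, scale=30.0):
    """Scale all class cosines; widen the target-class angle by `margin` first.

    `cosines[k]` is the cosine between an embedding and the weight vector of
    class k. `margin` and `scale` are illustrative defaults, not paper values.
    """
    logits = []
    for k, cos_theta in enumerate(cosines):
        cos_theta = max(-1.0, min(1.0, cos_theta))  # keep acos in its domain
        if k == target:
            # Penalise the true class: cos(theta + m) instead of cos(theta).
            cos_theta = math.cos(math.acos(cos_theta) + margin)
        logits.append(scale * cos_theta)
    return logits

def cross_entropy(logits, target):
    """Softmax cross-entropy for one example, computed stably."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[target]

# Hypothetical cosines for three speakers; speaker 0 is the true class.
cosines = [0.8, 0.1, -0.3]
plain = cross_entropy(additive_angular_margin_logits(cosines, 0, margin=0.0), 0)
with_margin = cross_entropy(additive_angular_margin_logits(cosines, 0, margin=0.2), 0)
```

With a positive margin the loss for the same embedding is larger, so training has to pull the embedding closer to its class centre to reach the same loss. Picking that margin by hand is the difficulty the abstract points to.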
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=additive%20parameter" title="additive parameter">additive parameter</a>, <a href="https://publications.waset.org/abstracts/search?q=angular%20softmax" title=" angular softmax"> angular softmax</a>, <a href="https://publications.waset.org/abstracts/search?q=speaker%20verification" title=" speaker verification"> speaker verification</a>, <a href="https://publications.waset.org/abstracts/search?q=PLDA" title=" PLDA"> PLDA</a> </p> <a href="https://publications.waset.org/abstracts/152915/modified-form-of-margin-based-angular-softmax-loss-for-speaker-verification" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/152915.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">102</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">1815</span> Acoustic Analysis for Comparison and Identification of Normal and Disguised Speech of Individuals</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Surbhi%20Mathur">Surbhi Mathur</a>, <a href="https://publications.waset.org/abstracts/search?q=J.%20M.%20Vyas"> J. M. Vyas</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Although forensic speaker recognition technology has developed rapidly, there are still many problems to be solved. The biggest problem arises when cases involving disguised voice samples come up for examination and identification. 
Such voice samples from anonymous callers are frequently encountered in crimes involving kidnapping, blackmailing, hoax extortion and many more, where the speaker makes a deliberate effort to manipulate their natural voice in order to conceal their identity for fear of being caught. Voice disguise seriously distorts the natural vocal parameters of the speaker and thus complicates the process of identification. The sole objective of this doctoral project is to determine whether definite opinions can be rendered in cases involving disguised speech. To that end, it experimentally determines the effects of different disguise forms on personal identification and the percentage rate of speaker recognition for various voice disguise techniques, such as raised pitch, lowered pitch, increased nasality, covering the mouth, constricting the vocal tract and placing an obstacle in the mouth. The amount of phonetic and acoustic variation between the artificial (disguised) and natural samples of an individual is analyzed and compared by auditory as well as spectrographic analysis. 
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=forensic" title="forensic">forensic</a>, <a href="https://publications.waset.org/abstracts/search?q=speaker%20recognition" title=" speaker recognition"> speaker recognition</a>, <a href="https://publications.waset.org/abstracts/search?q=voice" title=" voice"> voice</a>, <a href="https://publications.waset.org/abstracts/search?q=speech" title=" speech"> speech</a>, <a href="https://publications.waset.org/abstracts/search?q=disguise" title=" disguise"> disguise</a>, <a href="https://publications.waset.org/abstracts/search?q=identification" title=" identification"> identification</a> </p> <a href="https://publications.waset.org/abstracts/47439/acoustic-analysis-for-comparison-and-identification-of-normal-and-disguised-speech-of-individuals" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/47439.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">368</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">1814</span> A Two-Step Framework for Unsupervised Speaker Segmentation Using BIC and Artificial Neural Network</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Ahmad%20Alwosheel">Ahmad Alwosheel</a>, <a href="https://publications.waset.org/abstracts/search?q=Ahmed%20Alqaraawi"> Ahmed Alqaraawi</a> </p> <p class="card-text"><strong>Abstract:</strong></p> This work proposes a new speaker segmentation approach for two speakers. It is an online approach that does not require prior information about speaker models. 
It has two phases: in the first phase, a conventional approach such as unsupervised BIC-based segmentation is used to detect speaker changes and train a neural network, while in the second phase, the trained parameters output by the neural network are used to predict the next incoming audio stream. Using this approach, accuracy comparable to similar BIC-based approaches is achieved with a significant improvement in computation time. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=artificial%20neural%20network" title="artificial neural network">artificial neural network</a>, <a href="https://publications.waset.org/abstracts/search?q=diarization" title=" diarization"> diarization</a>, <a href="https://publications.waset.org/abstracts/search?q=speaker%20indexing" title=" speaker indexing"> speaker indexing</a>, <a href="https://publications.waset.org/abstracts/search?q=speaker%20segmentation" title=" speaker segmentation"> speaker segmentation</a> </p> <a href="https://publications.waset.org/abstracts/27191/a-two-step-framework-for-unsupervised-speaker-segmentation-using-bic-and-artificial-neural-network" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/27191.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">502</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">1813</span> Effect of Clinical Depression on Automatic Speaker Verification</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Sheeraz%20Memon">Sheeraz Memon</a>, <a href="https://publications.waset.org/abstracts/search?q=Namunu%20C.%20Maddage"> Namunu C. 
Maddage</a>, <a href="https://publications.waset.org/abstracts/search?q=Margaret%20Lech"> Margaret Lech</a>, <a href="https://publications.waset.org/abstracts/search?q=Nicholas%20Allen"> Nicholas Allen</a> </p> <p class="card-text"><strong>Abstract:</strong></p> The effect of a clinical environment on the accuracy of speaker verification was tested. The speaker verification tests were performed within homogeneous environments containing clinically depressed speakers only and non-depressed speakers only, as well as within mixed environments containing different mixtures of both clinically depressed and non-depressed speakers. The speaker verification framework used MFCC features and the GMM modeling and classification method. The speaker verification experiments within homogeneous environments showed a 5.1% increase in the equal error rate (EER) within the clinically depressed environment compared to the non-depressed environment. This indicates that clinical depression increases intra-speaker variability and makes the speaker verification task more challenging. Experiments with mixed environments indicated that increasing the percentage of depressed individuals within a mixed environment increases the speaker verification equal error rates. 
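Editorial sketch, not taken from the paper: the equal error rate quoted above is the operating point at which the false-acceptance rate and the false-rejection rate of a verification system coincide. The snippet below estimates it from two lists of trial scores by scanning candidate thresholds. The function name and all scores are hypothetical, and with a finite number of trials the two rates may only be approximately equal, so the sketch returns their average at the closest threshold.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Return (eer, threshold) at the threshold where FAR and FRR are closest.

    A trial is accepted when its score is >= threshold.
    """
    best = None
    for threshold in sorted(set(genuine_scores) | set(impostor_scores)):
        # False rejection: a genuine speaker scored below the threshold.
        frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
        # False acceptance: an impostor scored at or above the threshold.
        far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, threshold)
    return best[1], best[2]

# Hypothetical verification scores (higher means "more likely the claimed speaker").
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.6, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(genuine, impostor)
```

Greater intra-speaker variability spreads the genuine scores toward the impostor range, which raises both error rates at any threshold and therefore raises the EER.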
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=speaker%20verification" title="speaker verification">speaker verification</a>, <a href="https://publications.waset.org/abstracts/search?q=GMM" title=" GMM"> GMM</a>, <a href="https://publications.waset.org/abstracts/search?q=EM" title=" EM"> EM</a>, <a href="https://publications.waset.org/abstracts/search?q=clinical%20environment" title=" clinical environment"> clinical environment</a>, <a href="https://publications.waset.org/abstracts/search?q=clinical%20depression" title=" clinical depression"> clinical depression</a> </p> <a href="https://publications.waset.org/abstracts/39436/effect-of-clinical-depression-on-automatic-speaker-verification" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/39436.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">375</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">1812</span> An Automatic Speech Recognition of Conversational Telephone Speech in Malay Language</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=M.%20Draman">M. Draman</a>, <a href="https://publications.waset.org/abstracts/search?q=S.%20Z.%20Muhamad%20Yassin"> S. Z. Muhamad Yassin</a>, <a href="https://publications.waset.org/abstracts/search?q=M.%20S.%20Alias"> M. S. Alias</a>, <a href="https://publications.waset.org/abstracts/search?q=Z.%20Lambak"> Z. Lambak</a>, <a href="https://publications.waset.org/abstracts/search?q=M.%20I.%20Zulkifli"> M. I. Zulkifli</a>, <a href="https://publications.waset.org/abstracts/search?q=S.%20N.%20Padhi"> S. N. 
Padhi</a>, <a href="https://publications.waset.org/abstracts/search?q=K.%20N.%20Baharim"> K. N. Baharim</a>, <a href="https://publications.waset.org/abstracts/search?q=F.%20Maskuriy"> F. Maskuriy</a>, <a href="https://publications.waset.org/abstracts/search?q=A.%20I.%20A.%20Rahim"> A. I. A. Rahim</a> </p> <p class="card-text"><strong>Abstract:</strong></p> The performance of a Malay automatic speech recognition (ASR) system for the call centre environment is presented. The system uses the Kaldi toolkit as the platform for all the libraries and algorithms used in performing the ASR task. The acoustic model in this system uses a deep neural network (DNN) to model the acoustic signal, and a standard n-gram model is used for language modelling. With 80 hours of training data from call centre recordings, the ASR system achieves 72% accuracy, which corresponds to a word error rate (WER) of 28%. Testing was done using 20 hours of audio data. Despite the use of a DNN, the system shows low accuracy owing to the variety of noises, accents and dialects that typically occur in the Malaysian call centre environment. This significant variation among speakers is reflected in the large standard deviation of the average word error rate (WERav) (i.e., ~ 10%). The lowest WER (13.8%) was obtained from a recording sample of a native speaker using the standard Malay dialect (central Malaysia), whereas the sample with the highest WER (49%) contains conversation by a speaker who uses a non-standard Malay dialect. 
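Editorial sketch, not taken from the paper: word error rate is commonly computed as the word-level edit distance (substitutions, deletions and insertions) between the reference transcript and the recognizer output, divided by the number of reference words. The abstract's pairing of 72% accuracy with 28% WER matches accuracy = 1 - WER. The function name and the example sentences below are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            )
        prev = curr
    return prev[-1] / len(ref)

# Hypothetical call-centre utterance and recognizer output.
reference = "please transfer my call to billing"
hypothesis = "please transfer call to the billing"
wer = word_error_rate(reference, hypothesis)
word_accuracy = 1.0 - wer
```

Because insertions count as errors, WER can exceed 100% on very poor output, in which case the accuracy figure derived this way becomes negative.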
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=conversational%20speech%20recognition" title="conversational speech recognition">conversational speech recognition</a>, <a href="https://publications.waset.org/abstracts/search?q=deep%20neural%20network" title=" deep neural network"> deep neural network</a>, <a href="https://publications.waset.org/abstracts/search?q=Malay%20language" title=" Malay language"> Malay language</a>, <a href="https://publications.waset.org/abstracts/search?q=speech%20recognition" title=" speech recognition"> speech recognition</a> </p> <a href="https://publications.waset.org/abstracts/93292/an-automatic-speech-recognition-of-conversational-telephone-speech-in-malay-language" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/93292.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">322</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">1811</span> Comparative Methods for Speech Enhancement and the Effects on Text-Independent Speaker Identification Performance</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=R.%20Ajgou">R. Ajgou</a>, <a href="https://publications.waset.org/abstracts/search?q=S.%20Sbaa"> S. Sbaa</a>, <a href="https://publications.waset.org/abstracts/search?q=S.%20Ghendir"> S. Ghendir</a>, <a href="https://publications.waset.org/abstracts/search?q=A.%20Chemsa"> A. Chemsa</a>, <a href="https://publications.waset.org/abstracts/search?q=A.%20Taleb-Ahmed"> A. Taleb-Ahmed</a> </p> <p class="card-text"><strong>Abstract:</strong></p> The purpose of a speech enhancement algorithm is to improve speech quality. 
In this paper, we review some speech enhancement methods and evaluate their performance based on Perceptual Evaluation of Speech Quality (PESQ, ITU-T P.862) scores. All methods were evaluated in the presence of different kinds of noise using the TIMIT database and the NOIZEUS noisy speech corpus. The noise was taken from the AURORA database and includes suburban train noise, babble, car, exhibition hall, restaurant, street, airport and train station noise. Simulation results showed improved speech enhancement performance for the tracking of non-stationary noise approach in comparison with the other methods in terms of the PESQ measure. Moreover, we have evaluated the effects of the speech enhancement technique on a speaker identification system based on an autoregressive (AR) model and Mel-frequency cepstral coefficients (MFCC). <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=speech%20enhancement" title="speech enhancement">speech enhancement</a>, <a href="https://publications.waset.org/abstracts/search?q=pesq" title=" pesq"> pesq</a>, <a href="https://publications.waset.org/abstracts/search?q=speaker%20recognition" title=" speaker recognition"> speaker recognition</a>, <a href="https://publications.waset.org/abstracts/search?q=MFCC" title=" MFCC"> MFCC</a> </p> <a href="https://publications.waset.org/abstracts/31102/comparative-methods-for-speech-enhancement-and-the-effects-on-text-independent-speaker-identification-performance" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/31102.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">424</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">1810</span> Unsupervised Reciter Recognition Using Gaussian Mixture Models</h5> <div 
class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Ahmad%20Alwosheel">Ahmad Alwosheel</a>, <a href="https://publications.waset.org/abstracts/search?q=Ahmed%20Alqaraawi"> Ahmed Alqaraawi</a> </p> <p class="card-text"><strong>Abstract:</strong></p> This work proposes an unsupervised text-independent probabilistic approach to recognizing the voice of a Quran reciter. It is an accurate approach that works in real-time applications. This approach does not require prior information about reciter models. It has two phases: in the training phase, the reciters' acoustical features are modeled using Gaussian Mixture Models, while in the testing phase, an unlabeled reciter's acoustical features are compared against the GMM models. Using this approach, highly accurate results are achieved with efficient computation time. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=Quran" title="Quran">Quran</a>, <a href="https://publications.waset.org/abstracts/search?q=speaker%20recognition" title=" speaker recognition"> speaker recognition</a>, <a href="https://publications.waset.org/abstracts/search?q=reciter%20recognition" title=" reciter recognition"> reciter recognition</a>, <a href="https://publications.waset.org/abstracts/search?q=Gaussian%20Mixture%20Model" title=" Gaussian Mixture Model"> Gaussian Mixture Model</a> </p> <a href="https://publications.waset.org/abstracts/46532/unsupervised-reciter-recognition-using-gaussian-mixture-models" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/46532.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">380</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge 
badge-info">1809</span> The Effect of The Speaker&#039;s Speaking Style as A Factor of Understanding and Comfort of The Listener</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Made%20Rahayu%20Putri%20Saron">Made Rahayu Putri Saron</a>, <a href="https://publications.waset.org/abstracts/search?q=Mochamad%20Nizar%20Palefi%20Ma%E2%80%99ady"> Mochamad Nizar Palefi Ma’ady</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Communication skills are important in everyday life. Communication can be verbal, in oral or written form, or nonverbal, in the form of expressions or body movements. Good communication should convey information clearly and involve feedback between the speaker and the listener. However, the information conveyed is often unclear and listeners give no feedback, so it cannot be ensured that the communication is effective and understandable. The speaker's understanding of the topic is one of the factors that help the listener grasp the meaning of the conversation. In addition, the literature review found that the following factors have an important influence on a person's speaking style: (i) environmental conditions; (ii) voice, articulation, and accent; (iii) gender; (iv) personality; and (v) speech disorders (dysarthria). It can be concluded that the factors supporting the listener's understanding and comfort depend on the nature of the speaker (environmental conditions, voice, gender, personality) and on whether the speaker has a speech disorder. 
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=listener" title="listener">listener</a>, <a href="https://publications.waset.org/abstracts/search?q=public%20speaking" title=" public speaking"> public speaking</a>, <a href="https://publications.waset.org/abstracts/search?q=speaking%20style" title=" speaking style"> speaking style</a>, <a href="https://publications.waset.org/abstracts/search?q=understanding" title=" understanding"> understanding</a>, <a href="https://publications.waset.org/abstracts/search?q=and%20comfortable%20factor" title=" and comfortable factor"> and comfortable factor</a> </p> <a href="https://publications.waset.org/abstracts/145442/the-effect-of-the-speakers-speaking-style-as-a-factor-of-understanding-and-comfort-of-the-listener" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/145442.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">166</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">1808</span> Experimental Study on the Heat Transfer Characteristics of the 200W Class Woofer Speaker</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Hyung-Jin%20Kim">Hyung-Jin Kim</a>, <a href="https://publications.waset.org/abstracts/search?q=Dae-Wan%20Kim"> Dae-Wan Kim</a>, <a href="https://publications.waset.org/abstracts/search?q=Moo-Yeon%20Lee"> Moo-Yeon Lee</a> </p> <p class="card-text"><strong>Abstract:</strong></p> The objective of this study is to experimentally investigate the heat transfer characteristics of 200 W class woofer speaker units with the input voice signals. 
The temperature and heat transfer characteristics of the 200 W class woofer speaker unit were experimentally tested with input voice signals of 1500 Hz, 2500 Hz, and 5000 Hz. The experiments show that the temperature of the woofer speaker unit, including the voice-coil part, increases as the frequency of the input voice signal decreases. The temperature difference between the measured points of the voice coil also increases as the input signal frequency decreases. In addition, at a measuring time of 200 seconds, the heat transfer characteristics of the woofer speaker with a 1500 Hz input voice signal are 40% higher than those with a 5000 Hz input voice signal. It can be concluded from the experiments that the temperature initially increases rapidly with time and, after a certain period, increases exponentially. During this time-dependent temperature change, a high-frequency voice signal is also observed to be more stable than a low-frequency one. 
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=heat%20transfer" title="heat transfer">heat transfer</a>, <a href="https://publications.waset.org/abstracts/search?q=temperature" title=" temperature"> temperature</a>, <a href="https://publications.waset.org/abstracts/search?q=voice%20coil" title=" voice coil"> voice coil</a>, <a href="https://publications.waset.org/abstracts/search?q=woofer%20speaker" title=" woofer speaker"> woofer speaker</a> </p> <a href="https://publications.waset.org/abstracts/5142/experimental-study-on-the-heat-transfer-characteristics-of-the-200w-class-woofer-speaker" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/5142.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">360</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">1807</span> Using Maximization Entropy in Developing a Filipino Phonetically Balanced Wordlist for a Phoneme-Level Speech Recognition System</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=John%20Lorenzo%20Bautista">John Lorenzo Bautista</a>, <a href="https://publications.waset.org/abstracts/search?q=Yoon-Joong%20Kim"> Yoon-Joong Kim</a> </p> <p class="card-text"><strong>Abstract:</strong></p> In this paper, a set of Filipino Phonetically Balanced Word list consisting of 250 words (PBW250) were constructed for a phoneme-level ASR system for the Filipino language. The Entropy Maximization is used to obtain phonological balance in the list. 
The entropy of phonemes in a word is maximized using the Add-Delete Method (PBW algorithm), providing an optimal balance in each word’s phonological distribution, and the result is compared with a modified PBW algorithm implemented with a dynamic algorithm approach. The entropy scores obtained were 4.2791 and 4.2902 for the PBW and modified algorithms, respectively. The PBW250 was recorded by 40 respondents, each providing 2 sets of data. Recordings from 30 respondents were used to train an acoustic model, which was tested on recordings from 10 respondents using the HMM Toolkit (HTK). The tests gave a maximum accuracy of 97.77% for the speaker-dependent test and 89.36% for the speaker-independent test. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=entropy%20maximization" title="entropy maximization">entropy maximization</a>, <a href="https://publications.waset.org/abstracts/search?q=Filipino%20language" title=" Filipino language"> Filipino language</a>, <a href="https://publications.waset.org/abstracts/search?q=Hidden%20Markov%20Model" title=" Hidden Markov Model"> Hidden Markov Model</a>, <a href="https://publications.waset.org/abstracts/search?q=phonetically%20balanced%20words" title=" phonetically balanced words"> phonetically balanced words</a>, <a href="https://publications.waset.org/abstracts/search?q=speech%20recognition" title=" speech recognition"> speech recognition</a> </p> <a href="https://publications.waset.org/abstracts/10241/using-maximization-entropy-in-developing-a-filipino-phonetically-balanced-wordlist-for-a-phoneme-level-speech-recognition-system" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/10241.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">457</span> </span> </div> </div> <div class="card paper-listing mb-3 
mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">1806</span> USE-Net: SE-Block Enhanced U-Net Architecture for Robust Speaker Identification</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Kilari%20Nikhil">Kilari Nikhil</a>, <a href="https://publications.waset.org/abstracts/search?q=Ankur%20Tibrewal"> Ankur Tibrewal</a>, <a href="https://publications.waset.org/abstracts/search?q=Srinivas%20Kruthiventi%20S.%20S."> Srinivas Kruthiventi S. S.</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Conventional speaker identification systems often fall short of capturing the diverse variations present in speech data due to fixed-scale architectures. In this research, we propose a CNN-based architecture, USENet, designed to overcome these limitations. Leveraging two key techniques, our approach achieves superior performance on the VoxCeleb 1 Dataset without any pre-training. Firstly, we adopt a U-net-inspired design to extract features at multiple scales, empowering our model to capture speech characteristics effectively. Secondly, we introduce the squeeze and excitation block to enhance spatial feature learning. The proposed architecture showcases significant advancements in speaker identification, outperforming existing methods, and holds promise for future research in this domain. 
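<p class="card-text">As an illustration only, the squeeze and excitation step mentioned in the abstract can be sketched in plain Python. This is the generic block as it is commonly described (global average pooling, a bottleneck layer with ReLU, a sigmoid gate, then per-channel rescaling), not the authors' USENet code. The function name <code>se_block</code>, the weight layout, and the omission of biases are assumptions made for this sketch.</p>

```python
import math


def se_block(fmap, w1, w2):
    """Generic squeeze-and-excitation sketch (hypothetical helper, biases omitted).

    fmap: list of C channels, each an H x W list of floats.
    w1:   bottleneck weights, one row of C values per hidden unit.
    w2:   gate weights, one row of hidden-size values per channel.
    """
    # Squeeze: global average pooling gives one scalar per channel.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]
    # Excitation, part 1: bottleneck dense layer followed by ReLU.
    h = [max(0.0, sum(w * x for w, x in zip(row, z))) for row in w1]
    # Excitation, part 2: dense layer followed by a sigmoid gate in (0, 1).
    s = [1.0 / (1.0 + math.exp(-sum(w * x for w, x in zip(row, h)))) for row in w2]
    # Scale: multiply every value in a channel by that channel's gate.
    return [[[v * g for v in row] for row in ch] for ch, g in zip(fmap, s)]


# Two channels of size 1 x 2; zero bottleneck weights give a gate of 0.5.
demo = se_block([[[1.0, 3.0]], [[2.0, 2.0]]], [[0.0, 0.0]], [[1.0], [1.0]])
```

<p class="card-text">In a trained network the two weight matrices would be learned, so each channel's gate would reflect how informative that channel is for the input at hand.</p>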
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=multi-scale%20feature%20extraction" title="multi-scale feature extraction">multi-scale feature extraction</a>, <a href="https://publications.waset.org/abstracts/search?q=squeeze%20and%20excitation" title=" squeeze and excitation"> squeeze and excitation</a>, <a href="https://publications.waset.org/abstracts/search?q=VoxCeleb1%20speaker%20identification" title=" VoxCeleb1 speaker identification"> VoxCeleb1 speaker identification</a>, <a href="https://publications.waset.org/abstracts/search?q=mel-spectrograms" title=" mel-spectrograms"> mel-spectrograms</a>, <a href="https://publications.waset.org/abstracts/search?q=USENet" title=" USENet"> USENet</a> </p> <a href="https://publications.waset.org/abstracts/170441/use-net-se-block-enhanced-u-net-architecture-for-robust-speaker-identification" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/170441.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">74</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">1805</span> A Cross-Dialect Statistical Analysis of Final Declarative Intonation in Tuvinian</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=D.%20Beziakina">D. Beziakina</a>, <a href="https://publications.waset.org/abstracts/search?q=E.%20Bulgakova"> E. 
Bulgakova</a> </p> <p class="card-text"><strong>Abstract:</strong></p> This study continues the research on Tuvinian intonation and presents a general cross-dialect analysis of intonation of Tuvinian declarative utterances, specifically the character of the tone movement in order to test the hypothesis about the prevalence of level tone in some Tuvinian dialects. The results of the analysis of basic pitch characteristics of Tuvinian speech (in general and in comparison with two other Turkic languages - Uzbek and Azerbaijani) are also given in this paper. The goal of our work was to obtain the ranges of pitch parameter values typical for Tuvinian speech. Such language-specific values can be used in speaker identification systems in order to get more accurate results of ethnic speech analysis. We also present the results of a cross-dialect analysis of declarative intonation in the poorly studied Tuvinian language. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=speech%20analysis" title="speech analysis">speech analysis</a>, <a href="https://publications.waset.org/abstracts/search?q=statistical%20analysis" title=" statistical analysis"> statistical analysis</a>, <a href="https://publications.waset.org/abstracts/search?q=speaker%20recognition" title=" speaker recognition"> speaker recognition</a>, <a href="https://publications.waset.org/abstracts/search?q=identification%20of%20person" title=" identification of person"> identification of person</a> </p> <a href="https://publications.waset.org/abstracts/12497/a-cross-dialect-statistical-analysis-of-final-declarative-intonation-in-tuvinian" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/12497.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">470</span> </span> </div> </div> <div class="card paper-listing 
mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">1804</span> Evaluation of Features Extraction Algorithms for a Real-Time Isolated Word Recognition System</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Tomyslav%20Sledevi%C4%8D">Tomyslav Sledevič</a>, <a href="https://publications.waset.org/abstracts/search?q=Art%C5%ABras%20Serackis"> Artūras Serackis</a>, <a href="https://publications.waset.org/abstracts/search?q=Gintautas%20Tamulevi%C4%8Dius"> Gintautas Tamulevičius</a>, <a href="https://publications.waset.org/abstracts/search?q=Dalius%20Navakauskas"> Dalius Navakauskas</a> </p> <p class="card-text"><strong>Abstract:</strong></p> This paper presents a comparative evaluation of features extraction algorithm for a real-time isolated word recognition system based on FPGA. The Mel-frequency cepstral, linear frequency cepstral, linear predictive and their cepstral coefficients were implemented in hardware/software design. The proposed system was investigated in the speaker-dependent mode for 100 different Lithuanian words. The robustness of features extraction algorithms was tested recognizing the speech records at different signals to noise rates. The experiments on clean records show highest accuracy for Mel-frequency cepstral and linear frequency cepstral coefficients. For records with 15 dB signal to noise rate the linear predictive cepstral coefficients give best result. The hard and soft part of the system is clocked on 50 MHz and 100 MHz accordingly. For the classification purpose, the pipelined dynamic time warping core was implemented. The proposed word recognition system satisfies the real-time requirements and is suitable for applications in embedded systems. 
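<p class="card-text">The dynamic time warping classification mentioned in the abstract can be sketched in software as follows. This is the textbook recurrence in plain Python, not the pipelined FPGA core from the paper. The names <code>dtw_distance</code> and <code>classify</code>, the Euclidean frame distance, and the template dictionary are assumptions made for this sketch.</p>

```python
import math


def dtw_distance(a, b):
    """Textbook DTW cost between two sequences of equal-dimension feature vectors."""
    n, m = len(a), len(b)
    # cost[i][j] is the best alignment cost of a[:i] against b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean distance between frames
            # Extend the cheapest of: insertion, deletion, or diagonal match.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def classify(query, templates):
    """Return the word whose stored template has the smallest DTW cost to the query."""
    return min(templates, key=lambda word: dtw_distance(query, templates[word]))


# Hypothetical one-dimensional "features"; real input would be MFCC, LFCC or LPCC frames.
templates = {"up": [[0.0], [1.0], [2.0]], "down": [[2.0], [1.0], [0.0]]}
label = classify([[0.0], [0.0], [1.0], [2.0]], templates)
```

<p class="card-text">Because the warping path may repeat frames, a slower or faster utterance of the same word can still align to its template at low cost, which is why DTW suits speaker-dependent isolated word recognition.</p>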
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=isolated%20word%20recognition" title="isolated word recognition">isolated word recognition</a>, <a href="https://publications.waset.org/abstracts/search?q=features%20extraction" title=" features extraction"> features extraction</a>, <a href="https://publications.waset.org/abstracts/search?q=MFCC" title=" MFCC"> MFCC</a>, <a href="https://publications.waset.org/abstracts/search?q=LFCC" title=" LFCC"> LFCC</a>, <a href="https://publications.waset.org/abstracts/search?q=LPCC" title=" LPCC"> LPCC</a>, <a href="https://publications.waset.org/abstracts/search?q=LPC" title=" LPC"> LPC</a>, <a href="https://publications.waset.org/abstracts/search?q=FPGA" title=" FPGA"> FPGA</a>, <a href="https://publications.waset.org/abstracts/search?q=DTW" title=" DTW"> DTW</a> </p> <a href="https://publications.waset.org/abstracts/2136/evaluation-of-features-extraction-algorithms-for-a-real-time-isolated-word-recognition-system" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/2136.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">495</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">1803</span> Handwriting Recognition of Gurmukhi Script: A Survey of Online and Offline Techniques</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Ravneet%20Kaur">Ravneet Kaur</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Character recognition is a very interesting area of pattern recognition. From past few decades, an intensive research on character recognition for Roman, Chinese, and Japanese and Indian scripts have been reported. 
This paper presents a review of handwritten character recognition work on the Indian script Gurmukhi. Most of the published papers are summarized, the various methodologies are analysed, and their results are reported. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=Gurmukhi%20character%20recognition" title="Gurmukhi character recognition">Gurmukhi character recognition</a>, <a href="https://publications.waset.org/abstracts/search?q=online" title=" online"> online</a>, <a href="https://publications.waset.org/abstracts/search?q=offline" title=" offline"> offline</a>, <a href="https://publications.waset.org/abstracts/search?q=HCR%20survey" title=" HCR survey"> HCR survey</a> </p> <a href="https://publications.waset.org/abstracts/46337/handwriting-recognition-of-gurmukhi-script-a-survey-of-online-and-offline-techniques" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/46337.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">424</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">1802</span> OCR/ICR Text Recognition Using ABBYY FineReader as an Example Text</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=A.%20R.%20Bagirzade">A. R. Bagirzade</a>, <a href="https://publications.waset.org/abstracts/search?q=A.%20Sh.%20Najafova"> A. Sh. Najafova</a>, <a href="https://publications.waset.org/abstracts/search?q=S.%20M.%20Yessirkepova"> S. M. Yessirkepova</a>, <a href="https://publications.waset.org/abstracts/search?q=E.%20S.%20Albert"> E. S. 
Albert</a> </p> <p class="card-text"><strong>Abstract:</strong></p> This article describes a text recognition method based on Optical Character Recognition (OCR). The features of the OCR method were examined using the ABBYY FineReader program. It describes automatic text recognition in images. OCR is necessary because optical input devices can only transmit raster graphics as a result. Text recognition describes the task of recognizing letters shown as such, to identify and assign them an assigned numerical value in accordance with the usual text encoding (ASCII, Unicode). The peculiarity of this study conducted by the authors using the example of the ABBYY FineReader, was confirmed and shown in practice, the improvement of digital text recognition platforms developed by Electronic Publication. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=ABBYY%20FineReader%20system" title="ABBYY FineReader system">ABBYY FineReader system</a>, <a href="https://publications.waset.org/abstracts/search?q=algorithm%20symbol%20recognition" title=" algorithm symbol recognition"> algorithm symbol recognition</a>, <a href="https://publications.waset.org/abstracts/search?q=OCR%2FICR%20techniques" title=" OCR/ICR techniques"> OCR/ICR techniques</a>, <a href="https://publications.waset.org/abstracts/search?q=recognition%20technologies" title=" recognition technologies"> recognition technologies</a> </p> <a href="https://publications.waset.org/abstracts/130255/ocricr-text-recognition-using-abbyy-finereader-as-an-example-text" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/130255.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">168</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge 
badge-info">1801</span> An Intelligent Text Independent Speaker Identification Using VQ-GMM Model Based Multiple Classifier System</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Ben%20Soltane%20Cheima">Ben Soltane Cheima</a>, <a href="https://publications.waset.org/abstracts/search?q=Ittansa%20Yonas%20Kelbesa"> Ittansa Yonas Kelbesa</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Speaker Identification (SI) is the task of establishing identity of an individual based on his/her voice characteristics. The SI task is typically achieved by two-stage signal processing: training and testing. The training process calculates speaker specific feature parameters from the speech and generates speaker models accordingly. In the testing phase, speech samples from unknown speakers are compared with the models and classified. Even though performance of speaker identification systems has improved due to recent advances in speech processing techniques, there is still need of improvement. In this paper, a Closed-Set Tex-Independent Speaker Identification System (CISI) based on a Multiple Classifier System (MCS) is proposed, using Mel Frequency Cepstrum Coefficient (MFCC) as feature extraction and suitable combination of vector quantization (VQ) and Gaussian Mixture Model (GMM) together with Expectation Maximization algorithm (EM) for speaker modeling. The use of Voice Activity Detector (VAD) with a hybrid approach based on Short Time Energy (STE) and Statistical Modeling of Background Noise in the pre-processing step of the feature extraction yields a better and more robust automatic speaker identification system. Also investigation of Linde-Buzo-Gray (LBG) clustering algorithm for initialization of GMM, for estimating the underlying parameters, in the EM step improved the convergence rate and systems performance. 
It also uses relative index as confidence measures in case of contradiction in identification process by GMM and VQ as well. Simulation results carried out on voxforge.org speech database using MATLAB highlight the efficacy of the proposed method compared to earlier work. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=feature%20extraction" title="feature extraction">feature extraction</a>, <a href="https://publications.waset.org/abstracts/search?q=speaker%20modeling" title=" speaker modeling"> speaker modeling</a>, <a href="https://publications.waset.org/abstracts/search?q=feature%20matching" title=" feature matching"> feature matching</a>, <a href="https://publications.waset.org/abstracts/search?q=Mel%20frequency%20cepstrum%20coefficient%20%28MFCC%29" title=" Mel frequency cepstrum coefficient (MFCC)"> Mel frequency cepstrum coefficient (MFCC)</a>, <a href="https://publications.waset.org/abstracts/search?q=Gaussian%20mixture%20model%20%28GMM%29" title=" Gaussian mixture model (GMM)"> Gaussian mixture model (GMM)</a>, <a href="https://publications.waset.org/abstracts/search?q=vector%20quantization%20%28VQ%29" title=" vector quantization (VQ)"> vector quantization (VQ)</a>, <a href="https://publications.waset.org/abstracts/search?q=Linde-Buzo-Gray%20%28LBG%29" title=" Linde-Buzo-Gray (LBG)"> Linde-Buzo-Gray (LBG)</a>, <a href="https://publications.waset.org/abstracts/search?q=expectation%20maximization%20%28EM%29" title=" expectation maximization (EM)"> expectation maximization (EM)</a>, <a href="https://publications.waset.org/abstracts/search?q=pre-processing" title=" pre-processing"> pre-processing</a>, <a href="https://publications.waset.org/abstracts/search?q=voice%20activity%20detection%20%28VAD%29" title=" voice activity detection (VAD)"> voice activity detection (VAD)</a>, <a href="https://publications.waset.org/abstracts/search?q=short%20time%20energy%20%28STE%29" title=" short time energy (STE)"> 
short time energy (STE)</a>, <a href="https://publications.waset.org/abstracts/search?q=background%20noise%20statistical%20modeling" title=" background noise statistical modeling"> background noise statistical modeling</a>, <a href="https://publications.waset.org/abstracts/search?q=closed-set%20tex-independent%20speaker%20identification%20system%20%28CISI%29" title=" closed-set tex-independent speaker identification system (CISI)"> closed-set tex-independent speaker identification system (CISI)</a> </p> <a href="https://publications.waset.org/abstracts/16253/an-intelligent-text-independent-speaker-identification-using-vq-gmm-model-based-multiple-classifier-system" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/16253.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">309</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">1800</span> An Improved OCR Algorithm on Appearance Recognition of Electronic Components Based on Self-adaptation of Multifont Template</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Zhu-Qing%20Jia">Zhu-Qing Jia</a>, <a href="https://publications.waset.org/abstracts/search?q=Tao%20Lin"> Tao Lin</a>, <a href="https://publications.waset.org/abstracts/search?q=Tong%20Zhou"> Tong Zhou</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Optical Character Recognition has been extensively utilized, but it is rarely employed specifically for the recognition of electronic components. This paper proposes a highly effective algorithm for appearance identification of integrated circuit components based on existing character recognition methods, and analyzes its pros and cons. 
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=optical%20character%20recognition" title="optical character recognition">optical character recognition</a>, <a href="https://publications.waset.org/abstracts/search?q=fuzzy%20page%20identification" title=" fuzzy page identification"> fuzzy page identification</a>, <a href="https://publications.waset.org/abstracts/search?q=mutual%20correlation%20matrix" title=" mutual correlation matrix"> mutual correlation matrix</a>, <a href="https://publications.waset.org/abstracts/search?q=confidence%20self-adaptation" title=" confidence self-adaptation"> confidence self-adaptation</a> </p> <a href="https://publications.waset.org/abstracts/14322/an-improved-ocr-algorithm-on-appearance-recognition-of-electronic-components-based-on-self-adaptation-of-multifont-template" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/14322.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">540</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">1799</span> Distant Speech Recognition Using Laser Doppler Vibrometer</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Yunbin%20Deng">Yunbin Deng</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Most existing applications of automatic speech recognition relies on cooperative subjects at a short distance to a microphone. Standoff speech recognition using microphone arrays can extend the subject to sensor distance somewhat, but it is still limited to only a few feet. As such, most deployed applications of standoff speech recognitions are limited to indoor use at short range. 
Moreover, these applications require an air pathway between the subject and the sensor to achieve a reasonable signal-to-noise ratio. This study reports long-range (50 feet) automatic speech recognition experiments using a Laser Doppler Vibrometer (LDV) sensor. It shows that the LDV sensor modality can extend the speech acquisition standoff distance far beyond microphone arrays, to hundreds of feet. In addition, LDV enables 'listening' through windows to uncooperative subjects. This enables new capabilities in automatic audio and speech intelligence, surveillance, and reconnaissance (ISR) for law enforcement, homeland security, and counter-terrorism applications. The Polytec LDV model OFV-505 is used in this study. To investigate the impact of different vibrating materials, five parallel LDV speech corpora, each consisting of 630 speakers, were collected from the vibrations of a glass window, a metal plate, a plastic box, a wood slate, and a concrete wall. These are common materials the application could encounter in daily life. The data were compared with their microphone counterpart to show the impact of the various materials on the spectrum of the LDV speech signal. State-of-the-art deep neural network modeling approaches are used to conduct continuous speaker-independent speech recognition on these LDV speech datasets. Preliminary phoneme recognition results using a time-delay neural network, bi-directional long short-term memory, and model fusion show great promise for using LDV for long-range speech recognition. To the best of the author’s knowledge, this is the first time an LDV has been reported for a long-distance speech recognition application. 
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=covert%20speech%20acquisition" title="covert speech acquisition">covert speech acquisition</a>, <a href="https://publications.waset.org/abstracts/search?q=distant%20speech%20recognition" title=" distant speech recognition"> distant speech recognition</a>, <a href="https://publications.waset.org/abstracts/search?q=DSR" title=" DSR"> DSR</a>, <a href="https://publications.waset.org/abstracts/search?q=laser%20Doppler%20vibrometer" title=" laser Doppler vibrometer"> laser Doppler vibrometer</a>, <a href="https://publications.waset.org/abstracts/search?q=LDV" title=" LDV"> LDV</a>, <a href="https://publications.waset.org/abstracts/search?q=speech%20intelligence%20surveillance%20and%20reconnaissance" title=" speech intelligence surveillance and reconnaissance"> speech intelligence surveillance and reconnaissance</a>, <a href="https://publications.waset.org/abstracts/search?q=ISR" title=" ISR"> ISR</a> </p> <a href="https://publications.waset.org/abstracts/99091/distant-speech-recognition-using-laser-doppler-vibrometer" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/99091.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">179</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">1798</span> English Learning Speech Assistant Speak Application in Artificial Intelligence</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Albatool%20Al%20Abdulwahid">Albatool Al Abdulwahid</a>, <a href="https://publications.waset.org/abstracts/search?q=Bayan%20Shakally"> Bayan Shakally</a>, <a 
href="https://publications.waset.org/abstracts/search?q=Mariam%20Mohamed"> Mariam Mohamed</a>, <a href="https://publications.waset.org/abstracts/search?q=Wed%20Almokri"> Wed Almokri</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Artificial intelligence has infiltrated every part of our lives and every field we can think of. With technical developments, artificial intelligence applications are becoming more prevalent. We chose ELSA speak because it is a notable example of an artificial intelligence application. ELSA speak is a smartphone application that is free to download on both iOS and Android smartphones. ELSA speak utilizes artificial intelligence to help non-native English speakers pronounce words and phrases like a native speaker, as well as enhance their English skills. It employs speech-recognition technology that helps the application improve the pronunciation of its users. This feature distinguishes ELSA from other voice recognition algorithms and increases the efficiency of the application. This study focused on evaluating the ELSA speak application by testing its degree of effectiveness based on survey questions. The results of the questionnaire were variable. The majority of the participants strongly agreed that ELSA has helped them enhance their pronunciation skills. However, a few participants were not confident about the application’s ability to assist them in their learning journey. 
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=ELSA%20speak%20application" title="ELSA speak application">ELSA speak application</a>, <a href="https://publications.waset.org/abstracts/search?q=artificial%20intelligence" title=" artificial intelligence"> artificial intelligence</a>, <a href="https://publications.waset.org/abstracts/search?q=speech-recognition%20technology" title=" speech-recognition technology"> speech-recognition technology</a>, <a href="https://publications.waset.org/abstracts/search?q=language%20learning" title=" language learning"> language learning</a>, <a href="https://publications.waset.org/abstracts/search?q=english%20pronunciation" title=" english pronunciation"> english pronunciation</a> </p> <a href="https://publications.waset.org/abstracts/151244/english-learning-speech-assistant-speak-application-in-artificial-intelligence" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/151244.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">106</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">1797</span> Facial Recognition on the Basis of Facial Fragments</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Tetyana%20Baydyk">Tetyana Baydyk</a>, <a href="https://publications.waset.org/abstracts/search?q=Ernst%20Kussul"> Ernst Kussul</a>, <a href="https://publications.waset.org/abstracts/search?q=Sandra%20Bonilla%20Meza"> Sandra Bonilla Meza</a> </p> <p class="card-text"><strong>Abstract:</strong></p> There are many articles that attempt to establish the role of different facial fragments in face recognition. 
Various approaches are used to estimate this role. Frequently, authors calculate the entropy corresponding to the fragment. This approach can only give an approximate estimate. In this paper, we propose to use a more direct measure of the importance of different fragments for face recognition. We propose to select a recognition method and a face database and experimentally investigate the recognition rate using different fragments of faces. We present two such experiments in the paper. We selected the PCNC neural classifier as the method for face recognition and parts of the LFW (Labeled Faces in the Wild) face database as training and testing sets. The recognition rate of the best experiment is comparable with the recognition rate obtained using the whole face. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=face%20recognition" title="face recognition">face recognition</a>, <a href="https://publications.waset.org/abstracts/search?q=labeled%20faces%20in%20the%20wild%20%28LFW%29%20database" title=" labeled faces in the wild (LFW) database"> labeled faces in the wild (LFW) database</a>, <a href="https://publications.waset.org/abstracts/search?q=random%20local%20descriptor%20%28RLD%29" title=" random local descriptor (RLD)"> random local descriptor (RLD)</a>, <a href="https://publications.waset.org/abstracts/search?q=random%20features" title=" random features"> random features</a> </p> <a href="https://publications.waset.org/abstracts/50117/facial-recognition-on-the-basis-of-facial-fragments" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/50117.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">360</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge 
badge-info">1796</span> The Difference of Learning Outcomes in Reading Comprehension between Text and Film as The Media in Indonesian Language for Foreign Speaker in Intermediate Level</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Siti%20Ayu%20Ningsih">Siti Ayu Ningsih</a> </p> <p class="card-text"><strong>Abstract:</strong></p> This study aims to find the differences in learning outcomes in reading comprehension when text and film are used as media in Indonesian Language for Foreign Speakers (BIPA) learning at the intermediate level. The study uses quantitative and qualitative research methods with a single respondent, a grade nine secondary-level student from D'Royal Morocco Integrative Islamic School. The quantitative method is used to calculate the learning outcomes after the appropriate action cycle has been applied, whereas the qualitative method is used to interpret and describe the findings derived from the quantitative method. The techniques used in this study are observation and testing. Based on the research, the text medium is more effective than film for intermediate-level learners of Indonesian Language for Foreign Speakers. This is because, when using film, the learner does not have enough time to note down difficult vocabulary or to look up its meaning in the dictionary. Text is more effective because it does not require additional time to note down difficult words. For words that are difficult or unfamiliar, the learner can immediately find the meaning in the dictionary. The presence of the text also helps the learner find the answers to the questions more easily by matching the vocabulary of the question to the text. 
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=Indonesian%20language%20for%20foreign%20speaker" title="Indonesian language for foreign speaker">Indonesian language for foreign speaker</a>, <a href="https://publications.waset.org/abstracts/search?q=learning%20outcome" title=" learning outcome"> learning outcome</a>, <a href="https://publications.waset.org/abstracts/search?q=media" title=" media"> media</a>, <a href="https://publications.waset.org/abstracts/search?q=reading%20comprehension" title=" reading comprehension"> reading comprehension</a> </p> <a href="https://publications.waset.org/abstracts/82676/the-difference-of-learning-outcomes-in-reading-comprehension-between-text-and-film-as-the-media-in-indonesian-language-for-foreign-speaker-in-intermediate-level" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/82676.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">197</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">1795</span> DBN-Based Face Recognition System Using Light Field</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Bing%20Gu">Bing Gu</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Most conventional facial recognition systems are based on image features such as LBP and SIFT. Recently, some DBN-based 2D facial recognition systems have been proposed. However, we find that there are few DBN-based 3D facial recognition systems and little related research. 3D facial images include all the individual biometric information. 
We can use this information to build more accurate features, so we present our DBN-based face recognition system using Light Field. A Light Field can be seen as another representation of a 3D image, and a Light Field Camera provides a way to capture one. We use a commercially available Light Field Camera as the capture device of our face recognition system, and the system achieves state-of-the-art performance while being as convenient as a conventional 2D face recognition system. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=DBN" title="DBN">DBN</a>, <a href="https://publications.waset.org/abstracts/search?q=face%20recognition" title=" face recognition"> face recognition</a>, <a href="https://publications.waset.org/abstracts/search?q=light%20field" title=" light field"> light field</a>, <a href="https://publications.waset.org/abstracts/search?q=Lytro" title=" Lytro"> Lytro</a> </p> <a href="https://publications.waset.org/abstracts/10821/dbn-based-face-recognition-system-using-light-field" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/10821.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">464</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">1794</span> Studying Second Language Learners&#039; Language Behavior from Conversation Analysis Perspective</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Yanyan%20Wang">Yanyan Wang</a> </p> <p class="card-text"><strong>Abstract:</strong></p> This paper on second language teaching and learning uses a conversation analysis (CA) approach and focuses on how second language learners of Chinese do repair when making clarification requests. 
In order to demonstrate their behavior in interaction, a comparison was made between native speakers of Chinese and non-native speakers of Chinese. The significance of the research is to make second language teachers and learners aware of repair and how to seek clarification. Utilizing the methodology of CA, the research involved two sets of naturally occurring recordings, one of native speaker students and the other of non-native speaker students. Both sets of recordings were telephone talks between students and teachers. There were 50 native speaker students and 50 non-native speaker students. After listening to the recordings multiple times, the parts with repairs for clarification were selected for analysis; these included the moments in the talk when students had problems understanding or hearing the speaker and had to seek clarification. For example, ‘Sorry, I do not understand’ and ‘Can you repeat the question?’ were used as repair to make clarification requests. In the data, there were 43 such cases from native speaker students and 88 cases from non-native speaker students. The non-native speaker students were more likely to use repair to seek clarification. Analysis of how the students make clarification requests during their conversation was carried out by investigating how the students initiated problems and how the teachers repaired the problems. In CA terms, this is called other-initiated self-repair (OISR), which refers to student-initiated teacher-repair in this research. The findings show that, in initiating repair, native speaker students pay more attention to mutual understanding (inter-subjectivity) while non-native speaker students, due to their lack of language proficiency, pay more attention to their status of knowledge (epistemic) switch. 
There are three major differences: (1) native Chinese students more often initiate closed-class OISR (seeking specific information in the request), such as repeating a word or phrase from the previous turn, while non-native students more frequently initiate open-class OISR (not specifying what needs clarification), such as ‘sorry, I don’t understand’; (2) native speakers’ clarification requests are treated by the teacher as a matter of understanding the content, while non-native learners’ clarification requests are treated by the teacher as a language proficiency problem; (3) native speakers do not see repair as a knowledge issue, and there is no third position in their repair sequences to close the repair, while non-native learners take the repair sequence as a time to adjust their knowledge. For the latter, there is a clear closing third-position token such as ‘oh’ to close the repair sequence so that the topic can resume. In conclusion, this paper uses a conversation analysis approach to compare differences between native Chinese speakers and non-native Chinese learners in their ways of conducting repair when making clarification requests. The findings are useful for future Chinese language teaching and learning, especially in teaching pragmatics such as requests. 
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=conversation%20analysis%20%28CA%29" title="conversation analysis (CA)">conversation analysis (CA)</a>, <a href="https://publications.waset.org/abstracts/search?q=clarification%20request" title=" clarification request"> clarification request</a>, <a href="https://publications.waset.org/abstracts/search?q=second%20language%20%28L2%29" title=" second language (L2)"> second language (L2)</a>, <a href="https://publications.waset.org/abstracts/search?q=teaching%20implication" title=" teaching implication"> teaching implication</a> </p> <a href="https://publications.waset.org/abstracts/74368/studying-second-language-learners-language-behavior-from-conversation-analysis-perspective" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/74368.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">256</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">1793</span> The Effect of Iconic and Beat Gestures on Memory Recall in Greek’s First and Second Language</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Eleni%20Ioanna%20Levantinou">Eleni Ioanna Levantinou</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Gestures play a major role in comprehension and memory recall because they help channel meaning efficiently and support listeners&rsquo; comprehension and memory. In the present study, the assistance of two kinds of gestures (iconic and beat gestures) is tested with regard to memory and recall. 
The hypothesis investigated here is whether iconic and beat gestures assist memory and recall in Greek and in Greek speakers&rsquo; second language. Two groups of participants were formed, one comprising Greeks who reside in Athens and one comprising Greeks who reside in Copenhagen. Three kinds of stimuli were used: a video with words accompanied by iconic gestures, a video with words accompanied by beat gestures, and a video with words alone. The languages used are Greek and English. The words in the English videos were spoken by a native English speaker and by a Greek speaker speaking English. The reason for this is that beat gestures serve a meta-cognitive function and are generated according to the intonation of a language, so prosody plays a major role. Thus, participants with different prosodic influences may produce different results for rhythmic gestures. Memory recall was assessed by asking the participants to try to remember as many words as they could after viewing each video. Results show that iconic gestures provide significant assistance in memory and recall in Greek and in English, whether they are produced by a native or a second language speaker. In the case of beat gestures, though, the findings indicate that they may not play such a significant role in the Greek language. As far as intonation is concerned, no significant difference was found between beat gestures produced by a native English speaker and those produced by a Greek speaker speaking English. 
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=first%20language" title="first language">first language</a>, <a href="https://publications.waset.org/abstracts/search?q=gestures" title=" gestures"> gestures</a>, <a href="https://publications.waset.org/abstracts/search?q=memory" title=" memory"> memory</a>, <a href="https://publications.waset.org/abstracts/search?q=second%20language%20acquisition" title=" second language acquisition"> second language acquisition</a> </p> <a href="https://publications.waset.org/abstracts/49317/the-effect-of-iconic-and-beat-gestures-on-memory-recall-in-greeks-first-and-second-language" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/49317.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">333</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">1792</span> Face Tracking and Recognition Using Deep Learning Approach</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Degale%20Desta">Degale Desta</a>, <a href="https://publications.waset.org/abstracts/search?q=Cheng%20Jian"> Cheng Jian</a> </p> <p class="card-text"><strong>Abstract:</strong></p> The most important factor in identifying a person is their face. Even identical twins have their own distinct faces. As a result, identification and face recognition are needed to tell one person from another. A face recognition system is a verification tool used to establish a person's identity using biometrics. Nowadays, face recognition is a common technique used in a variety of applications, including home security systems, criminal identification, and phone unlock systems. 
This system is more secure because it only requires a facial image instead of other dependencies like a key or card. Face detection and face identification are the two phases that typically make up a human recognition system. This paper explains the idea behind designing and creating a face recognition system using deep learning with Azure ML Python's OpenCV. Face recognition is a task that can be accomplished using deep learning, and given the accuracy of this method, it appears to be a suitable approach. Experimental results show that the suggested face recognition system achieves 98.46% accuracy using Fast-RCNN, and the performance of the algorithms under different training conditions is reported. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=deep%20learning" title="deep learning">deep learning</a>, <a href="https://publications.waset.org/abstracts/search?q=face%20recognition" title=" face recognition"> face recognition</a>, <a href="https://publications.waset.org/abstracts/search?q=identification" title=" identification"> identification</a>, <a href="https://publications.waset.org/abstracts/search?q=fast-RCNN" title=" fast-RCNN"> fast-RCNN</a> </p> <a href="https://publications.waset.org/abstracts/163134/face-tracking-and-recognition-using-deep-learning-approach" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/163134.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">140</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">1791</span> Comparing Emotion Recognition from Voice and Facial Data Using Time Invariant Features</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a 
href="https://publications.waset.org/abstracts/search?q=Vesna%20Kirandziska">Vesna Kirandziska</a>, <a href="https://publications.waset.org/abstracts/search?q=Nevena%20Ackovska"> Nevena Ackovska</a>, <a href="https://publications.waset.org/abstracts/search?q=Ana%20Madevska%20Bogdanova"> Ana Madevska Bogdanova</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Emotion recognition is a challenging problem. It is still an open problem from the perspective of both intelligent systems and psychology. In this paper, both voice features and facial features are used for building an emotion recognition system. Support Vector Machine classifiers are built using raw data from video recordings. In this paper, the results obtained for emotion recognition are given, and a discussion of the validity and expressiveness of different emotions is presented. A comparison is made between the classifiers built from facial data only, from voice data only, and from the combination of both. The need for a better combination of the information from facial expression and voice data is argued. 
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=emotion%20recognition" title="emotion recognition">emotion recognition</a>, <a href="https://publications.waset.org/abstracts/search?q=facial%20recognition" title=" facial recognition"> facial recognition</a>, <a href="https://publications.waset.org/abstracts/search?q=signal%20processing" title=" signal processing"> signal processing</a>, <a href="https://publications.waset.org/abstracts/search?q=machine%20learning" title=" machine learning"> machine learning</a> </p> <a href="https://publications.waset.org/abstracts/42384/comparing-emotion-recognition-from-voice-and-facial-data-using-time-invariant-features" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/42384.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">315</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">1790</span> Possibilities, Challenges and the State of the Art of Automatic Speech Recognition in Air Traffic Control</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Van%20Nhan%20Nguyen">Van Nhan Nguyen</a>, <a href="https://publications.waset.org/abstracts/search?q=Harald%20Holone"> Harald Holone</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Over the past few years, a lot of research has been conducted to bring Automatic Speech Recognition (ASR) into various areas of Air Traffic Control (ATC), such as air traffic control simulation and training, monitoring live operators with the aim of safety improvements, air traffic controller workload measurement, and conducting analysis on large quantities of controller-pilot speech. 
Due to the high accuracy requirements of the ATC context and its unique challenges, automatic speech recognition has not been widely adopted in this field. With the aim of providing a good starting point for researchers who are interested in bringing automatic speech recognition into ATC, this paper gives an overview of the possibilities and challenges of applying automatic speech recognition in air traffic control. To provide this overview, we present an updated literature review of speech recognition technologies in general, as well as specific approaches relevant to the ATC context. Based on this literature review, criteria for selecting speech recognition approaches for the ATC domain are presented, and remaining challenges and possible solutions are discussed. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=automatic%20speech%20recognition" title="automatic speech recognition">automatic speech recognition</a>, <a href="https://publications.waset.org/abstracts/search?q=asr" title=" asr"> asr</a>, <a href="https://publications.waset.org/abstracts/search?q=air%20traffic%20control" title=" air traffic control"> air traffic control</a>, <a href="https://publications.waset.org/abstracts/search?q=atc" title=" atc"> atc</a> </p> <a href="https://publications.waset.org/abstracts/31004/possibilities-challenges-and-the-state-of-the-art-of-automatic-speech-recognition-in-air-traffic-control" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/31004.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">399</span> </span> </div> </div> <ul class="pagination"> <li class="page-item disabled"><span class="page-link">&lsaquo;</span></li> <li class="page-item active"><span class="page-link">1</span></li> <li class="page-item"><a class="page-link" 
href="https://publications.waset.org/abstracts/search?q=speaker%20recognition&amp;page=2">2</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=speaker%20recognition&amp;page=3">3</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=speaker%20recognition&amp;page=4">4</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=speaker%20recognition&amp;page=5">5</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=speaker%20recognition&amp;page=6">6</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=speaker%20recognition&amp;page=7">7</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=speaker%20recognition&amp;page=8">8</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=speaker%20recognition&amp;page=9">9</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=speaker%20recognition&amp;page=10">10</a></li> <li class="page-item disabled"><span class="page-link">...</span></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=speaker%20recognition&amp;page=60">60</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=speaker%20recognition&amp;page=61">61</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=speaker%20recognition&amp;page=2" rel="next">&rsaquo;</a></li> </ul> </div> </main> <footer> <div id="infolinks" class="pt-3 pb-2"> <div class="container"> <div style="background-color:#f5f5f5;" class="p-3"> <div class="row"> <div class="col-md-2"> <ul class="list-unstyled"> About 
<li><a href="https://waset.org/page/support">About Us</a></li> <li><a href="https://waset.org/page/support#legal-information">Legal</a></li> <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/WASET-16th-foundational-anniversary.pdf">WASET celebrates its 16th foundational anniversary</a></li> </ul> </div> <div class="col-md-2"> <ul class="list-unstyled"> Account <li><a href="https://waset.org/profile">My Account</a></li> </ul> </div> <div class="col-md-2"> <ul class="list-unstyled"> Explore <li><a href="https://waset.org/disciplines">Disciplines</a></li> <li><a href="https://waset.org/conferences">Conferences</a></li> <li><a href="https://waset.org/conference-programs">Conference Program</a></li> <li><a href="https://waset.org/committees">Committees</a></li> <li><a href="https://publications.waset.org">Publications</a></li> </ul> </div> <div class="col-md-2"> <ul class="list-unstyled"> Research <li><a href="https://publications.waset.org/abstracts">Abstracts</a></li> <li><a href="https://publications.waset.org">Periodicals</a></li> <li><a href="https://publications.waset.org/archive">Archive</a></li> </ul> </div> <div class="col-md-2"> <ul class="list-unstyled"> Open Science <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/Open-Science-Philosophy.pdf">Open Science Philosophy</a></li> <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/Open-Science-Award.pdf">Open Science Award</a></li> <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/Open-Society-Open-Science-and-Open-Innovation.pdf">Open Innovation</a></li> <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/Postdoctoral-Fellowship-Award.pdf">Postdoctoral Fellowship Award</a></li> <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/Scholarly-Research-Review.pdf">Scholarly Research Review</a></li> 
</ul> </div> <div class="col-md-2"> <ul class="list-unstyled"> Support <li><a href="https://waset.org/page/support">Support</a></li> <li><a href="https://waset.org/profile/messages/create">Contact Us</a></li> <li><a href="https://waset.org/profile/messages/create">Report Abuse</a></li> </ul> </div> </div> </div> </div> </div> <div class="container text-center"> <hr style="margin-top:0;margin-bottom:.3rem;"> <a href="https://creativecommons.org/licenses/by/4.0/" target="_blank" class="text-muted small">Creative Commons Attribution 4.0 International License</a> <div id="copy" class="mt-2">&copy; 2024 World Academy of Science, Engineering and Technology</div> </div> </footer> <a href="javascript:" id="return-to-top"><i class="fas fa-arrow-up"></i></a> <div class="modal" id="modal-template"> <div class="modal-dialog"> <div class="modal-content"> <div class="row m-0 mt-1"> <div class="col-md-12"> <button type="button" class="close" data-dismiss="modal" aria-label="Close"><span aria-hidden="true">&times;</span></button> </div> </div> <div class="modal-body"></div> </div> </div> </div> <script src="https://cdn.waset.org/static/plugins/jquery-3.3.1.min.js"></script> <script src="https://cdn.waset.org/static/plugins/bootstrap-4.2.1/js/bootstrap.bundle.min.js"></script> <script src="https://cdn.waset.org/static/js/site.js?v=150220211556"></script> <script> jQuery(document).ready(function() { /*jQuery.get("https://publications.waset.org/xhr/user-menu", function (response) { jQuery('#mainNavMenu').append(response); });*/ jQuery.get({ url: "https://publications.waset.org/xhr/user-menu", cache: false }).then(function(response){ jQuery('#mainNavMenu').append(response); }); }); </script> </body> </html>
