CINXE.COM
Search results for: MFCC
<!DOCTYPE html> <html lang="en" dir="ltr"> <head> <!-- Google tag (gtag.js) --> <script async src="https://www.googletagmanager.com/gtag/js?id=G-P63WKM1TM1"></script> <script> window.dataLayer = window.dataLayer || []; function gtag(){dataLayer.push(arguments);} gtag('js', new Date()); gtag('config', 'G-P63WKM1TM1'); </script> <!-- Yandex.Metrika counter --> <script type="text/javascript" > (function(m,e,t,r,i,k,a){m[i]=m[i]||function(){(m[i].a=m[i].a||[]).push(arguments)}; m[i].l=1*new Date(); for (var j = 0; j < document.scripts.length; j++) {if (document.scripts[j].src === r) { return; }} k=e.createElement(t),a=e.getElementsByTagName(t)[0],k.async=1,k.src=r,a.parentNode.insertBefore(k,a)}) (window, document, "script", "https://mc.yandex.ru/metrika/tag.js", "ym"); ym(55165297, "init", { clickmap:false, trackLinks:true, accurateTrackBounce:true, webvisor:false }); </script> <noscript><div><img src="https://mc.yandex.ru/watch/55165297" style="position:absolute; left:-9999px;" alt="" /></div></noscript> <!-- /Yandex.Metrika counter --> <!-- Matomo --> <!-- End Matomo Code --> <title>Search results for: MFCC</title> <meta name="description" content="Search results for: MFCC"> <meta name="keywords" content="MFCC"> <meta name="viewport" content="width=device-width, initial-scale=1, minimum-scale=1, maximum-scale=1, user-scalable=no"> <meta charset="utf-8"> <link href="https://cdn.waset.org/favicon.ico" type="image/x-icon" rel="shortcut icon"> <link href="https://cdn.waset.org/static/plugins/bootstrap-4.2.1/css/bootstrap.min.css" rel="stylesheet"> <link href="https://cdn.waset.org/static/plugins/fontawesome/css/all.min.css" rel="stylesheet"> <link href="https://cdn.waset.org/static/css/site.css?v=150220211555" rel="stylesheet"> </head> <body> <header> <div class="container"> <nav class="navbar navbar-expand-lg navbar-light"> <a class="navbar-brand" href="https://waset.org"> <img src="https://cdn.waset.org/static/images/wasetc.png" alt="Open Science Research Excellence" title="Open Science Research Excellence" /> </a> <button class="d-block d-lg-none navbar-toggler ml-auto" type="button" data-toggle="collapse" data-target="#navbarMenu" aria-controls="navbarMenu" aria-expanded="false" aria-label="Toggle navigation"> <span class="navbar-toggler-icon"></span> </button> <div class="w-100"> <div class="d-none d-lg-flex flex-row-reverse"> <form method="get" action="https://waset.org/search" class="form-inline my-2 my-lg-0"> <input class="form-control mr-sm-2" type="search" placeholder="Search Conferences" value="MFCC" name="q" aria-label="Search"> <button class="btn btn-light my-2 my-sm-0" type="submit"><i class="fas fa-search"></i></button> </form> </div> <div class="collapse navbar-collapse mt-1" id="navbarMenu"> <ul class="navbar-nav ml-auto align-items-center" id="mainNavMenu"> <li class="nav-item"> <a class="nav-link" href="https://waset.org/conferences" title="Conferences in 2024/2025/2026">Conferences</a> </li> <li class="nav-item"> <a class="nav-link" href="https://waset.org/disciplines" title="Disciplines">Disciplines</a> </li> <li class="nav-item"> <a class="nav-link" href="https://waset.org/committees" rel="nofollow">Committees</a> </li> <li class="nav-item dropdown"> <a class="nav-link dropdown-toggle" href="#" id="navbarDropdownPublications" role="button" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false"> Publications </a> <div class="dropdown-menu" aria-labelledby="navbarDropdownPublications"> <a class="dropdown-item" href="https://publications.waset.org/abstracts">Abstracts</a> <a class="dropdown-item" href="https://publications.waset.org">Periodicals</a> <a class="dropdown-item" href="https://publications.waset.org/archive">Archive</a> </div> </li> <li class="nav-item"> <a class="nav-link" href="https://waset.org/page/support" title="Support">Support</a> </li> </ul> </div> </div> </nav> </div> </header> <main> <div class="container mt-4"> <div class="row"> <div class="col-md-9 mx-auto"> <form method="get" action="https://publications.waset.org/abstracts/search"> <div id="custom-search-input"> <div class="input-group"> <i class="fas fa-search"></i> <input type="text" class="search-query" name="q" placeholder="Author, Title, Abstract, Keywords" value="MFCC"> <input type="submit" class="btn_search" value="Search"> </div> </div> </form> </div> </div> <div class="row mt-3"> <div class="col-sm-3"> <div class="card"> <div class="card-body"><strong>Commenced</strong> in January 2007</div> </div> </div> <div class="col-sm-3"> <div class="card"> <div class="card-body"><strong>Frequency:</strong> Monthly</div> </div> </div> <div class="col-sm-3"> <div class="card"> <div class="card-body"><strong>Edition:</strong> International</div> </div> </div> <div class="col-sm-3"> <div class="card"> <div class="card-body"><strong>Paper Count:</strong> 24</div> </div> </div> </div> <h1 class="mt-3 mb-3 text-center" style="font-size:1.6rem;">Search results for: MFCC</h1> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">24</span> Robust Features for Impulsive Noisy Speech Recognition Using Relative Spectral Analysis</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Hajer%20Rahali">Hajer Rahali</a>, <a href="https://publications.waset.org/abstracts/search?q=Zied%20Hajaiej"> Zied Hajaiej</a>, <a href="https://publications.waset.org/abstracts/search?q=Noureddine%20Ellouze"> Noureddine Ellouze</a> </p> <p class="card-text"><strong>Abstract:</strong></p> The goal of speech parameterization is to extract the relevant information about what is being spoken from the audio signal. In speech recognition systems Mel-Frequency Cepstral Coefficients (MFCC) and Relative Spectral Mel-Frequency Cepstral Coefficients (RASTA-MFCC) are the two main techniques used. It will be shown in this paper that it presents some modifications to the original MFCC method. In our work the effectiveness of proposed changes to MFCC called Modified Function Cepstral Coefficients (MODFCC) were tested and compared against the original MFCC and RASTA-MFCC features. The prosodic features such as jitter and shimmer are added to baseline spectral features. The above-mentioned techniques were tested with impulsive signals under various noisy conditions within AURORA databases. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=auditory%20filter" title="auditory filter">auditory filter</a>, <a href="https://publications.waset.org/abstracts/search?q=impulsive%20noise" title=" impulsive noise"> impulsive noise</a>, <a href="https://publications.waset.org/abstracts/search?q=MFCC" title=" MFCC"> MFCC</a>, <a href="https://publications.waset.org/abstracts/search?q=prosodic%20features" title=" prosodic features"> prosodic features</a>, <a href="https://publications.waset.org/abstracts/search?q=RASTA%20filter" title=" RASTA filter"> RASTA filter</a> </p> <a href="https://publications.waset.org/abstracts/8911/robust-features-for-impulsive-noisy-speech-recognition-using-relative-spectral-analysis" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/8911.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">425</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">23</span> Feature Extraction of MFCC Based on Fisher-Ratio and Correlated Distance Criterion for Underwater Target Signal</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Han%20Xue">Han Xue</a>, <a href="https://publications.waset.org/abstracts/search?q=Zhang%20Lanyue"> Zhang Lanyue</a> </p> <p class="card-text"><strong>Abstract:</strong></p> In order to seek more effective feature extraction technology, feature extraction method based on MFCC combined with vector hydrophone is exposed in the paper. The sound pressure signal and particle velocity signal of two kinds of ships are extracted by using MFCC and its evolution form, and the extracted features are fused by using fisher-ratio and correlated distance criterion. The features are then identified by BP neural network. The results showed that MFCC, First-Order Differential MFCC and Second-Order Differential MFCC features can be used as effective features for recognition of underwater targets, and the fusion feature can improve the recognition rate. Moreover, the results also showed that the recognition rate of the particle velocity signal is higher than that of the sound pressure signal, and it reflects the superiority of vector signal processing. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=vector%20information" title="vector information">vector information</a>, <a href="https://publications.waset.org/abstracts/search?q=MFCC" title=" MFCC"> MFCC</a>, <a href="https://publications.waset.org/abstracts/search?q=differential%20MFCC" title=" differential MFCC"> differential MFCC</a>, <a href="https://publications.waset.org/abstracts/search?q=fusion%20feature" title=" fusion feature"> fusion feature</a>, <a href="https://publications.waset.org/abstracts/search?q=BP%20neural%20network" title=" BP neural network "> BP neural network </a> </p> <a href="https://publications.waset.org/abstracts/33608/feature-extraction-of-mfcc-based-on-fisher-ratio-and-correlated-distance-criterion-for-underwater-target-signal" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/33608.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">530</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">22</span> Forensic Speaker Verification in Noisy Environmental by Enhancing the Speech Signal Using ICA Approach</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Ahmed%20Kamil%20Hasan%20Al-Ali">Ahmed Kamil Hasan Al-Ali</a>, <a href="https://publications.waset.org/abstracts/search?q=Bouchra%20Senadji"> Bouchra Senadji</a>, <a href="https://publications.waset.org/abstracts/search?q=Ganesh%20Naik"> Ganesh Naik</a> </p> <p class="card-text"><strong>Abstract:</strong></p> We propose a system to real environmental noise and channel mismatch for forensic speaker verification systems. This method is based on suppressing various types of real environmental noise by using independent component analysis (ICA) algorithm. The enhanced speech signal is applied to mel frequency cepstral coefficients (MFCC) or MFCC feature warping to extract the essential characteristics of the speech signal. Channel effects are reduced using an intermediate vector (i-vector) and probabilistic linear discriminant analysis (PLDA) approach for classification. The proposed algorithm is evaluated by using an Australian forensic voice comparison database, combined with car, street and home noises from QUT-NOISE at a signal to noise ratio (SNR) ranging from -10 dB to 10 dB. Experimental results indicate that the MFCC feature warping-ICA achieves a reduction in equal error rate about (48.22%, 44.66%, and 50.07%) over using MFCC feature warping when the test speech signals are corrupted with random sessions of street, car, and home noises at -10 dB SNR. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=noisy%20forensic%20speaker%20verification" title="noisy forensic speaker verification">noisy forensic speaker verification</a>, <a href="https://publications.waset.org/abstracts/search?q=ICA%20algorithm" title=" ICA algorithm"> ICA algorithm</a>, <a href="https://publications.waset.org/abstracts/search?q=MFCC" title=" MFCC"> MFCC</a>, <a href="https://publications.waset.org/abstracts/search?q=MFCC%20feature%20warping" title=" MFCC feature warping"> MFCC feature warping</a> </p> <a href="https://publications.waset.org/abstracts/66332/forensic-speaker-verification-in-noisy-environmental-by-enhancing-the-speech-signal-using-ica-approach" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/66332.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">408</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">21</span> The Capacity of Mel Frequency Cepstral Coefficients for Speech Recognition</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Fawaz%20S.%20Al-Anzi">Fawaz S. Al-Anzi</a>, <a href="https://publications.waset.org/abstracts/search?q=Dia%20AbuZeina"> Dia AbuZeina</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Speech recognition is of an important contribution in promoting new technologies in human computer interaction. Today, there is a growing need to employ speech technology in daily life and business activities. However, speech recognition is a challenging task that requires different stages before obtaining the desired output. Among automatic speech recognition (ASR) components is the feature extraction process, which parameterizes the speech signal to produce the corresponding feature vectors. Feature extraction process aims at approximating the linguistic content that is conveyed by the input speech signal. In speech processing field, there are several methods to extract speech features, however, Mel Frequency Cepstral Coefficients (MFCC) is the popular technique. It has been long observed that the MFCC is dominantly used in the well-known recognizers such as the Carnegie Mellon University (CMU) Sphinx and the Markov Model Toolkit (HTK). Hence, this paper focuses on the MFCC method as the standard choice to identify the different speech segments in order to obtain the language phonemes for further training and decoding steps. Due to MFCC good performance, the previous studies show that the MFCC dominates the Arabic ASR research. In this paper, we demonstrate MFCC as well as the intermediate steps that are performed to get these coefficients using the HTK toolkit. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=speech%20recognition" title="speech recognition">speech recognition</a>, <a href="https://publications.waset.org/abstracts/search?q=acoustic%20features" title=" acoustic features"> acoustic features</a>, <a href="https://publications.waset.org/abstracts/search?q=mel%20frequency" title=" mel frequency"> mel frequency</a>, <a href="https://publications.waset.org/abstracts/search?q=cepstral%20coefficients" title=" cepstral coefficients"> cepstral coefficients</a> </p> <a href="https://publications.waset.org/abstracts/78382/the-capacity-of-mel-frequency-cepstral-coefficients-for-speech-recognition" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/78382.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">259</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">20</span> The Combination of the Mel Frequency Cepstral Coefficients (MFCC), Perceptual Linear Prediction (PLP), JITTER and SHIMMER Coefficients for the Improvement of Automatic Recognition System for Dysarthric Speech</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Brahim-Fares%20Zaidi">Brahim-Fares Zaidi</a>, <a href="https://publications.waset.org/abstracts/search?q=Malika%20Boudraa"> Malika Boudraa</a>, <a href="https://publications.waset.org/abstracts/search?q=Sid-Ahmed%20Selouani"> Sid-Ahmed Selouani</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Our work aims to improve our Automatic Recognition System for Dysarthria Speech (ARSDS) based on the Hidden Models of Markov (HMM) and the Hidden Markov Model Toolkit (HTK) to help people who are sick. With pronunciation problems, we applied two techniques of speech parameterization based on Mel Frequency Cepstral Coefficients (MFCC's) and Perceptual Linear Prediction (PLP's) and concatenated them with JITTER and SHIMMER coefficients in order to increase the recognition rate of a dysarthria speech. For our tests, we used the NEMOURS database that represents speakers with dysarthria and normal speakers. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=hidden%20Markov%20model%20toolkit%20%28HTK%29" title="hidden Markov model toolkit (HTK)">hidden Markov model toolkit (HTK)</a>, <a href="https://publications.waset.org/abstracts/search?q=hidden%20models%20of%20Markov%20%28HMM%29" title=" hidden models of Markov (HMM)"> hidden models of Markov (HMM)</a>, <a href="https://publications.waset.org/abstracts/search?q=Mel-frequency%20cepstral%20coefficients%20%28MFCC%29" title=" Mel-frequency cepstral coefficients (MFCC)"> Mel-frequency cepstral coefficients (MFCC)</a>, <a href="https://publications.waset.org/abstracts/search?q=perceptual%20linear%20prediction%20%28PLP%E2%80%99s%29" title=" perceptual linear prediction (PLP’s)"> perceptual linear prediction (PLP’s)</a> </p> <a href="https://publications.waset.org/abstracts/143303/the-combination-of-the-mel-frequency-cepstral-coefficients-mfcc-perceptual-linear-prediction-plp-jitter-and-shimmer-coefficients-for-the-improvement-of-automatic-recognition-system-for-dysarthric-speech" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/143303.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">161</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">19</span> Comparative Methods for Speech Enhancement and the Effects on Text-Independent Speaker Identification Performance</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=R.%20Ajgou">R. Ajgou</a>, <a href="https://publications.waset.org/abstracts/search?q=S.%20Sbaa"> S. Sbaa</a>, <a href="https://publications.waset.org/abstracts/search?q=S.%20Ghendir"> S. Ghendir</a>, <a href="https://publications.waset.org/abstracts/search?q=A.%20Chemsa"> A. Chemsa</a>, <a href="https://publications.waset.org/abstracts/search?q=A.%20Taleb-Ahmed"> A. Taleb-Ahmed</a> </p> <p class="card-text"><strong>Abstract:</strong></p> The speech enhancement algorithm is to improve speech quality. In this paper, we review some speech enhancement methods and we evaluated their performance based on Perceptual Evaluation of Speech Quality scores (PESQ, ITU-T P.862). All method was evaluated in presence of different kind of noise using TIMIT database and NOIZEUS noisy speech corpus.. The noise was taken from the AURORA database and includes suburban train noise, babble, car, exhibition hall, restaurant, street, airport and train station noise. Simulation results showed improved performance of speech enhancement for Tracking of non-stationary noise approach in comparison with various methods in terms of PESQ measure. Moreover, we have evaluated the effects of the speech enhancement technique on Speaker Identification system based on autoregressive (AR) model and Mel-frequency Cepstral coefficients (MFCC). <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=speech%20enhancement" title="speech enhancement">speech enhancement</a>, <a href="https://publications.waset.org/abstracts/search?q=pesq" title=" pesq"> pesq</a>, <a href="https://publications.waset.org/abstracts/search?q=speaker%20recognition" title=" speaker recognition"> speaker recognition</a>, <a href="https://publications.waset.org/abstracts/search?q=MFCC" title=" MFCC"> MFCC</a> </p> <a href="https://publications.waset.org/abstracts/31102/comparative-methods-for-speech-enhancement-and-the-effects-on-text-independent-speaker-identification-performance" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/31102.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">424</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">18</span> Detection of Phoneme [S] Mispronounciation for Sigmatism Diagnosis in Adults</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Michal%20Krecichwost">Michal Krecichwost</a>, <a href="https://publications.waset.org/abstracts/search?q=Zauzanna%20Miodonska"> Zauzanna Miodonska</a>, <a href="https://publications.waset.org/abstracts/search?q=Pawel%20Badura"> Pawel Badura</a> </p> <p class="card-text"><strong>Abstract:</strong></p> The diagnosis of sigmatism is mostly based on the observation of articulatory organs. It is, however, not always possible to precisely observe the vocal apparatus, in particular in the oral cavity of the patient. Speech processing can allow to objectify the therapy and simplify the verification of its progress. In the described study the methodology for classification of incorrectly pronounced phoneme [s] is proposed. The recordings come from adults. They were registered with the speech recorder at the sampling rate of 44.1 kHz and the resolution of 16 bit. The database of pathological and normative speech has been collected for the study including reference assessments provided by the speech therapy experts. Ten adult subjects were asked to simulate a certain type of stigmatism under the speech therapy expert supervision. In the recordings, the analyzed phone [s] was surrounded by vowels, viz: ASA, ESE, ISI, SPA, USU, YSY. Thirteen MFCC (mel-frequency cepstral coefficients) and RMS (root mean square) values are calculated within each frame being a part of the analyzed phoneme. Additionally, 3 fricative formants along with corresponding amplitudes are determined for the entire segment. In order to aggregate the information within the segment, the average value of each MFCC coefficient is calculated. All features of other types are aggregated by means of their 75th percentile. The proposed method of features aggregation reduces the size of the feature vector used in the classification. Binary SVM (support vector machine) classifier is employed at the phoneme recognition stage. The first group consists of pathological phones, while the other of the normative ones. The proposed feature vector yields classification sensitivity and specificity measures above 90% level in case of individual logo phones. The employment of a fricative formants-based information improves the sole-MFCC classification results average of 5 percentage points. The study shows that the employment of specific parameters for the selected phones improves the efficiency of pathology detection referred to the traditional methods of speech signal parameterization. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=computer-aided%20pronunciation%20evaluation" title="computer-aided pronunciation evaluation">computer-aided pronunciation evaluation</a>, <a href="https://publications.waset.org/abstracts/search?q=sibilants" title=" sibilants"> sibilants</a>, <a href="https://publications.waset.org/abstracts/search?q=sigmatism%20diagnosis" title=" sigmatism diagnosis"> sigmatism diagnosis</a>, <a href="https://publications.waset.org/abstracts/search?q=speech%20processing" title=" speech processing"> speech processing</a> </p> <a href="https://publications.waset.org/abstracts/65531/detection-of-phoneme-s-mispronounciation-for-sigmatism-diagnosis-in-adults" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/65531.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">283</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">17</span> Sound Analysis of Young Broilers Reared under Different Stocking Densities in Intensive Poultry Farming</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Xiaoyang%20Zhao">Xiaoyang Zhao</a>, <a href="https://publications.waset.org/abstracts/search?q=Kaiying%20Wang"> Kaiying Wang</a> </p> <p class="card-text"><strong>Abstract:</strong></p> The choice of stocking density in poultry farming is a potential way for determining welfare level of poultry. However, it is difficult to measure stocking densities in poultry farming because of a lot of variables such as species, age and weight, feeding way, house structure and geographical location in different broiler houses. A method was proposed in this paper to measure the differences of young broilers reared under different stocking densities by sound analysis. Vocalisations of broilers were recorded and analysed under different stocking densities to identify the relationship between sounds and stocking densities. Recordings were made continuously for three-week-old chickens in order to evaluate the variation of sounds emitted by the animals at the beginning. The experimental trial was carried out in an indoor reared broiler farm; the audio recording procedures lasted for 5 days. Broilers were divided into 5 groups, stocking density treatments were 8/m², 10/m², 12/m² (96birds/pen), 14/m² and 16/m², all conditions including ventilation and feed conditions were kept same except from stocking densities in every group. The recordings and analysis of sounds of chickens were made noninvasively. Sound recordings were manually analysed and labelled using sound analysis software: GoldWave Digital Audio Editor. After sound acquisition process, the Mel Frequency Cepstrum Coefficients (MFCC) was extracted from sound data, and the Support Vector Machine (SVM) was used as an early detector and classifier. This preliminary study, conducted in an indoor reared broiler farm shows that this method can be used to classify sounds of chickens under different densities economically (only a cheap microphone and recorder can be used), the classification accuracy is 85.7%. This method can predict the optimum stocking density of broilers with the complement of animal welfare indicators, animal productive indicators and so on. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=broiler" title="broiler">broiler</a>, <a href="https://publications.waset.org/abstracts/search?q=stocking%20density" title=" stocking density"> stocking density</a>, <a href="https://publications.waset.org/abstracts/search?q=poultry%20farming" title=" poultry farming"> poultry farming</a>, <a href="https://publications.waset.org/abstracts/search?q=sound%20monitoring" title=" sound monitoring"> sound monitoring</a>, <a href="https://publications.waset.org/abstracts/search?q=Mel%20Frequency%20Cepstrum%20Coefficients%20%28MFCC%29" title=" Mel Frequency Cepstrum Coefficients (MFCC)"> Mel Frequency Cepstrum Coefficients (MFCC)</a>, <a href="https://publications.waset.org/abstracts/search?q=Support%20Vector%20Machine%20%28SVM%29" title=" Support Vector Machine (SVM)"> Support Vector Machine (SVM)</a> </p> <a href="https://publications.waset.org/abstracts/91996/sound-analysis-of-young-broilers-reared-under-different-stocking-densities-in-intensive-poultry-farming" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/91996.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">162</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">16</span> Intrinsic Motivational Factor of Students in Learning Mathematics and Science Based on Electroencephalogram Signals</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Norzaliza%20Md.%20Nor">Norzaliza Md. Nor</a>, <a href="https://publications.waset.org/abstracts/search?q=Sh-Hussain%20Salleh"> Sh-Hussain Salleh</a>, <a href="https://publications.waset.org/abstracts/search?q=Mahyar%20Hamedi"> Mahyar Hamedi</a>, <a href="https://publications.waset.org/abstracts/search?q=Hadrina%20Hussain"> Hadrina Hussain</a>, <a href="https://publications.waset.org/abstracts/search?q=Wahab%20Abdul%20Rahman"> Wahab Abdul Rahman</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Motivational factor is mainly the students’ desire to involve in learning process. However, it also depends on the goal towards their involvement or non-involvement in academic activity. Even though, the students’ motivation might be in the same level, but the basis of their motivation may differ. In this study, it focuses on the intrinsic motivational factor which student enjoy learning or feeling of accomplishment the activity or study for its own sake. The intrinsic motivational factor of students in learning mathematics and science has found as difficult to be achieved because it depends on students’ interest. In the Program for International Student Assessment (PISA) for mathematics and science, Malaysia is ranked as third lowest. The main problem in Malaysian educational system, students tend to have extrinsic motivation which they have to score in exam in order to achieve a good result and enrolled as university students. The use of electroencephalogram (EEG) signals has found to be scarce especially to identify the students’ intrinsic motivational factor in learning science and mathematics. In this research study, we are identifying the correlation between precursor emotion and its dynamic emotion to verify the intrinsic motivational factor of students in learning mathematics and science. The 2-D Affective Space Model (ASM) was used in this research in order to identify the relationship of precursor emotion and its dynamic emotion based on the four basic emotions, happy, calm, fear and sad. These four basic emotions are required to be used as reference stimuli. Then, in order to capture the brain waves, EEG device was used, while Mel Frequency Cepstral Coefficient (MFCC) was adopted to be used for extracting the features before it will be feed to Multilayer Perceptron (MLP) to classify the valence and arousal axes for the ASM. The results show that the precursor emotion had an influence the dynamic emotions and it identifies that most students have no interest in mathematics and science according to the negative emotion (sad and fear) appear in the EEG signals. We hope that these results can help us further relate the behavior and intrinsic motivational factor of students towards learning of mathematics and science. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=EEG" title="EEG">EEG</a>, <a href="https://publications.waset.org/abstracts/search?q=MLP" title=" MLP"> MLP</a>, <a href="https://publications.waset.org/abstracts/search?q=MFCC" title=" MFCC"> MFCC</a>, <a href="https://publications.waset.org/abstracts/search?q=intrinsic%20motivational%20factor" title=" intrinsic motivational factor"> intrinsic motivational factor</a> </p> <a href="https://publications.waset.org/abstracts/52426/intrinsic-motivational-factor-of-students-in-learning-mathematics-and-science-based-on-electroencephalogram-signals" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/52426.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">366</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">15</span> An Intelligent Text Independent Speaker Identification Using VQ-GMM Model Based Multiple Classifier System</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Ben%20Soltane%20Cheima">Ben Soltane Cheima</a>, <a href="https://publications.waset.org/abstracts/search?q=Ittansa%20Yonas%20Kelbesa"> Ittansa Yonas Kelbesa</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Speaker Identification (SI) is the task of establishing identity of an individual based on his/her voice characteristics. The SI task is typically achieved by two-stage signal processing: training and testing. The training process calculates speaker specific feature parameters from the speech and generates speaker models accordingly. In the testing phase, speech samples from unknown speakers are compared with the models and classified. Even though performance of speaker identification systems has improved due to recent advances in speech processing techniques, there is still need of improvement. In this paper, a Closed-Set Tex-Independent Speaker Identification System (CISI) based on a Multiple Classifier System (MCS) is proposed, using Mel Frequency Cepstrum Coefficient (MFCC) as feature extraction and suitable combination of vector quantization (VQ) and Gaussian Mixture Model (GMM) together with Expectation Maximization algorithm (EM) for speaker modeling. The use of Voice Activity Detector (VAD) with a hybrid approach based on Short Time Energy (STE) and Statistical Modeling of Background Noise in the pre-processing step of the feature extraction yields a better and more robust automatic speaker identification system. Also investigation of Linde-Buzo-Gray (LBG) clustering algorithm for initialization of GMM, for estimating the underlying parameters, in the EM step improved the convergence rate and systems performance. It also uses relative index as confidence measures in case of contradiction in identification process by GMM and VQ as well. Simulation results carried out on voxforge.org speech database using MATLAB highlight the efficacy of the proposed method compared to earlier work. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=feature%20extraction" title="feature extraction">feature extraction</a>, <a href="https://publications.waset.org/abstracts/search?q=speaker%20modeling" title=" speaker modeling"> speaker modeling</a>, <a href="https://publications.waset.org/abstracts/search?q=feature%20matching" title=" feature matching"> feature matching</a>, <a href="https://publications.waset.org/abstracts/search?q=Mel%20frequency%20cepstrum%20coefficient%20%28MFCC%29" title=" Mel frequency cepstrum coefficient (MFCC)"> Mel frequency cepstrum coefficient (MFCC)</a>, <a href="https://publications.waset.org/abstracts/search?q=Gaussian%20mixture%20model%20%28GMM%29" title=" Gaussian mixture model (GMM)"> Gaussian mixture model (GMM)</a>, <a href="https://publications.waset.org/abstracts/search?q=vector%20quantization%20%28VQ%29" title=" vector quantization (VQ)"> vector quantization (VQ)</a>, <a href="https://publications.waset.org/abstracts/search?q=Linde-Buzo-Gray%20%28LBG%29" title=" Linde-Buzo-Gray (LBG)"> Linde-Buzo-Gray (LBG)</a>, <a href="https://publications.waset.org/abstracts/search?q=expectation%20maximization%20%28EM%29" title=" expectation maximization (EM)"> expectation maximization (EM)</a>, <a href="https://publications.waset.org/abstracts/search?q=pre-processing" title=" pre-processing"> pre-processing</a>, <a href="https://publications.waset.org/abstracts/search?q=voice%20activity%20detection%20%28VAD%29" title=" voice activity detection (VAD)"> voice activity detection (VAD)</a>, <a href="https://publications.waset.org/abstracts/search?q=short%20time%20energy%20%28STE%29" title=" short time energy (STE)"> short time energy (STE)</a>, <a href="https://publications.waset.org/abstracts/search?q=background%20noise%20statistical%20modeling" title=" background noise statistical modeling"> background noise statistical modeling</a>, <a href="https://publications.waset.org/abstracts/search?q=closed-set%20tex-independent%20speaker%20identification%20system%20%28CISI%29" title=" closed-set tex-independent speaker identification system (CISI)"> closed-set tex-independent speaker identification system (CISI)</a> </p> <a href="https://publications.waset.org/abstracts/16253/an-intelligent-text-independent-speaker-identification-using-vq-gmm-model-based-multiple-classifier-system" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/16253.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">309</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">14</span> Features of Normative and Pathological Realizations of Sibilant Sounds for Computer-Aided Pronunciation Evaluation in Children</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Zuzanna%20Miodonska">Zuzanna Miodonska</a>, <a href="https://publications.waset.org/abstracts/search?q=Michal%20Krecichwost"> Michal Krecichwost</a>, <a href="https://publications.waset.org/abstracts/search?q=Pawel%20Badura"> Pawel Badura</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Sigmatism (lisping) is a speech disorder in which sibilant consonants are mispronounced. The diagnosis of this phenomenon is usually based on the auditory assessment. However, the progress in speech analysis techniques creates a possibility of developing computer-aided sigmatism diagnosis tools. The aim of the study is to statistically verify whether specific acoustic features of sibilant sounds may be related to pronunciation correctness. Such knowledge can be of great importance while implementing classifiers and designing novel tools for automatic sibilants pronunciation evaluation. The study covers analysis of various speech signal measures, including features proposed in the literature for the description of normative sibilants realization. Amplitudes and frequencies of three fricative formants (FF) are extracted based on local spectral maxima of the friction noise. Skewness, kurtosis, four normalized spectral moments (SM) and 13 mel-frequency cepstral coefficients (MFCC) with their 1st and 2nd derivatives (13 Delta and 13 Delta-Delta MFCC) are included in the analysis as well. The resulting feature vector contains 51 measures. The experiments are performed on the speech corpus containing words with selected sibilant sounds (/ʃ, ʒ/) pronounced by 60 preschool children with proper pronunciation or with natural pathologies. In total, 224 /ʃ/ segments and 191 /ʒ/ segments are employed in the study. The Mann-Whitney U test is employed for the analysis of stigmatism and normative pronunciation. Statistically, significant differences are obtained in most of the proposed features in children divided into these two groups at p < 0.05. All spectral moments and fricative formants appear to be distinctive between pathology and proper pronunciation. These metrics describe the friction noise characteristic for sibilants, which makes them particularly promising for the use in sibilants evaluation tools. Correspondences found between phoneme feature values and an expert evaluation of the pronunciation correctness encourage to involve speech analysis tools in diagnosis and therapy of sigmatism. Proposed feature extraction methods could be used in a computer-assisted stigmatism diagnosis or therapy systems. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=computer-aided%20pronunciation%20evaluation" title="computer-aided pronunciation evaluation">computer-aided pronunciation evaluation</a>, <a href="https://publications.waset.org/abstracts/search?q=sigmatism%20diagnosis" title=" sigmatism diagnosis"> sigmatism diagnosis</a>, <a href="https://publications.waset.org/abstracts/search?q=speech%20signal%20analysis" title=" speech signal analysis"> speech signal analysis</a>, <a href="https://publications.waset.org/abstracts/search?q=statistical%20verification" title=" statistical verification"> statistical verification</a> </p> <a href="https://publications.waset.org/abstracts/65569/features-of-normative-and-pathological-realizations-of-sibilant-sounds-for-computer-aided-pronunciation-evaluation-in-children" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/65569.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">301</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">13</span> The Combination of the Mel Frequency Cepstral Coefficients, Perceptual Linear Prediction, Jitter and Shimmer Coefficients for the Improvement of Automatic Recognition System for Dysarthric Speech</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Brahim%20Fares%20Zaidi">Brahim Fares Zaidi</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Our work aims to improve our Automatic Recognition System for Dysarthria Speech based on the Hidden Models of Markov and the Hidden Markov Model Toolkit to help people who are sick. With pronunciation problems, we applied two techniques of speech parameterization based on Mel Frequency Cepstral Coefficients and Perceptual Linear Prediction and concatenated them with JITTER and SHIMMER coefficients in order to increase the recognition rate of a dysarthria speech. For our tests, we used the NEMOURS database that represents speakers with dysarthria and normal speakers. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=ARSDS" title="ARSDS">ARSDS</a>, <a href="https://publications.waset.org/abstracts/search?q=HTK" title=" HTK"> HTK</a>, <a href="https://publications.waset.org/abstracts/search?q=HMM" title=" HMM"> HMM</a>, <a href="https://publications.waset.org/abstracts/search?q=MFCC" title=" MFCC"> MFCC</a>, <a href="https://publications.waset.org/abstracts/search?q=PLP" title=" PLP"> PLP</a> </p> <a href="https://publications.waset.org/abstracts/158636/the-combination-of-the-mel-frequency-cepstral-coefficients-perceptual-linear-prediction-jitter-and-shimmer-coefficients-for-the-improvement-of-automatic-recognition-system-for-dysarthric-speech" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/158636.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">108</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">12</span> Music Genre Classification Based on Non-Negative Matrix Factorization Features</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Soyon%20Kim">Soyon Kim</a>, <a href="https://publications.waset.org/abstracts/search?q=Edward%20Kim"> Edward Kim</a> </p> <p class="card-text"><strong>Abstract:</strong></p> In order to retrieve information from the massive stream of songs in the music industry, music search by title, lyrics, artist, mood, and genre has become more important. Despite the subjectivity and controversy over the definition of music genres across different nations and cultures, automatic genre classification systems that facilitate the process of music categorization have been developed. Manual genre selection by music producers is being provided as statistical data for designing automatic genre classification systems. In this paper, an automatic music genre classification system utilizing non-negative matrix factorization (NMF) is proposed. Short-term characteristics of the music signal can be captured based on the timbre features such as mel-frequency cepstral coefficient (MFCC), decorrelated filter bank (DFB), octave-based spectral contrast (OSC), and octave band sum (OBS). Long-term time-varying characteristics of the music signal can be summarized with (1) the statistical features such as mean, variance, minimum, and maximum of the timbre features and (2) the modulation spectrum features such as spectral flatness measure, spectral crest measure, spectral peak, spectral valley, and spectral contrast of the timbre features. Not only these conventional basic long-term feature vectors, but also NMF based feature vectors are proposed to be used together for genre classification. In the training stage, NMF basis vectors were extracted for each genre class. The NMF features were calculated in the log spectral magnitude domain (NMF-LSM) as well as in the basic feature vector domain (NMF-BFV). For NMF-LSM, an entire full band spectrum was used. However, for NMF-BFV, only low band spectrum was used since high frequency modulation spectrum of the basic feature vectors did not contain important information for genre classification. In the test stage, using the set of pre-trained NMF basis vectors, the genre classification system extracted the NMF weighting values of each genre as the NMF feature vectors. A support vector machine (SVM) was used as a classifier. The GTZAN multi-genre music database was used for training and testing. It is composed of 10 genres and 100 songs for each genre. To increase the reliability of the experiments, 10-fold cross validation was used. For a given input song, an extracted NMF-LSM feature vector was composed of 10 weighting values that corresponded to the classification probabilities for 10 genres. An NMF-BFV feature vector also had a dimensionality of 10. Combined with the basic long-term features such as statistical features and modulation spectrum features, the NMF features provided the increased accuracy with a slight increase in feature dimensionality. The conventional basic features by themselves yielded 84.0% accuracy, but the basic features with NMF-LSM and NMF-BFV provided 85.1% and 84.2% accuracy, respectively. The basic features required dimensionality of 460, but NMF-LSM and NMF-BFV required dimensionalities of 10 and 10, respectively. Combining the basic features, NMF-LSM and NMF-BFV together with the SVM with a radial basis function (RBF) kernel produced the significantly higher classification accuracy of 88.3% with a feature dimensionality of 480. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=mel-frequency%20cepstral%20coefficient%20%28MFCC%29" title="mel-frequency cepstral coefficient (MFCC)">mel-frequency cepstral coefficient (MFCC)</a>, <a href="https://publications.waset.org/abstracts/search?q=music%20genre%20classification" title=" music genre classification"> music genre classification</a>, <a href="https://publications.waset.org/abstracts/search?q=non-negative%20matrix%20factorization%20%28NMF%29" title=" non-negative matrix factorization (NMF)"> non-negative matrix factorization (NMF)</a>, <a href="https://publications.waset.org/abstracts/search?q=support%20vector%20machine%20%28SVM%29" title=" support vector machine (SVM)"> support vector machine (SVM)</a> </p> <a href="https://publications.waset.org/abstracts/89349/music-genre-classification-based-on-non-negative-matrix-factorization-features" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/89349.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">303</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">11</span> A Non-Parametric Based Mapping Algorithm for Use in Audio Fingerprinting</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Analise%20Borg">Analise Borg</a>, <a href="https://publications.waset.org/abstracts/search?q=Paul%20Micallef"> Paul Micallef</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Over the past few years, the online multimedia collection has grown at a fast pace. Several companies showed interest to study the different ways to organize the amount of audio information without the need of human intervention to generate metadata. In the past few years, many applications have emerged on the market which are capable of identifying a piece of music in a short time. Different audio effects and degradation make it much harder to identify the unknown piece. In this paper, an audio fingerprinting system which makes use of a non-parametric based algorithm is presented. Parametric analysis is also performed using Gaussian Mixture Models (GMMs). The feature extraction methods employed are the Mel Spectrum Coefficients and the MPEG-7 basic descriptors. Bin numbers replaced the extracted feature coefficients during the non-parametric modelling. The results show that non-parametric analysis offer potential results as the ones mentioned in the literature. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=audio%20fingerprinting" title="audio fingerprinting">audio fingerprinting</a>, <a href="https://publications.waset.org/abstracts/search?q=mapping%20algorithm" title=" mapping algorithm"> mapping algorithm</a>, <a href="https://publications.waset.org/abstracts/search?q=Gaussian%20Mixture%20Models" title=" Gaussian Mixture Models"> Gaussian Mixture Models</a>, <a href="https://publications.waset.org/abstracts/search?q=MFCC" title=" MFCC"> MFCC</a>, <a href="https://publications.waset.org/abstracts/search?q=MPEG-7" title=" MPEG-7"> MPEG-7</a> </p> <a href="https://publications.waset.org/abstracts/22201/a-non-parametric-based-mapping-algorithm-for-use-in-audio-fingerprinting" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/22201.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">421</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">10</span> Modelling Ibuprofen with Human Albumin</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=U.%20L.%20Fulco">U. L. Fulco</a>, <a href="https://publications.waset.org/abstracts/search?q=E.%20L.%20Albuquerque"> E. L. Albuquerque</a>, <a href="https://publications.waset.org/abstracts/search?q=Jos%C3%A9%20X.%20Lima%20Neto"> José X. Lima Neto</a>, <a href="https://publications.waset.org/abstracts/search?q=L.%20R.%20Da%20Silva"> L. R. Da Silva</a> </p> <p class="card-text"><strong>Abstract:</strong></p> The binding of the nonsteroidal anti-inflammatory drug ibuprofen (IBU) to human serum albumin (HSA) is investigated using density functional theory (DFT) calculations within a fragmentation strategy. Crystallographic data for the IBU–HSA supramolecular complex shows that the ligand is confined to a large cavity at the subdomain IIIA and at the interface between the subdomains IIA and IIB, whose binding sites are FA3/FA4 and FA6, respectively. The interaction energy between the IBU molecule and each amino acid residue of these HSA binding pockets was calculated using the Molecular Fractionation with Conjugate Caps (MFCC) approach employing a dispersion corrected exchange–correlation functional. Our investigation shows that the total interaction energy of IBU bound to HSA at binding sites of the fatty acids FA3/FA4 (FA6) converges only for a pocket radius of at least 8.5 °A, mainly due to the action of residues Arg410, Lys414 and Ser489 (Lys351, Ser480 and Leu481) and residues in nonhydrophobic domains, namely Ile388, Phe395, Phe403, Leu407, Leu430, Val433, and Leu453 (Phe206, Ala210, Ala213, and Leu327), which is unusual. Our simulations are valuable for a better understanding of the binding mechanism of IBU to albumin and can lead to the rational design and the development of novel IBU-derived drugs with improved potency. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=ibuprofen" title="ibuprofen">ibuprofen</a>, <a href="https://publications.waset.org/abstracts/search?q=human%20serum%20albumin" title=" human serum albumin"> human serum albumin</a>, <a href="https://publications.waset.org/abstracts/search?q=density%20functional%20theory" title=" density functional theory"> density functional theory</a>, <a href="https://publications.waset.org/abstracts/search?q=binding%20energies" title=" binding energies"> binding energies</a> </p> <a href="https://publications.waset.org/abstracts/46649/modelling-ibuprofen-with-human-albumin" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/46649.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">347</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">9</span> Evaluation of Features Extraction Algorithms for a Real-Time Isolated Word Recognition System</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Tomyslav%20Sledevi%C4%8D">Tomyslav Sledevič</a>, <a href="https://publications.waset.org/abstracts/search?q=Art%C5%ABras%20Serackis"> Artūras Serackis</a>, <a href="https://publications.waset.org/abstracts/search?q=Gintautas%20Tamulevi%C4%8Dius"> Gintautas Tamulevičius</a>, <a href="https://publications.waset.org/abstracts/search?q=Dalius%20Navakauskas"> Dalius Navakauskas</a> </p> <p class="card-text"><strong>Abstract:</strong></p> This paper presents a comparative evaluation of features extraction algorithm for a real-time isolated word recognition system based on FPGA. The Mel-frequency cepstral, linear frequency cepstral, linear predictive and their cepstral coefficients were implemented in hardware/software design. The proposed system was investigated in the speaker-dependent mode for 100 different Lithuanian words. The robustness of features extraction algorithms was tested recognizing the speech records at different signals to noise rates. The experiments on clean records show highest accuracy for Mel-frequency cepstral and linear frequency cepstral coefficients. For records with 15 dB signal to noise rate the linear predictive cepstral coefficients give best result. The hard and soft part of the system is clocked on 50 MHz and 100 MHz accordingly. For the classification purpose, the pipelined dynamic time warping core was implemented. The proposed word recognition system satisfies the real-time requirements and is suitable for applications in embedded systems. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=isolated%20word%20recognition" title="isolated word recognition">isolated word recognition</a>, <a href="https://publications.waset.org/abstracts/search?q=features%20extraction" title=" features extraction"> features extraction</a>, <a href="https://publications.waset.org/abstracts/search?q=MFCC" title=" MFCC"> MFCC</a>, <a href="https://publications.waset.org/abstracts/search?q=LFCC" title=" LFCC"> LFCC</a>, <a href="https://publications.waset.org/abstracts/search?q=LPCC" title=" LPCC"> LPCC</a>, <a href="https://publications.waset.org/abstracts/search?q=LPC" title=" LPC"> LPC</a>, <a href="https://publications.waset.org/abstracts/search?q=FPGA" title=" FPGA"> FPGA</a>, <a href="https://publications.waset.org/abstracts/search?q=DTW" title=" DTW"> DTW</a> </p> <a href="https://publications.waset.org/abstracts/2136/evaluation-of-features-extraction-algorithms-for-a-real-time-isolated-word-recognition-system" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/2136.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">496</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">8</span> The Optimum Mel-Frequency Cepstral Coefficients (MFCCs) Contribution to Iranian Traditional Music Genre Classification by Instrumental Features</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=M.%20Abbasi%20Layegh">M. Abbasi Layegh</a>, <a href="https://publications.waset.org/abstracts/search?q=S.%20Haghipour"> S. Haghipour</a>, <a href="https://publications.waset.org/abstracts/search?q=K.%20Athari"> K. Athari</a>, <a href="https://publications.waset.org/abstracts/search?q=R.%20Khosravi"> R. Khosravi</a>, <a href="https://publications.waset.org/abstracts/search?q=M.%20Tafkikialamdari"> M. Tafkikialamdari</a> </p> <p class="card-text"><strong>Abstract:</strong></p> An approach to find the optimum mel-frequency cepstral coefficients (MFCCs) for the Radif of Mirzâ Ábdollâh, which is the principal emblem and the heart of Persian music, performed by most famous Iranian masters on two Iranian stringed instruments ‘Tar’ and ‘Setar’ is proposed. While investigating the variance of MFCC for each record in themusic database of 1500 gushe of the repertoire belonging to 12 modal systems (dastgâh and âvâz), we have applied the Fuzzy C-Mean clustering algorithm on each of the 12 coefficient and different combinations of those coefficients. We have applied the same experiment while increasing the number of coefficients but the clustering accuracy remained the same. Therefore, we can conclude that the first 7 MFCCs (V-7MFCC) are enough for classification of The Radif of Mirzâ Ábdollâh. Classical machine learning algorithms such as MLP neural networks, K-Nearest Neighbors (KNN), Gaussian Mixture Model (GMM), Hidden Markov Model (HMM) and Support Vector Machine (SVM) have been employed. Finally, it can be realized that SVM shows a better performance in this study. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=radif%20of%20Mirz%C3%A2%20%C3%81bdoll%C3%A2h" title="radif of Mirzâ Ábdollâh">radif of Mirzâ Ábdollâh</a>, <a href="https://publications.waset.org/abstracts/search?q=Gushe" title=" Gushe"> Gushe</a>, <a href="https://publications.waset.org/abstracts/search?q=mel%20frequency%20cepstral%20coefficients" title=" mel frequency cepstral coefficients"> mel frequency cepstral coefficients</a>, <a href="https://publications.waset.org/abstracts/search?q=fuzzy%20c-mean%20clustering%20algorithm" title=" fuzzy c-mean clustering algorithm"> fuzzy c-mean clustering algorithm</a>, <a href="https://publications.waset.org/abstracts/search?q=k-nearest%20neighbors%20%28KNN%29" title=" k-nearest neighbors (KNN)"> k-nearest neighbors (KNN)</a>, <a href="https://publications.waset.org/abstracts/search?q=gaussian%20mixture%20model%20%28GMM%29" title=" gaussian mixture model (GMM)"> gaussian mixture model (GMM)</a>, <a href="https://publications.waset.org/abstracts/search?q=hidden%20markov%20model%20%28HMM%29" title=" hidden markov model (HMM)"> hidden markov model (HMM)</a>, <a href="https://publications.waset.org/abstracts/search?q=support%20vector%20machine%20%28SVM%29" title=" support vector machine (SVM)"> support vector machine (SVM)</a> </p> <a href="https://publications.waset.org/abstracts/37296/the-optimum-mel-frequency-cepstral-coefficients-mfccs-contribution-to-iranian-traditional-music-genre-classification-by-instrumental-features" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/37296.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">446</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7</span> Reed: An Approach Towards Quickly Bootstrapping Multilingual Acoustic Models</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Bipasha%20Sen">Bipasha Sen</a>, <a href="https://publications.waset.org/abstracts/search?q=Aditya%20Agarwal"> Aditya Agarwal</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Multilingual automatic speech recognition (ASR) system is a single entity capable of transcribing multiple languages sharing a common phone space. Performance of such a system is highly dependent on the compatibility of the languages. State of the art speech recognition systems are built using sequential architectures based on recurrent neural networks (RNN) limiting the computational parallelization in training. This poses a significant challenge in terms of time taken to bootstrap and validate the compatibility of multiple languages for building a robust multilingual system. Complex architectural choices based on self-attention networks are made to improve the parallelization thereby reducing the training time. In this work, we propose Reed, a simple system based on 1D convolutions which uses very short context to improve the training time. To improve the performance of our system, we use raw time-domain speech signals directly as input. This enables the convolutional layers to learn feature representations rather than relying on handcrafted features such as MFCC. We report improvement on training and inference times by atleast a factor of 4x and 7.4x respectively with comparable WERs against standard RNN based baseline systems on SpeechOcean's multilingual low resource dataset. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=convolutional%20neural%20networks" title="convolutional neural networks">convolutional neural networks</a>, <a href="https://publications.waset.org/abstracts/search?q=language%20compatibility" title=" language compatibility"> language compatibility</a>, <a href="https://publications.waset.org/abstracts/search?q=low%20resource%20languages" title=" low resource languages"> low resource languages</a>, <a href="https://publications.waset.org/abstracts/search?q=multilingual%20automatic%20speech%20recognition" title=" multilingual automatic speech recognition"> multilingual automatic speech recognition</a> </p> <a href="https://publications.waset.org/abstracts/129912/reed-an-approach-towards-quickly-bootstrapping-multilingual-acoustic-models" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/129912.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">123</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">6</span> Comparison Study of Machine Learning Classifiers for Speech Emotion Recognition</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Aishwarya%20Ravindra%20Fursule">Aishwarya Ravindra Fursule</a>, <a href="https://publications.waset.org/abstracts/search?q=Shruti%20Kshirsagar"> Shruti Kshirsagar</a> </p> <p class="card-text"><strong>Abstract:</strong></p> In the intersection of artificial intelligence and human-centered computing, this paper delves into speech emotion recognition (SER). It presents a comparative analysis of machine learning models such as K-Nearest Neighbors (KNN),logistic regression, support vector machines (SVM), decision trees, ensemble classifiers, and random forests, applied to SER. The research employs four datasets: Crema D, SAVEE, TESS, and RAVDESS. It focuses on extracting salient audio signal features like Zero Crossing Rate (ZCR), Chroma_stft, Mel Frequency Cepstral Coefficients (MFCC), root mean square (RMS) value, and MelSpectogram. These features are used to train and evaluate the models’ ability to recognize eight types of emotions from speech: happy, sad, neutral, angry, calm, disgust, fear, and surprise. Among the models, the Random Forest algorithm demonstrated superior performance, achieving approximately 79% accuracy. This suggests its suitability for SER within the parameters of this study. The research contributes to SER by showcasing the effectiveness of various machine learning algorithms and feature extraction techniques. The findings hold promise for the development of more precise emotion recognition systems in the future. This abstract provides a succinct overview of the paper’s content, methods, and results. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=comparison" title="comparison">comparison</a>, <a href="https://publications.waset.org/abstracts/search?q=ML%20classifiers" title=" ML classifiers"> ML classifiers</a>, <a href="https://publications.waset.org/abstracts/search?q=KNN" title=" KNN"> KNN</a>, <a href="https://publications.waset.org/abstracts/search?q=decision%20tree" title=" decision tree"> decision tree</a>, <a href="https://publications.waset.org/abstracts/search?q=SVM" title=" SVM"> SVM</a>, <a href="https://publications.waset.org/abstracts/search?q=random%20forest" title=" random forest"> random forest</a>, <a href="https://publications.waset.org/abstracts/search?q=logistic%20regression" title=" logistic regression"> logistic regression</a>, <a href="https://publications.waset.org/abstracts/search?q=ensemble%20classifiers" title=" ensemble classifiers"> ensemble classifiers</a> </p> <a href="https://publications.waset.org/abstracts/185323/comparison-study-of-machine-learning-classifiers-for-speech-emotion-recognition" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/185323.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">45</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">5</span> Biosignal Recognition for Personal Identification</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Hadri%20Hussain">Hadri Hussain</a>, <a href="https://publications.waset.org/abstracts/search?q=M.Nasir%20Ibrahim"> M.Nasir Ibrahim</a>, <a href="https://publications.waset.org/abstracts/search?q=Chee-Ming%20Ting"> Chee-Ming Ting</a>, <a href="https://publications.waset.org/abstracts/search?q=Mariani%20Idroas"> Mariani Idroas</a>, <a href="https://publications.waset.org/abstracts/search?q=Fuad%20Numan"> Fuad Numan</a>, <a href="https://publications.waset.org/abstracts/search?q=Alias%20Mohd%20Noor"> Alias Mohd Noor</a> </p> <p class="card-text"><strong>Abstract:</strong></p> A biometric security system has become an important application in client identification and verification system. A conventional biometric system is normally based on unimodal biometric that depends on either behavioural or physiological information for authentication purposes. The behavioural biometric depends on human body biometric signal (such as speech) and biosignal biometric (such as electrocardiogram (ECG) and phonocardiogram or heart sound (HS)). The speech signal is commonly used in a recognition system in biometric, while the ECG and the HS have been used to identify a person’s diseases uniquely related to its cluster. However, the conventional biometric system is liable to spoof attack that will affect the performance of the system. Therefore, a multimodal biometric security system is developed, which is based on biometric signal of ECG, HS, and speech. The biosignal data involved in the biometric system is initially segmented, with each segment Mel Frequency Cepstral Coefficients (MFCC) method is exploited for extracting the feature. The Hidden Markov Model (HMM) is used to model the client and to classify the unknown input with respect to the modal. The recognition system involved training and testing session that is known as client identification (CID). In this project, twenty clients are tested with the developed system. The best overall performance at 44 kHz was 93.92% for ECG and the worst overall performance was ECG at 88.47%. The results were compared to the best overall performance at 44 kHz for (20clients) to increment of clients, which was 90.00% for HS and the worst overall performance falls at ECG at 79.91%. It can be concluded that the difference multimodal biometric has a substantial effect on performance of the biometric system and with the increment of data, even with higher frequency sampling, the performance still decreased slightly as predicted. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=electrocardiogram" title="electrocardiogram">electrocardiogram</a>, <a href="https://publications.waset.org/abstracts/search?q=phonocardiogram" title=" phonocardiogram"> phonocardiogram</a>, <a href="https://publications.waset.org/abstracts/search?q=hidden%20markov%20model" title=" hidden markov model"> hidden markov model</a>, <a href="https://publications.waset.org/abstracts/search?q=mel%20frequency%20cepstral%20coeffiecients" title=" mel frequency cepstral coeffiecients"> mel frequency cepstral coeffiecients</a>, <a href="https://publications.waset.org/abstracts/search?q=client%20identification" title=" client identification"> client identification</a> </p> <a href="https://publications.waset.org/abstracts/48382/biosignal-recognition-for-personal-identification" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/48382.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">280</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">4</span> Development of a Sequential Multimodal Biometric System for Web-Based Physical Access Control into a Security Safe</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Babatunde%20Olumide%20Olawale">Babatunde Olumide Olawale</a>, <a href="https://publications.waset.org/abstracts/search?q=Oyebode%20Olumide%20Oyediran"> Oyebode Olumide Oyediran</a> </p> <p class="card-text"><strong>Abstract:</strong></p> The security safe is a place or building where classified document and precious items are kept. To prevent unauthorised persons from gaining access to this safe a lot of technologies had been used. But frequent reports of an unauthorised person gaining access into security safes with the aim of removing document and items from the safes are pointers to the fact that there is still security gap in the recent technologies used as access control for the security safe. In this paper we try to solve this problem by developing a multimodal biometric system for physical access control into a security safe using face and voice recognition. The safe is accessed by the combination of face and speech pattern recognition and also in that sequential order. User authentication is achieved through the use of camera/sensor unit and a microphone unit both attached to the door of the safe. The user face was captured by the camera/sensor while the speech was captured by the use of the microphone unit. The Scale Invariance Feature Transform (SIFT) algorithm was used to train images to form templates for the face recognition system while the Mel-Frequency Cepitral Coefficients (MFCC) algorithm was used to train the speech recognition system to recognise authorise user’s speech. Both algorithms were hosted in two separate web based servers and for automatic analysis of our work; our developed system was simulated in a MATLAB environment. The results obtained shows that the developed system was able to give access to authorise users while declining unauthorised person access to the security safe. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=access%20control" title="access control">access control</a>, <a href="https://publications.waset.org/abstracts/search?q=multimodal%20biometrics" title=" multimodal biometrics"> multimodal biometrics</a>, <a href="https://publications.waset.org/abstracts/search?q=pattern%20recognition" title=" pattern recognition"> pattern recognition</a>, <a href="https://publications.waset.org/abstracts/search?q=security%20safe" title=" security safe"> security safe</a> </p> <a href="https://publications.waset.org/abstracts/73150/development-of-a-sequential-multimodal-biometric-system-for-web-based-physical-access-control-into-a-security-safe" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/73150.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">335</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">3</span> Speech Emotion Recognition: A DNN and LSTM Comparison in Single and Multiple Feature Application</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Thiago%20Spilborghs%20Bueno%20Meyer">Thiago Spilborghs Bueno Meyer</a>, <a href="https://publications.waset.org/abstracts/search?q=Plinio%20Thomaz%20Aquino%20Junior"> Plinio Thomaz Aquino Junior</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Through speech, which privileges the functional and interactive nature of the text, it is possible to ascertain the spatiotemporal circumstances, the conditions of production and reception of the discourse, the explicit purposes such as informing, explaining, convincing, etc. These conditions allow bringing the interaction between humans closer to the human-robot interaction, making it natural and sensitive to information. However, it is not enough to understand what is said; it is necessary to recognize emotions for the desired interaction. The validity of the use of neural networks for feature selection and emotion recognition was verified. For this purpose, it is proposed the use of neural networks and comparison of models, such as recurrent neural networks and deep neural networks, in order to carry out the classification of emotions through speech signals to verify the quality of recognition. It is expected to enable the implementation of robots in a domestic environment, such as the HERA robot from the RoboFEI@Home team, which focuses on autonomous service robots for the domestic environment. Tests were performed using only the Mel-Frequency Cepstral Coefficients, as well as tests with several characteristics of Delta-MFCC, spectral contrast, and the Mel spectrogram. To carry out the training, validation and testing of the neural networks, the eNTERFACE’05 database was used, which has 42 speakers from 14 different nationalities speaking the English language. The data from the chosen database are videos that, for use in neural networks, were converted into audios. It was found as a result, a classification of 51,969% of correct answers when using the deep neural network, when the use of the recurrent neural network was verified, with the classification with accuracy equal to 44.09%. The results are more accurate when only the Mel-Frequency Cepstral Coefficients are used for the classification, using the classifier with the deep neural network, and in only one case, it is possible to observe a greater accuracy by the recurrent neural network, which occurs in the use of various features and setting 73 for batch size and 100 training epochs. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=emotion%20recognition" title="emotion recognition">emotion recognition</a>, <a href="https://publications.waset.org/abstracts/search?q=speech" title=" speech"> speech</a>, <a href="https://publications.waset.org/abstracts/search?q=deep%20learning" title=" deep learning"> deep learning</a>, <a href="https://publications.waset.org/abstracts/search?q=human-robot%20interaction" title=" human-robot interaction"> human-robot interaction</a>, <a href="https://publications.waset.org/abstracts/search?q=neural%20networks" title=" neural networks"> neural networks</a> </p> <a href="https://publications.waset.org/abstracts/145908/speech-emotion-recognition-a-dnn-and-lstm-comparison-in-single-and-multiple-feature-application" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/145908.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">170</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">2</span> Classification of Emotions in Emergency Call Center Conversations</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Magdalena%20Igras">Magdalena Igras</a>, <a href="https://publications.waset.org/abstracts/search?q=Joanna%20Grzybowska"> Joanna Grzybowska</a>, <a href="https://publications.waset.org/abstracts/search?q=Mariusz%20Zi%C3%B3%C5%82ko"> Mariusz Ziółko</a> </p> <p class="card-text"><strong>Abstract:</strong></p> The study of emotions expressed in emergency phone call is presented, covering both statistical analysis of emotions configurations and an attempt to automatically classify emotions. An emergency call is a situation usually accompanied by intense, authentic emotions. They influence (and may inhibit) the communication between caller and responder. In order to support responders in their responsible and psychically exhaustive work, we studied when and in which combinations emotions appeared in calls. A corpus of 45 hours of conversations (about 3300 calls) from emergency call center was collected. Each recording was manually tagged with labels of emotions valence (positive, negative or neutral), type (sadness, tiredness, anxiety, surprise, stress, anger, fury, calm, relief, compassion, satisfaction, amusement, joy) and arousal (weak, typical, varying, high) on the basis of perceptual judgment of two annotators. As we concluded, basic emotions tend to appear in specific configurations depending on the overall situational context and attitude of speaker. After performing statistical analysis we distinguished four main types of emotional behavior of callers: worry/helplessness (sadness, tiredness, compassion), alarm (anxiety, intense stress), mistake or neutral request for information (calm, surprise, sometimes with amusement) and pretension/insisting (anger, fury). The frequency of profiles was respectively: 51%, 21%, 18% and 8% of recordings. A model of presenting the complex emotional profiles on the two-dimensional (tension-insecurity) plane was introduced. In the stage of acoustic analysis, a set of prosodic parameters, as well as Mel-Frequency Cepstral Coefficients (MFCC) were used. Using these parameters, complex emotional states were modeled with machine learning techniques including Gaussian mixture models, decision trees and discriminant analysis. Results of classification with several methods will be presented and compared with the state of the art results obtained for classification of basic emotions. Future work will include optimization of the algorithm to perform in real time in order to track changes of emotions during a conversation. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=acoustic%20analysis" title="acoustic analysis">acoustic analysis</a>, <a href="https://publications.waset.org/abstracts/search?q=complex%20emotions" title=" complex emotions"> complex emotions</a>, <a href="https://publications.waset.org/abstracts/search?q=emotion%20recognition" title=" emotion recognition"> emotion recognition</a>, <a href="https://publications.waset.org/abstracts/search?q=machine%20learning" title=" machine learning"> machine learning</a> </p> <a href="https://publications.waset.org/abstracts/23163/classification-of-emotions-in-emergency-call-center-conversations" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/23163.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">398</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">1</span> Emotion-Convolutional Neural Network for Perceiving Stress from Audio Signals: A Brain Chemistry Approach</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Anup%20Anand%20Deshmukh">Anup Anand Deshmukh</a>, <a href="https://publications.waset.org/abstracts/search?q=Catherine%20Soladie"> Catherine Soladie</a>, <a href="https://publications.waset.org/abstracts/search?q=Renaud%20Seguier"> Renaud Seguier</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Emotion plays a key role in many applications like healthcare, to gather patients’ emotional behavior. Unlike typical ASR (Automated Speech Recognition) problems which focus on 'what was said', it is equally important to understand 'how it was said.' There are certain emotions which are given more importance due to their effectiveness in understanding human feelings. In this paper, we propose an approach that models human stress from audio signals. The research challenge in speech emotion detection is finding the appropriate set of acoustic features corresponding to an emotion. Another difficulty lies in defining the very meaning of emotion and being able to categorize it in a precise manner. Supervised Machine Learning models, including state of the art Deep Learning classification methods, rely on the availability of clean and labelled data. One of the problems in affective computation is the limited amount of annotated data. The existing labelled emotions datasets are highly subjective to the perception of the annotator. We address the first issue of feature selection by exploiting the use of traditional MFCC (Mel-Frequency Cepstral Coefficients) features in Convolutional Neural Network. Our proposed Emo-CNN (Emotion-CNN) architecture treats speech representations in a manner similar to how CNN’s treat images in a vision problem. Our experiments show that Emo-CNN consistently and significantly outperforms the popular existing methods over multiple datasets. It achieves 90.2% categorical accuracy on the Emo-DB dataset. We claim that Emo-CNN is robust to speaker variations and environmental distortions. The proposed approach achieves 85.5% speaker-dependant categorical accuracy for SAVEE (Surrey Audio-Visual Expressed Emotion) dataset, beating the existing CNN based approach by 10.2%. To tackle the second problem of subjectivity in stress labels, we use Lovheim’s cube, which is a 3-dimensional projection of emotions. Monoamine neurotransmitters are a type of chemical messengers in the brain that transmits signals on perceiving emotions. The cube aims at explaining the relationship between these neurotransmitters and the positions of emotions in 3D space. The learnt emotion representations from the Emo-CNN are mapped to the cube using three component PCA (Principal Component Analysis) which is then used to model human stress. This proposed approach not only circumvents the need for labelled stress data but also complies with the psychological theory of emotions given by Lovheim’s cube. We believe that this work is the first step towards creating a connection between Artificial Intelligence and the chemistry of human emotions. <p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=deep%20learning" title="deep learning">deep learning</a>, <a href="https://publications.waset.org/abstracts/search?q=brain%20chemistry" title=" brain chemistry"> brain chemistry</a>, <a href="https://publications.waset.org/abstracts/search?q=emotion%20perception" title=" emotion perception"> emotion perception</a>, <a href="https://publications.waset.org/abstracts/search?q=Lovheim%27s%20cube" title=" Lovheim's cube"> Lovheim's cube</a> </p> <a href="https://publications.waset.org/abstracts/101103/emotion-convolutional-neural-network-for-perceiving-stress-from-audio-signals-a-brain-chemistry-approach" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/101103.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">154</span> </span> </div> </div> </div> </main> <footer> <div id="infolinks" class="pt-3 pb-2"> <div class="container"> <div style="background-color:#f5f5f5;" class="p-3"> <div class="row"> <div class="col-md-2"> <ul class="list-unstyled"> About <li><a href="https://waset.org/page/support">About Us</a></li> <li><a href="https://waset.org/page/support#legal-information">Legal</a></li> <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/WASET-16th-foundational-anniversary.pdf">WASET celebrates its 16th foundational anniversary</a></li> </ul> </div> <div class="col-md-2"> <ul class="list-unstyled"> Account <li><a href="https://waset.org/profile">My Account</a></li> </ul> </div> <div class="col-md-2"> <ul class="list-unstyled"> Explore <li><a href="https://waset.org/disciplines">Disciplines</a></li> <li><a href="https://waset.org/conferences">Conferences</a></li> <li><a href="https://waset.org/conference-programs">Conference Program</a></li> <li><a href="https://waset.org/committees">Committees</a></li> <li><a href="https://publications.waset.org">Publications</a></li> </ul> </div> <div class="col-md-2"> <ul class="list-unstyled"> Research <li><a href="https://publications.waset.org/abstracts">Abstracts</a></li> <li><a href="https://publications.waset.org">Periodicals</a></li> <li><a href="https://publications.waset.org/archive">Archive</a></li> </ul> </div> <div class="col-md-2"> <ul class="list-unstyled"> Open Science <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/Open-Science-Philosophy.pdf">Open Science Philosophy</a></li> <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/Open-Science-Award.pdf">Open Science Award</a></li> <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/Open-Society-Open-Science-and-Open-Innovation.pdf">Open Innovation</a></li> <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/Postdoctoral-Fellowship-Award.pdf">Postdoctoral Fellowship Award</a></li> <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/Scholarly-Research-Review.pdf">Scholarly Research Review</a></li> </ul> </div> <div class="col-md-2"> <ul class="list-unstyled"> Support <li><a href="https://waset.org/page/support">Support</a></li> <li><a href="https://waset.org/profile/messages/create">Contact Us</a></li> <li><a href="https://waset.org/profile/messages/create">Report Abuse</a></li> </ul> </div> </div> </div> </div> </div> <div class="container text-center"> <hr style="margin-top:0;margin-bottom:.3rem;"> <a href="https://creativecommons.org/licenses/by/4.0/" target="_blank" class="text-muted small">Creative Commons Attribution 4.0 International License</a> <div id="copy" class="mt-2">© 2024 World Academy of Science, Engineering and Technology</div> </div> </footer> <a href="javascript:" id="return-to-top"><i class="fas fa-arrow-up"></i></a> <div class="modal" id="modal-template"> <div class="modal-dialog"> <div class="modal-content"> <div class="row m-0 mt-1"> <div class="col-md-12"> <button type="button" class="close" data-dismiss="modal" aria-label="Close"><span aria-hidden="true">×</span></button> </div> </div> <div class="modal-body"></div> </div> </div> </div> <script src="https://cdn.waset.org/static/plugins/jquery-3.3.1.min.js"></script> <script src="https://cdn.waset.org/static/plugins/bootstrap-4.2.1/js/bootstrap.bundle.min.js"></script> <script src="https://cdn.waset.org/static/js/site.js?v=150220211556"></script> <script> jQuery(document).ready(function() { /*jQuery.get("https://publications.waset.org/xhr/user-menu", function (response) { jQuery('#mainNavMenu').append(response); });*/ jQuery.get({ url: "https://publications.waset.org/xhr/user-menu", cache: false }).then(function(response){ jQuery('#mainNavMenu').append(response); }); }); </script> </body> </html>