{"title":"Speech Intelligibility Improvement Using Variable Level Decomposition DWT","authors":"Samba Raju Chiluveru, Manoj Tripathy","volume":157,"journal":"International Journal of Electronics and Communication Engineering","pagesStart":23,"pagesEnd":27,"ISSN":"1307-6892","URL":"https:\/\/publications.waset.org\/pdf\/10011030","abstract":"Intelligibility is an essential characteristic of a speech signal: it determines how well the information in the signal can be understood. Background noise in the environment can degrade the intelligibility of recorded speech. In this paper, we present a simple variance-subtracted, variable-level discrete wavelet transform that improves the intelligibility of speech. The proposed algorithm does not require an explicit estimate of the noise, i.e., prior knowledge of the noise; hence, it is easy to implement and reduces the computational burden. The proposed algorithm selects a separate decomposition level for each frame based on signal-dominant and noise-dominant criteria. The performance of the proposed algorithm is evaluated with the short-time objective intelligibility (STOI) measure, and the results are compared with universal Discrete Wavelet Transform (DWT) thresholding and Minimum Mean Square Error (MMSE) methods. The experimental results show that the proposed scheme outperforms the competing methods.","references":"[1] P. C. Loizou, Speech Enhancement: Theory and Practice. Boca Raton, FL, USA: CRC Press, 2007.\r\n[2] Y. Ephraim and D. Malah, \u201cSpeech enhancement using a Minimum-Mean Square Error Short-Time Spectral Amplitude estimator,\u201d IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 32, no. 6, pp. 1109\u20131121, 1984.\r\n[3] S. G. 
Mallat, \u201cA Theory for Multiresolution Signal Decomposition: The Wavelet Representation,\u201d IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 11, no. 7, pp. 674\u2013693, 1989.\r\n[4] G. Kim and P. C. Loizou, \u201cImproving Speech Intelligibility in Noise using Environment-Optimized Algorithms,\u201d IEEE Transactions on Audio, Speech, and Language Processing, vol. 18, no. 8, pp. 2080\u20132090, 2010.\r\n[5] P. C. Loizou and G. Kim, \u201cReasons Why Current Speech-Enhancement Algorithms do not Improve Speech Intelligibility and Suggested Solutions,\u201d IEEE Transactions on Audio, Speech, and Language Processing, vol. 19, no. 1, pp. 47\u201356, 2010.\r\n[6] D. Wang and J. Chen, \u201cSupervised speech separation based on deep learning: An overview,\u201d IEEE Transactions on Audio, Speech, and Language Processing, vol. 26, no. 10, pp. 1702\u20131726, 2018.\r\n[7] M. Kolb\u00e6k, Z.-H. Tan, and J. Jensen, \u201cSpeech Intelligibility Potential of General and Specialized Deep Neural Network based Speech Enhancement Systems,\u201d IEEE Transactions on Audio, Speech, and Language Processing, vol. 25, no. 1, pp. 153\u2013167, 2017.\r\n[8] S. Y. Low, D. S. Pham, and S. Venkatesh, \u201cCompressive Speech Enhancement,\u201d Speech Communication, vol. 55, no. 6, pp. 757\u2013768, 2013.\r\n[9] M. Srivastava, C. L. Anderson, and J. H. Freed, \u201cA New Wavelet Denoising Method for Selecting Decomposition Levels and Noise Thresholds,\u201d IEEE Access, vol. 4, pp. 3862\u20133877, 2016.\r\n[10] J. S. Garofolo et al., \u201cGetting started with the DARPA TIMIT CD-ROM: An acoustic phonetic continuous speech database,\u201d National Institute of Standards and Technology (NIST), Gaithersburg, MD, vol. 107, pp. 1\u20136, 1988.\r\n[11] A. Varga and H. J. Steeneken, \u201cAssessment for Automatic Speech Recognition: II. 
NOISEX-92: A Database and an Experiment to Study the Effect of Additive Noise on Speech Recognition Systems,\u201d Speech Communication, vol. 12, no. 3, pp. 247\u2013251, 1993.\r\n[12] D. L. Donoho and I. M. Johnstone, \u201cIdeal Spatial Adaptation by Wavelet Shrinkage,\u201d Biometrika, vol. 81, no. 3, pp. 425\u2013455, 1994.\r\n[13] D. L. Donoho, \u201cDe-noising by soft-thresholding,\u201d IEEE Transactions on Information Theory, vol. 41, no. 3, pp. 613\u2013627, 1995.\r\n[14] C. H. Taal, R. C. Hendriks, R. Heusdens, and J. Jensen, \u201cAn Algorithm for Intelligibility Prediction of Time\u2013Frequency Weighted Noisy Speech,\u201d IEEE Transactions on Audio, Speech, and Language Processing, vol. 19, no. 7, pp. 2125\u20132136, 2011.","publisher":"World Academy of Science, Engineering and Technology","index":"Open Science Index 157, 2020"}