<!DOCTYPE html> <html> <head> <meta charset="UTF-8"> <title>ISCA Archive</title> <meta name="viewport" content="width=device-width, initial-scale=1"> <link rel="stylesheet" href="../resources/jquery.dataTables.min.css"> <script src="../resources/jquery-3.5.1.min.js"></script> <script src="../resources/jquery.dataTables.min.js"></script> <script src="../resources/accent-neutralise.js"></script>  <link rel="stylesheet" href="../resources/fontawesome-free-subset/style.css">  <link rel="stylesheet" href="../resources/w3.css"> <link rel="stylesheet" href="../resources/w3-theme-blue.css"> <script src="../resources/w3.js"></script>  <link rel="stylesheet" href="../resources/is.css"> </head> <body>  <div class="w3-top w3-hide-small"> <div class="w3-bar w3-grayscale-min w3-theme-d4 w3-center"> <a href="../../index.html" class="w3-bar-item w3-button w3-theme-d2 w3-mobile"> <i class="icon-home w3-margin-right"></i>ISCA </a> <a href="../index.html" class="w3-bar-item w3-button w3-mobile">Archive</a> <a href="#" class="w3-bar-item w3-button w3-mobile">Odyssey 2018</a> <a class="w3-bar-item w3-button w3-mobile" onclick="document.getElementById('sessionchooser').style.display='block'">Sessions </a> <a href="#bypaper" class="w3-bar-item w3-button w3-mobile"><i class="icon-search" style='margin-right:5px'></i>Search</a> <a href="http://www.odyssey2018.org" class="w3-bar-item w3-button w3-mobile w3-right">Website</a> <a href="odyssey_2018.pdf" class="w3-bar-item w3-button w3-mobile w3-right">Booklet</a> </div> </div>  <div class="w3-hide-large w3-hide-medium"> <div class="w3-bar w3-grayscale-min w3-theme-d4 w3-center"> <span class="w3-bar-item w3-button w3-mobile w3-opacity-max"> </span> <a href="../../index.html" class="w3-bar-item w3-button w3-mobile"> <i class="icon-home w3-margin-right"></i>ISCA </a> <a href="../index.html" class="w3-bar-item w3-button w3-mobile">Archive</a> <a href="#" class="w3-bar-item w3-button w3 w3-mobile" 
onclick="document.getElementById('sessionchooser').style.display='block'">Sessions </a> <a href="#bypaper" class="w3-bar-item w3-button w3-mobile"><i class="icon-search" style='margin-right:5px'></i>Search</a> <a href="http://www.odyssey2018.org" class="w3-bar-item w3-button w3-mobile w3-right">Website</a> <a href="odyssey_2018.pdf" class="w3-bar-item w3-button w3-mobile w3-right">Booklet</a> </div> </div>  <div id="help_papers" class="w3-modal"> <div class="w3-modal-content w3-card-4 w3-greyscale w3-theme-d4 w3-padding w3-bordered"> <div class="w3-container"> <span onclick="document.getElementById('help_papers').style.display='none'" class="w3-button w3-display-topright">×</span> <div class="w3-container"> <p class="w3-text">Click on column names to sort.</p> <p class="w3-text">Searching uses the 'and' of terms e.g. <span class='w3-monospace'>Smith Interspeech</span> matches all papers by Smith in any Interspeech. The order of terms is not significant.</p> <p class="w3-text">Use double quotes for exact phrasal matches e.g. <span class='w3-monospace'>"acoustic features"</span>.</p> <p class="w3-text">Case is ignored.</p> <p class="w3-text">Diacritics are optional e.g. 
<span class='w3-monospace'>lefevre</span> also matches <span class='w3-monospace'>lefèvre</span> (but not vice versa).</p> <p class="w3-text">It can be useful to turn off spell-checking for the search box in your browser preferences.</p> <p class="w3-text">If you prefer to scroll rather than page, increase the number in the show entries dropdown.</p> </div> </div> </div> </div> <div class="w3-top w3-hide-medium w3-hide-large"> <div class="w3-bar w3-grayscale-min w3-theme-d4 w3-opacity-max"> <a href="#" class="w3-bar-item w3-button w3-theme-d2 w3-left">top</a> </div> </div> <div class="w3-grayscale w3-theme-l5">  <div class="w3-container" id="about"> <div class="w3-content" style="max-width:1100px;margin-top:50px; margin-bottom: 10px"> <h2 class="w3-center w3-padding-16"> <span class="w3-text">The Speaker and Language Recognition Workshop</span> </h2> <h5 class="w3-text w3-center"> Les Sables d'Olonne, France<br> 26-29 June 2018</h5> <br> <h5 class="w3-text w3-center">Chairs: Anthony Larcher and Jean-François Bonastre</h5> <pre class="w3-text w3-center">doi: 10.21437/Odyssey.2018</pre> </div> </div>  <div class="w3-container"> <div class="w3-content" style="max-width:1200px;margin-top: 10px"> <div class="w3-content" style="height:10px" id="Keynote: Els Kindt"></div> <div class="w3-card w3-round w3-white w3-padding"> <div class="w3-container" style="margin-top:40px"> <h4 class="w3-center">Keynote: Els Kindt</h4> <hr> <a class="w3-text" href="kindt18_odyssey.html"> <p> Speaker identification and Data protection <br> <span class="w3-text w3-text-theme"> Els Kindt </span> </p> </a> </div> </div> <br> <div class="w3-content" style="height:10px" id="Speaker Recognition I"></div> <div class="w3-card w3-round w3-white w3-padding"> <div class="w3-container" style="margin-top:40px"> <h4 class="w3-center">Speaker Recognition I</h4> <hr> <a class="w3-text" href="ajili18_odyssey.html"> <p> Impact of rhythm on forensic voice comparison reliability <br> <span class="w3-text 
w3-text-theme"> Moez Ajili, Solange Rossato, Dan Zhang, Jean-François Bonastre </span> </p> </a> <a class="w3-text" href="brown18_odyssey.html"> <p> Segmental Content Effects on Text-dependent Automatic Accent Recognition <br> <span class="w3-text w3-text-theme"> Georgina Brown </span> </p> </a> <a class="w3-text" href="nautsch18_odyssey.html"> <p> Homomorphic Encryption for Speaker Recognition: Protection of Biometric Templates and Vendor Model Parameters <br> <span class="w3-text w3-text-theme"> Andreas Nautsch, Sergey Isadskiy, Jascha Kolberg, Marta Gomez-Barrero, Christoph Busch </span> </p> </a> <a class="w3-text" href="karu18_odyssey.html"> <p> Weakly Supervised Training of Speaker Identification Models <br> <span class="w3-text w3-text-theme"> Martin Karu, Tanel Alumäe </span> </p> </a> </div> </div> <br> <div class="w3-content" style="height:10px" id="Language Recognition"></div> <div class="w3-card w3-round w3-white w3-padding"> <div class="w3-container" style="margin-top:40px"> <h4 class="w3-center">Language Recognition</h4> <hr> <a class="w3-text" href="padi18_odyssey.html"> <p> The LEAP Language Recognition System for LRE 2017 Challenge - Improvements and Error Analysis <br> <span class="w3-text w3-text-theme"> Bharat Padi, Shreyas Ramoji, Vaishnavi Yeruva, Satish Kumar, Sriram Ganapathy </span> </p> </a> <a class="w3-text" href="lozanodiez18_odyssey.html"> <p> Analysis of DNN-based Embeddings for Language Recognition on the NIST LRE 2017 <br> <span class="w3-text w3-text-theme"> Alicia Lozano-Diez, Oldrich Plchot, Pavel Matejka, Ondrej Novotny, Joaquin Gonzalez-Rodriguez </span> </p> </a> <a class="w3-text" href="plchot18_odyssey.html"> <p> Analysis of BUT-PT Submission for NIST LRE 2017 <br> <span class="w3-text w3-text-theme"> Oldřich Plchot, Pavel Matějka, Ondřej Novotný, Sandro Cumani, Alicia Lozano-Diez, Josef Slavíček, Mireia Diez, František Grézl, Ondřej Glembek, Mounika Kamsali, Anna Silnova, Lukáš Burget, Lucas Ondel, Santosh Kesiraju, Johan 
Rohdin </span> </p> </a> <a class="w3-text" href="richardson18_odyssey.html"> <p> The MIT Lincoln Laboratory / JHU / EPITA-LSE LRE17 System <br> <span class="w3-text w3-text-theme"> Fred Richardson, Pedro Torres-Carrasquillo, Jonas Borgstrom, Douglas Sturim, Youngjune Gwon, Jesus Villalba, Jan Trmal, Nanxin Chen, Reda Dehak, Najim Dehak </span> </p> </a> <a class="w3-text" href="trong18_odyssey.html"> <p> Staircase Network: structural language identification via hierarchical attentive units <br> <span class="w3-text w3-text-theme"> Trung Ngo Trong, Ville Hautamaki, Kristiina Jokinen </span> </p> </a> <a class="w3-text" href="mccree18_odyssey.html"> <p> Language Recognition for Telephone and Video Speech: The JHU HLTCOE Submission for NIST LRE17 <br> <span class="w3-text w3-text-theme"> Alan Mccree, David Snyder, Greg Sell, Daniel Garcia-Romero </span> </p> </a> <a class="w3-text" href="cai18_odyssey.html"> <p> Exploring the Encoding Layer and Loss Function in End-to-End Speaker and Language Recognition System <br> <span class="w3-text w3-text-theme"> Weicheng Cai, Jinkun Chen, Ming Li </span> </p> </a> <a class="w3-text" href="sadjadi18_odyssey.html"> <p> The 2017 NIST Language Recognition Evaluation <br> <span class="w3-text w3-text-theme"> Seyed Omid Sadjadi, Timothee Kheyrkhah, Audrey Tong, Craig Greenberg, Douglas Reynolds, Elliot Singer, Lisa Mason, Jaime Hernandez-Cordero </span> </p> </a> <a class="w3-text" href="mclaren18b_odyssey.html"> <p> Approaches to Multi-domain Language Recognition <br> <span class="w3-text w3-text-theme"> Mitchell Mclaren, Mahesh Kumar Nandwana, Diego Castán, Luciana Ferrer </span> </p> </a> <a class="w3-text" href="shon18_odyssey.html"> <p> Convolutional Neural Network and Language Embeddings for End-to-End Dialect Recognition <br> <span class="w3-text w3-text-theme"> Suwon Shon, Ahmed Ali, James Glass </span> </p> </a> <a class="w3-text" href="snyder18_odyssey.html"> <p> Spoken Language Recognition using X-vectors <br> <span 
class="w3-text w3-text-theme"> David Snyder, Daniel Garcia-Romero, Alan McCree, Gregory Sell, Daniel Povey, Sanjeev Khudanpur </span> </p> </a> <a class="w3-text" href="lopez18_odyssey.html"> <p> End-to-End versus Embedding Neural Networks for Language Recognition in Mismatched Conditions <br> <span class="w3-text w3-text-theme"> Jesus Antonio Villalba Lopez, Niko Brummer, Najim Dehak </span> </p> </a> </div> </div> <br> <div class="w3-content" style="height:10px" id="Speaker diarization"></div> <div class="w3-card w3-round w3-white w3-padding"> <div class="w3-container" style="margin-top:40px"> <h4 class="w3-center">Speaker diarization</h4> <hr> <a class="w3-text" href="alonilavi18_odyssey.html"> <p> Incremental On-Line Clustering of Speakers' Short Segments <br> <span class="w3-text w3-text-theme"> Ruth Aloni-Lavi, Irit Opher, Itshak Lapidot </span> </p> </a> <a class="w3-text" href="he18_odyssey.html"> <p> Latent Class Model for Single Channel Speaker Diarization <br> <span class="w3-text w3-text-theme"> Liang He, Xianhong Chen, Can Xu, Jia Liu </span> </p> </a> <a class="w3-text" href="chen18_odyssey.html"> <p> VB-HMM Speaker Diarization with Enhanced and Refined Segment Representation <br> <span class="w3-text w3-text-theme"> Xianhong Chen, Liang He, Can Xu, Yi Liu, Tianyu Liang, Jia Liu </span> </p> </a> <a class="w3-text" href="patino18_odyssey.html"> <p> Low-latency speaker spotting with online diarization and detection <br> <span class="w3-text w3-text-theme"> Jose Patino, Ruiqing Yin, Héctor Delgado, Hervé Bredin, Alain Komaty, Guillaume Wisniewski, Claude Barras, Nicholas Evans, Sébastien Marcel </span> </p> </a> <a class="w3-text" href="diez18_odyssey.html"> <p> Speaker Diarization based on Bayesian HMM with Eigenvoice Priors <br> <span class="w3-text w3-text-theme"> Mireia Diez, Lukas Burget, Pavel Matejka </span> </p> </a> </div> </div> <br> <div class="w3-content" style="height:10px" id="Noise Robustness"></div> <div class="w3-card w3-round w3-white 
w3-padding"> <div class="w3-container" style="margin-top:40px"> <h4 class="w3-center">Noise Robustness</h4> <hr> <a class="w3-text" href="rahman18_odyssey.html"> <p> Domain-invariant I-vector Feature Extraction for PLDA Speaker Verification <br> <span class="w3-text w3-text-theme"> Md Hafizur Rahman, Ivan Himawan, David Dean, Clinton Fookes, Sridha Sridharan </span> </p> </a> <a class="w3-text" href="lin18_odyssey.html"> <p> Reducing Domain Mismatch by Maximum Mean Discrepancy Based Autoencoders <br> <span class="w3-text w3-text-theme"> Weiwei Lin, Man-Wai Mak, Longxin Li, Jen-Tzung Chien </span> </p> </a> <a class="w3-text" href="novotny18_odyssey.html"> <p> On the use of X-vectors for Robust Speaker Recognition <br> <span class="w3-text w3-text-theme"> Ondřej Novotný, Oldřich Plchot, Pavel Matějka, Ladislav Mošner, Ondřej Glembek </span> </p> </a> <a class="w3-text" href="alam18_odyssey.html"> <p> Speaker Verification in Mismatched Conditions with Frustratingly Easy Domain Adaptation <br> <span class="w3-text w3-text-theme"> Md Jahangir Alam, Gautam Bhattacharya, Patrick Kenny </span> </p> </a> <a class="w3-text" href="zhang18_odyssey.html"> <p> An Analysis of Transfer Learning for Domain Mismatched Text-independent Speaker Verification <br> <span class="w3-text w3-text-theme"> Chunlei Zhang, Shivesh Ranjan, John Hansen </span> </p> </a> </div> </div> <br> <div class="w3-content" style="height:10px" id="Keynote: Simon King"></div> <div class="w3-card w3-round w3-white w3-padding"> <div class="w3-container" style="margin-top:40px"> <h4 class="w3-center">Keynote: Simon King</h4> <hr> <a class="w3-text" href="king18_odyssey.html"> <p> Speaking naturally? It depends who is listening. 
<br> <span class="w3-text w3-text-theme"> Simon King </span> </p> </a> </div> </div> <br> <div class="w3-content" style="height:10px" id="Voice conversion"></div> <div class="w3-card w3-round w3-white w3-padding"> <div class="w3-container" style="margin-top:40px"> <h4 class="w3-center">Voice conversion</h4> <hr> <a class="w3-text" href="kinnunen18_odyssey.html"> <p> A Spoofing Benchmark for the 2018 Voice Conversion Challenge: Leveraging from Spoofing Countermeasures for Speech Artifact Assessment <br> <span class="w3-text w3-text-theme"> Tomi Kinnunen, Jaime Lorenzo-Trueba, Junichi Yamagishi, Tomoki Toda, Daisuke Saito, Fernando Villavicencio, Zhenhua Ling </span> </p> </a> <a class="w3-text" href="lorenzotrueba18_odyssey.html"> <p> The Voice Conversion Challenge 2018: Promoting Development of Parallel and Nonparallel Methods <br> <span class="w3-text w3-text-theme"> Jaime Lorenzo-Trueba, Junichi Yamagishi, Tomoki Toda, Daisuke Saito, Fernando Villavicencio, Tomi Kinnunen, Zhenhua Ling </span> </p> </a> <a class="w3-text" href="kobayashi18_odyssey.html"> <p> sprocket: Open-Source Voice Conversion Software <br> <span class="w3-text w3-text-theme"> Kazuhiro Kobayashi, Tomoki Toda </span> </p> </a> </div> </div> <br> <div class="w3-content" style="height:10px" id="Voice conversion and spoofing"></div> <div class="w3-card w3-round w3-white w3-padding"> <div class="w3-container" style="margin-top:40px"> <h4 class="w3-center">Voice conversion and spoofing</h4> <hr> <a class="w3-text" href="wu18_odyssey.html"> <p> The NU Non-Parallel Voice Conversion System for the Voice Conversion Challenge 2018 <br> <span class="w3-text w3-text-theme"> Yichiao Wu, Patrick Lumban Tobing, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda </span> </p> </a> <a class="w3-text" href="tobing18_odyssey.html"> <p> NU Voice Conversion System for the Voice Conversion Challenge 2018 <br> <span class="w3-text w3-text-theme"> Patrick Lumban Tobing, Yichiao Wu, Tomoki Hayashi, Kazuhiro Kobayashi, 
Tomoki Toda </span> </p> </a> <a class="w3-text" href="tian18_odyssey.html"> <p> Average Modeling Approach to Voice Conversion with Non-Parallel Data <br> <span class="w3-text w3-text-theme"> Xiaohai Tian, Junchao Wang, Haihua Xu, Eng-Siong Chng, Haizhou Li </span> </p> </a> <a class="w3-text" href="mochizuki18_odyssey.html"> <p> Voice liveness detection using phoneme-based pop-noise detector for speaker verification <br> <span class="w3-text w3-text-theme"> Shihono Mochizuki, Sayaka Shiota, Hitoshi Kiya </span> </p> </a> <a class="w3-text" href="lorenzotrueba18b_odyssey.html"> <p> Can we steal your vocal identity from the Internet?: Initial investigation of cloning Obama’s voice using GAN, WaveNet and low-quality found data <br> <span class="w3-text w3-text-theme"> Jaime Lorenzo-Trueba, Fuming Fang, Xin Wang, Isao Echizen, Junichi Yamagishi, Tomi Kinnunen </span> </p> </a> <a class="w3-text" href="liu18_odyssey.html"> <p> The HCCL-CUHK System for the Voice Conversion Challenge 2018 <br> <span class="w3-text w3-text-theme"> Songxiang Liu, Lifa Sun, Xixin Wu, Xunying Liu, Helen Meng </span> </p> </a> <a class="w3-text" href="bahmaninezhad18_odyssey.html"> <p> Convolutional Neural Network Based Speaker De-Identification <br> <span class="w3-text w3-text-theme"> Fahimeh Bahmaninezhad, Chunlei Zhang, John Hansen </span> </p> </a> <a class="w3-text" href="sone18_odyssey.html"> <p> Bidirectional Voice Conversion Based on Joint Training Using Gaussian-Gaussian Deep Relational Model <br> <span class="w3-text w3-text-theme"> Kentaro Sone, Shinji Takaki, Toru Nakashika </span> </p> </a> <a class="w3-text" href="sisman18_odyssey.html"> <p> Phonetically Aware Exemplar-Based Prosody Transformation <br> <span class="w3-text w3-text-theme"> Berrak Sisman, Grandee Lee, Haizhou Li </span> </p> </a> <a class="w3-text" href="kato18_odyssey.html"> <p> A Regression Model of Recurrent Deep Neural Networks for Noise Robust Estimation of the Fundamental Frequency Contour of Speech <br> 
<span class="w3-text w3-text-theme"> Akihiro Kato, Tomi Kinnunen </span> </p> </a> <a class="w3-text" href="silnova18_odyssey.html"> <p> BUT/Phonexia Bottleneck Feature Extractor <br> <span class="w3-text w3-text-theme"> Anna Silnova, Pavel Matejka, Ondrej Glembek, Oldrich Plchot, Ondrej Novotny, Frantisek Grezl, Petr Schwarz, Lukas Burget, Jan Cernocky </span> </p> </a> </div> </div> <br> <div class="w3-content" style="height:10px" id="Spoofing"></div> <div class="w3-card w3-round w3-white w3-padding"> <div class="w3-container" style="margin-top:40px"> <h4 class="w3-center">Spoofing</h4> <hr> <a class="w3-text" href="valenti18b_odyssey.html"> <p> An end-to-end spoofing countermeasure for automatic speaker verification using evolving recurrent neural networks <br> <span class="w3-text w3-text-theme"> Giacomo Valenti, Héctor Delgado, Massimiliano Todisco, Nicholas Evans, Laurent Pilati </span> </p> </a> <a class="w3-text" href="delgado18_odyssey.html"> <p> ASVspoof 2017 Version 2.0: meta-data analysis and baseline enhancements <br> <span class="w3-text w3-text-theme"> Héctor Delgado, Massimiliano Todisco, Md Sahidullah, Nicholas Evans, Tomi Kinnunen, Kong Aik Lee, Junichi Yamagishi </span> </p> </a> <a class="w3-text" href="gonzalezrodriguez18_odyssey.html"> <p> An Audio Fingerprinting Approach to Replay Attack Detection on ASVSPOOF 2017 Challenge Data <br> <span class="w3-text w3-text-theme"> Joaquin Gonzalez-Rodriguez, Alvaro Escudero, Diego de Benito-Gorrón, Beltran Labrador, Javier Franco-Pedroso </span> </p> </a> <a class="w3-text" href="kinnunen18b_odyssey.html"> <p> t-DCF: a Detection Cost Function for the Tandem Assessment of Spoofing Countermeasures and Automatic Speaker Verification <br> <span class="w3-text w3-text-theme"> Tomi Kinnunen, Kong Aik Lee, Hector Delgado, Nicholas Evans, Massimiliano Todisco, Md Sahidullah, Junichi Yamagishi, Douglas A. 
Reynolds </span> </p> </a> <a class="w3-text" href="hautamaki18_odyssey.html"> <p> Perceptual Evaluation of the Effectiveness of Voice Disguise by Age Modification <br> <span class="w3-text w3-text-theme"> Rosa Gonzalez Hautamäki, Anssi Kanervisto, Ville Hautamaki, Tomi Kinnunen </span> </p> </a> </div> </div> <br> <div class="w3-content" style="height:10px" id="Keynote: Pascal Belin"></div> <div class="w3-card w3-round w3-white w3-padding"> <div class="w3-container" style="margin-top:40px"> <h4 class="w3-center">Keynote: Pascal Belin</h4> <hr> <a class="w3-text" href="belin18_odyssey.html"> <p> A Vocal Brain: Cerebral Processing of Voice Information <br> <span class="w3-text w3-text-theme"> Pascal Belin </span> </p> </a> </div> </div> <br> <div class="w3-content" style="height:10px" id="Speaker recognition II"></div> <div class="w3-card w3-round w3-white w3-padding"> <div class="w3-container" style="margin-top:40px"> <h4 class="w3-center">Speaker recognition II</h4> <hr> <a class="w3-text" href="mclaren18_odyssey.html"> <p> How to train your speaker embeddings extractor <br> <span class="w3-text w3-text-theme"> Mitchell Mclaren, Diego Castán, Mahesh Kumar Nandwana, Luciana Ferrer, Emre Yilmaz </span> </p> </a> <a class="w3-text" href="valenti18_odyssey.html"> <p> End-to-end automatic speaker verification with evolving recurrent neural networks <br> <span class="w3-text w3-text-theme"> Giacomo Valenti, Adrien Daniel, Nicholas Evans </span> </p> </a> <a class="w3-text" href="chien18_odyssey.html"> <p> Adversarial Learning and Augmentation for Speaker Recognition <br> <span class="w3-text w3-text-theme"> Jen-Tzung Chien, Kang-Ting Peng </span> </p> </a> <a class="w3-text" href="brummer18_odyssey.html"> <p> Gaussian meta-embeddings for efficient scoring of a heavy-tailed PLDA model <br> <span class="w3-text w3-text-theme"> Niko Brummer, Anna Silnova, Lukas Burget, Themos Stafylakis </span> </p> </a> <a class="w3-text" href="vestman18_odyssey.html"> <p> Supervector 
Compression Strategies to Speed up I-Vector System Development <br> <span class="w3-text w3-text-theme"> Ville Vestman, Tomi Kinnunen </span> </p> </a> </div> </div> <br> <div class="w3-content" style="height:10px" id="Text-dependent speaker recognition"></div> <div class="w3-card w3-round w3-white w3-padding"> <div class="w3-container" style="margin-top:40px"> <h4 class="w3-center">Text-dependent speaker recognition</h4> <hr> <a class="w3-text" href="shi18_odyssey.html"> <p> A Double Joint Bayesian Approach for J-Vector Based Text-dependent Speaker Verification <br> <span class="w3-text w3-text-theme"> Ziqiang Shi, Mengjiao Wang, Liu Liu, Huibin Lin, Rujie Liu </span> </p> </a> <a class="w3-text" href="zeinali18_odyssey.html"> <p> Spoken Pass-Phrase Verification in the i-vector Space <br> <span class="w3-text w3-text-theme"> Hossein Zeinali, Lukas Burget, Hossein Sameti, Honza Cernocky </span> </p> </a> <a class="w3-text" href="novoselov18_odyssey.html"> <p> On deep speaker embeddings for text-independent speaker recognition <br> <span class="w3-text w3-text-theme"> Sergey Novoselov, Andrey Shulipa, Ivan Kremnev, Alexandr Kozlov, Vadim Shchemelinin </span> </p> </a> <a class="w3-text" href="zeinali18b_odyssey.html"> <p> DeepMine Speech Processing Database: Text-Dependent and Independent Speaker Verification and Speech Recognition in Persian and English <br> <span class="w3-text w3-text-theme"> Hossein Zeinali, Hossein Sameti, Themos Stafylakis </span> </p> </a> <a class="w3-text" href="alam18b_odyssey.html"> <p> Boosting the Performance of Spoofing Detection Systems on Replay Attacks Using q-Logarithm Domain Feature Normalization <br> <span class="w3-text w3-text-theme"> Md Jahangir Alam, Gautam Bhattacharya, Patrick Kenny </span> </p> </a> </div> </div> <br> </div> </div>  <div class="w3-container" id="bypaper"> <div class="w3-content" style="max-width:1200px;margin-top:60px"> <div class="w3-container w3-card w3-padding w3-white"> <div class="w3-text w3-center"> 
<span class='w3-large'> <b>Search papers</b> </span> <button class='w3-text w3-button w3-right' onclick="document.getElementById('help_papers').style.display='block'"> <i class='icon-question-circle'></i> </button> </div> <table id="paper_table" class="display" style="width:95%"> <thead> <tr> <th width="100%">Article</th> <th width="0%"></th> <th width="0%"></th> <th width="0%"></th> </tr> </thead> </table> </div>  </div> </div> </div>  <div id="sessionchooser" class="w3-modal" > <div class="w3-modal-content w3-card-4 w3-greyscale w3-theme-d4 w3-padding w3-bordered" onclick="document.getElementById('sessionchooser').style.display='none'"> <span onclick="document.getElementById('sessionchooser').style.display='none'" class="w3-button w3-display-topright">×</span> <p><a class="w3-text" href="#Keynote: Els Kindt">Keynote: Els Kindt</a></p> <p><a class="w3-text" href="#Speaker Recognition I">Speaker Recognition I</a></p> <p><a class="w3-text" href="#Language Recognition">Language Recognition</a></p> <p><a class="w3-text" href="#Speaker diarization">Speaker diarization</a></p> <p><a class="w3-text" href="#Noise Robustness">Noise Robustness</a></p> <p><a class="w3-text" href="#Keynote: Simon King">Keynote: Simon King</a></p> <p><a class="w3-text" href="#Voice conversion">Voice conversion</a></p> <p><a class="w3-text" href="#Voice conversion and spoofing">Voice conversion and spoofing</a></p> <p><a class="w3-text" href="#Spoofing">Spoofing</a></p> <p><a class="w3-text" href="#Keynote: Pascal Belin">Keynote: Pascal Belin</a></p> <p><a class="w3-text" href="#Speaker recognition II">Speaker recognition II</a></p> <p><a class="w3-text" href="#Text-dependent speaker recognition">Text-dependent speaker recognition</a></p> </div> </div> <script> function myFunction() { var x = document.getElementById("smallnav"); if (x.className.indexOf("w3-show") == -1) { x.className += " w3-show"; } else { x.className = x.className.replace(" w3-show", ""); } } /* Get the modal */ var modal = 
document.getElementById('sessionchooser'); /* When the user clicks anywhere outside of the modal, close it */ window.onclick = function(event) { if (event.target == modal) { modal.style.display = "none"; } }; $(document).ready(function() { $('#paper_table').DataTable( { data: [['Els Kindt', 'Speaker identification and Data protection', 'kindt18_odyssey', 'envisaged regulation encompasses biometric paid new reach use ass relevant'], ['Simon King', 'Speaking naturally? It depends who is listening.', 'king18_odyssey', 'human machine countermeasure like adversarial image race anything putting arm'], ['Pascal Belin', 'A Vocal Brain: Cerebral Processing of Voice Information', 'belin18_odyssey', 'identity socially-relevant scared precious psychoacoustics neuroimaging wealth attractiveness routinely speaker-related'], ['Seyed Omid Sadjadi, Timothee Kheyrkhah, Audrey Tong, Craig Greenberg, Douglas Reynolds, Elliot Singer, Lisa Mason, Jaime Hernandez-Cordero', 'The 2017 NIST Language Recognition Evaluation\t', 'sadjadi18_odyssey', 'lre afv system log-likelihood performance data top development similar performing'], ['Ziqiang Shi, Mengjiao Wang, Liu Liu, Huibin Lin, Rujie Liu', 'A Double Joint Bayesian Approach for J-Vector Based Text-dependent Speaker Verification\t', 'shi18_odyssey', 'j-vectors impostor explicitly eer multi-faceted short-duration model rsr heterogeneous wrong'], ['Xiaohai Tian, Junchao Wang, Haihua Xu, Eng-Siong Chng, Haizhou Li', 'Average Modeling Approach to Voice Conversion with Non-Parallel Data\t', 'tian18_odyssey', 'parallel target feature model require source-target vcc speaker linguistic present'], ['Yichiao Wu, Patrick Lumban Tobing, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda', 'The NU Non-Parallel Voice Conversion System for the Voice Conversion Challenge 2018', 'wu18_odyssey', 'tt vcc spoke around parallel collapsed similarity reference datasets nagoya'], ['Songxiang Liu, Lifa Sun, Xixin Wu, Xunying Liu, Helen Meng', 'The HCCL-CUHK System for 
the Voice Conversion Challenge 2018\t', 'liu18_odyssey', 'stftms ppgs vcc non-parallel slow synthesize griffin-lim naturalness dblstm similarity'], ['Fahimeh Bahmaninezhad, Chunlei Zhang, John Hansen', 'Convolutional Neural Network Based Speaker De-Identification', 'bahmaninezhad18_odyssey', 'voice average identity evaluation naturalness concealing subjective objective gender-dependent eers'], ['Fred Richardson, Pedro Torres-Carrasquillo, Jonas Borgstrom, Douglas Sturim, Youngjune Gwon, Jesus Villalba, Jan Trmal, Nanxin Chen, Reda Dehak, Najim Dehak', 'The MIT Lincoln Laboratory / JHU / EPITA-LSE LRE17 System\t', 'richardson18_odyssey', 'mitll sub-systems submission cavg nist backend condition x-vector augmented i-vector'], ['Andreas Nautsch, Sergey Isadskiy, Jascha Kolberg, Marta Gomez-Barrero, Christoph Busch', 'Homomorphic Encryption for Speaker Recognition: Protection of Biometric Templates and Vendor Model Parameters\t', 'nautsch18_odyssey', 'privacy service encrypted data latest operator architecture template i-vector employing'], ['Kentaro Sone, Shinji Takaki, Toru Nakashika', 'Bidirectional Voice Conversion Based on Joint Training Using Gaussian-Gaussian Deep Relational Model\t', 'sone18_odyssey', 'drm dnns dnn-based gmms utterance image gmm-based source classify target'], ['Md Hafizur Rahman, Ivan Himawan, David Dean, Clinton Fookes, Sridha Sridharan', 'Domain-invariant I-vector Feature Extraction for PLDA Speaker Verification\t', 'rahman18_odyssey', 'in-domain domain out-domain mismatch compensated i-vectors data target eer dicn'], ['Shihono Mochizuki, Sayaka Shiota, Hitoshi Kiya', 'Voice liveness detection using phoneme-based pop-noise detector for speaker verification\t', 'mochizuki18_odyssey', 'vld spoofing attack method replay conventional microphone speech replayed vulnerability'], ['Moez Ajili, Solange Rossato, Dan Zhang, Jean-François Bonastre', 'Impact of rhythm on forensic voice comparison reliability', 'ajili18_odyssey', 'fvc intra-speaker 
variability support rhythmic process aspr prosecution dna defence'], ['Weiwei Lin, Man-Wai Mak, Longxin Li, Jen-Tzung Chien', 'Reducing Domain Mismatch by Maximum Mean Discrepancy Based Autoencoders\t', 'lin18_odyssey', 'mmd multi-source dad domain-invariant plda degrade subset sub-domain adaptation training'], ['Weicheng Cai, Jinkun Chen, Ming Li', 'Exploring the Encoding Layer and Loss Function in End-to-End Speaker and Language Recognition System', 'cai18_odyssey', 'pooling variable-length get level utterance self-attentive aggregating open-set angular accepts'], ['Berrak Sisman, Grandee Lee, Haizhou Li', 'Phonetically Aware Exemplar-Based Prosody Transformation\t', 'sisman18_odyssey', 'phonetic framework ppgs dictionary exemplar activation depends conversion matrix phone-dependent'], ['Ruth Aloni-Lavi, Irit Opher, Itshak Lapidot', "Incremental On-Line Clustering of Speakers' Short Segments\t", 'alonilavi18_odyssey', 'cluster belong clustered new arrives knn existing arrive k-nearest constantly'], ['Patrick Lumban Tobing, Yichiao Wu, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda', 'NU Voice Conversion System for the Voice Conversion Challenge 2018\t', 'tobing18_odyssey', 'module parameter wavenet-based deep vcc vocoder utterance-wise nagoya speech speaker'], ['Giacomo Valenti, Adrien Daniel, Nicholas Evans', 'End-to-end automatic speaker verification with evolving recurrent neural networks\t', 'valenti18_odyssey', 'hand-crafted topology reliance asv report fixed impracticable investigation complexity evolves'], ['Chunlei Zhang, Shivesh Ranjan, John Hansen', 'An Analysis of Transfer Learning for Domain Mismatched Text-independent Speaker Verification\t', 'zhang18_odyssey', 'pre-trained out-of curriculum vanilla relative based network plda enrollment validated'], ['Jen-Tzung Chien, Kang-Ting Peng', 'Adversarial Learning and Augmentation for Speaker Recognition\t', 'chien18_odyssey', 'i-vectors fake minimized gan discriminator plda class regularization generator 
optimization'], ['Mitchell Mclaren, Diego Castán, Mahesh Kumar Nandwana, Luciana Ferrer, Emre Yilmaz', 'How to train your speaker embeddings extractor\t', 'mclaren18_odyssey', 'degraded recommendation network datasets fundamental good technology lay era wild'], ['Mitchell Mclaren, Mahesh Kumar Nandwana, Diego Castán, Luciana Ferrer', 'Approaches to Multi-domain Language Recognition', 'mclaren18b_odyssey', 'lid backend sri multi-resolution calibration lre team normalized involves embeddings'], ['Georgina Brown', 'Segmental Content Effects on Text-dependent Automatic Accent Recognition\t', 'brown18_odyssey', 'successful sample classification contribute hypothesise sociophonetic focussing specific uncover northern'], ['Suwon Shon, Ahmed Ali, James Glass', 'Convolutional Neural Network and Language Embeddings for End-to-End Dialect Recognition\t', 'shon18_odyssey', 'siamese feature fbank system dataset mfccs linguistic augmentation energy similarity'], ['David Snyder, Daniel Garcia-Romero, Alan McCree, Gregory Sell, Daniel Povey, Sanjeev Khudanpur', 'Spoken Language Recognition using X-vectors', 'snyder18_odyssey', 'fixed-dimensional post-evaluation framework aggregate x-vector pooling network excellent bottleneck i-vectors'], ['Bharat Padi, Shreyas Ramoji, Vaishnavi Yeruva, Satish Kumar, Sriram Ganapathy', 'The LEAP Language Recognition System for LRE 2017 Challenge - Improvements and Error Analysis\t', 'padi18_odyssey', 'evaluation post lid submission modeling i-vector dialect dnn effort deep'], ['Liang He, Xianhong Chen, Can Xu, Jia Liu', 'Latent Class Model for Single Channel Speaker Diarization\t', 'he18_odyssey', 'soft hard kenny premature multi-objective ahc segment agglomerative neighbor database'], ['Martin Karu, Tanel Alumäe', 'Weakly Supervised Training of Speaker Identification Models\t', 'karu18_odyssey', 'recording i-vectors fixed-dimensional surpassing provides dataset estonian method concentrate diarization'], ['Alicia Lozano-Diez, Oldrich Plchot, Pavel 
Matejka, Ondrej Novotny, Joaquin Gonzalez-Rodriguez', 'Analysis of DNN-based Embeddings for Language Recognition on the NIST LRE 2017\t', 'lozanodiez18_odyssey', 'embedding layer dnn lid i-vector add output configuration system whose'], ['Hossein Zeinali, Lukas Burget, Hossein Sameti, Honza Cernocky', 'Spoken Pass-Phrase Verification in the i-vector Space\t', 'zeinali18_odyssey', 'text-dependent phrase i-vectors scoring pass-phrases utterance liveness reddots bottle-neck stand-alone'], ['Jaime Lorenzo-Trueba, Junichi Yamagishi, Tomoki Toda, Daisuke Saito, Fernando Villavicencio, Tomi Kinnunen, Zhenhua Ling', 'The Voice Conversion Challenge 2018: Promoting Development of Parallel and Nonparallel Methods\t', 'lorenzotrueba18_odyssey', 'spoke submitted speaker identity non-parallel crowdsourced task state-of-the-art edition optional'], ['Kazuhiro Kobayashi, Tomoki Toda', 'sprocket: Open-Source Voice Conversion Software\t', 'kobayashi18_odyssey', 'vcc technique using developed system baseline gmm vocoder-free trajectory-based datasets'], ['Jaime Lorenzo-Trueba, Fuming Fang, Xin Wang, Isao Echizen, Junichi Yamagishi, Tomi Kinnunen', 'Can we steal your vocal identity from the Internet?: Initial investigation of cloning Obama’s voice using GAN, WaveNet and low-quality found data', 'lorenzotrueba18b_odyssey', 'spoofing asvspoof adversarial speech source publicly generative significantly conversion database'], ['Niko Brummer, Anna Silnova, Lukas Burget, Themos Stafylakis', 'Gaussian meta-embeddings for efficient scoring of a heavy-tailed PLDA model', 'brummer18_odyssey', 'gmes gplda propagate uncertainty product inner precision embeddings dot ivectors'], ['Xianhong Chen, Liang He, Can Xu, Yi Liu, Tianyu Liang, Jia Liu', 'VB-HMM Speaker Diarization with Enhanced and Refined Segment Representation\t', 'chen18_odyssey', 'fl scd neighbor emission probability inaccuracy refines information change bayes'], ['Ondřej Novotný, Oldřich Plchot, Pavel Matějka, Ladislav Mošner, Ondřej 
Glembek', 'On the use of X-vectors for Robust Speaker Recognition', 'novotny18_odyssey', 'system dnn data dnn-based domain diverse embeddings nist microphone challenging'], ['Joaquin Gonzalez-Rodriguez, Alvaro Escudero, Diego de Benito-Gorrón, Beltran Labrador, Javier Franco-Pedroso', 'An Audio Fingerprinting Approach to Replay Attack Detection on ASVSPOOF 2017 Challenge Data\t', 'gonzalezrodriguez18_odyssey', 'fingerprint-based reddots trial genuine file replayed access complementarity scenario original'], ['Giacomo Valenti, Héctor Delgado, Massimiliano Todisco, Nicholas Evans, Laurent Pilati', 'An end-to-end spoofing countermeasure for automatic speaker verification using evolving recurrent neural networks\t', 'valenti18b_odyssey', 'generalisation anti-spoofing detection attack considerably neat staple appeal fitness bona'], ['Ville Vestman, Tomi Kinnunen', 'Supervector Compression Strategies to Speed up I-Vector System Development\t', 'vestman18_odyssey', 'ppca fefa asv supervised supervectors i-vectors analysis probabilistic gmm map-adapted'], ['Jose Patino, Ruiqing Yin, Héctor Delgado, Hervé Bredin, Alain Komaty, Guillaume Wisniewski, Claude Barras, Nicholas Evans, Sébastien Marcel', 'Low-latency speaker spotting with online diarization and detection', 'patino18_odyssey', 'llss latency solution i-vectors publicly embeddings fuel embrace excel metric'], ['Anna Silnova, Pavel Matejka, Ondrej Glembek, Oldrich Plchot, Ondrej Novotny, Frantisek Grezl, Petr Schwarz, Lukas Burget, Jan Cernocky', 'BUT/Phonexia Bottleneck Feature Extractor\t', 'silnova18_odyssey', 'nns language software trained network technically provided tied-state noting recognition'], ['Trung Ngo Trong, Ville Hautamaki, Kristiina Jokinen', 'Staircase Network: structural language identification via hierarchical attentive units', 'trong18_odyssey', 'meta-information family external classification encoding target level encapsulated label enforced'], ['Mireia Diez, Lukas Burget, Pavel Matejka', 
'Speaker Diarization based on Bayesian HMM with Eigenvoice Priors\t', 'diez18_odyssey', 'model represent preferably step distribution elegant address state jfa ivector'], ['Sergey Novoselov, Andrey Shulipa, Ivan Kremnev, Alexandr Kozlov, Vadim Shchemelinin', 'On deep speaker embeddings for text-independent speaker recognition\t', 'novoselov18_odyssey', 'extractor embedding softmax activation network metric similarity architecture classification angular'], ['Hossein Zeinali, Hossein Sameti, Themos Stafylakis', 'DeepMine Speech Processing Database: Text-Dependent and Independent Speaker Verification and Speech Recognition in Persian and English\t', 'zeinali18b_odyssey', 'text-prompted exploring text-independent large-scale conduct unique province biometrics appealing make'], ['Héctor Delgado, Massimiliano Todisco, Md Sahidullah, Nicholas Evans, Tomi Kinnunen, Kong Aik Lee, Junichi Yamagishi', 'ASVspoof 2017 Version 2.0: meta-data analysis and baseline enhancements\t', 'delgado18_odyssey', 'spoofing replay countermeasure attack backend asv assessment unpublished coefficient log-energy'], ['Tomi Kinnunen, Jaime Lorenzo-Trueba, Junichi Yamagishi, Tomoki Toda, Daisuke Saito, Fernando Villavicencio, Zhenhua Ling', 'A Spoofing Benchmark for the 2018 Voice Conversion Challenge: Leveraging from Spoofing Countermeasures for Speech Artifact Assessment\t', 'kinnunen18_odyssey', 'vcc cm quality eers serf human speaker imperfection therein mimicry'], ['Tomi Kinnunen, Kong Aik Lee, Hector Delgado, Nicholas Evans, Massimiliano Todisco, Md Sahidullah, Junichi Yamagishi, Douglas A. 
Reynolds', 't-DCF: a Detection Cost Function for the Tandem Assessment of Spoofing Countermeasures and Automatic Speaker Verification\t', 'kinnunen18b_odyssey', 'prior asv eer asvspoof metric attack cm edition anti-spoofing ranking'], ['Oldřich Plchot, Pavel Matějka, Ondřej Novotný, Sandro Cumani, Alicia Lozano-Diez, Josef Slavíček, Mireia Diez, František Grézl, Ondřej Glembek, Mounika Kamsali, Anna Silnova, Lukáš Burget, Lucas Ondel, Santosh Kesiraju, Johan Rohdin', 'Analysis of BUT-PT Submission for NIST LRE 2017', 'plchot18_odyssey', 'dnn system architecture fixed provided set classifier post-evaluation bottle-neck development'], ['Alan Mccree, David Snyder, Greg Sell, Daniel Garcia-Romero', 'Language Recognition for Telephone and Video Speech: The JHU HLTCOE Submission for NIST LRE17\t', 'mccree18_odyssey', 'newest dnn fusion data discriminatively-trained multidomain system good holding development'], ['Jesus Antonio Villalba Lopez, Niko Brummer, Najim Dehak', 'End-to-End versus Embedding Neural Networks for Language Recognition in Mismatched Conditions\t', 'lopez18_odyssey', 'posterior back-end embeddings x-vectors domain probabilistic network log-likelihoods frame system'], ['Rosa Gonzalez Hautamäki, Anssi Kanervisto, Ville Hautamaki, Tomi Kinnunen', 'Perceptual Evaluation of the Effectiveness of Voice Disguise by Age Modification', 'hautamaki18_odyssey', 'speaker intended asv stereotype chronological child elderly target listener male'], ['Md Jahangir Alam, Gautam Bhattacharya, Patrick Kenny', 'Speaker Verification in Mismatched Conditions with Frustratingly Easy Domain Adaptation\t', 'alam18_odyssey', 'embeddings eer x-vectors plda adapt domain-adaptation test strategy edition data'], ['Akihiro Kato, Tomi Kinnunen', 'A Regression Model of Recurrent Deep Neural Networks for Noise Robust Estimation of the Fundamental Frequency Contour of Speech', 'kato18_odyssey', 'problem pefac ptdb-tug gpe using noisex- pitch gross tracker contaminated'], ['Md Jahangir 
Alam, Gautam Bhattacharya, Patrick Kenny', 'Boosting the Performance of Spoofing Detection Systems on Replay Attacks Using q-Logarithm Domain Feature Normalization', 'alam18b_odyssey', 'q-log dft product log spectral equal power applying nonlinearity harmful']], stateSave: true, columnDefs: [ { targets: [0], className: 'dt-left', "mRender": function (data, type, full) { return '<a class="w3-text" href="' + full[2] + '.html' + '">' + full[1] + '<br><span class="w3-text w3-text-theme">' + full[0] + '</span></a>'; } }, { targets: [1, 2, 3], visible: false, }, ], "lengthMenu": [7, 10, 20, 50, 100, 200, 500], "pageLength": 50, "order": [[ 0, 'asc' ]], scrollY: '60vh', "dom": '<"top"l>rft<"bottom"ip><"clear">', "pagingType": "full_numbers", paging: true }); }); </script> </body> </html>
