Speech Enhancement

Prof. Robert M. Nickel
Bucknell University, Lewisburg


Prof. Dorothea Kolossa
Electrical and Computer Engineering

First VIP Visit to the RUB (July 10th until August 4th, 2017):

Prof. Nickel worked at the Cognitive Signal Processing Research Group of Prof. Kolossa from July 10th until August 4th, 2017 (5 weeks total). He had been a guest researcher at the RUB several times before (see the publication list below), but this trip was his first visit within the scope of the VIP Grant of the RUB Research School PLUS. During his time at the RUB he not only continued to work on new avenues for the estimation of uncertainty information from noisy speech features, but also worked on a new project in the area of language processing. The goal of this new work is the development of automatic and semi-automatic methodologies for textual authorship recognition and/or verification. Potential results are of great interest to the law enforcement community (in forensic text analysis, for example) and are relevant to data mining tasks on social media and/or online commerce platforms. The project requires machine learning techniques that bear similarities to the ones employed in certain types of uncertainty estimation. Within the project, Prof. Nickel serves as a co-supervisor of the Ph.D. candidate Benedikt Bönninghoff. Mr. Bönninghoff is involved in this work through an interdisciplinary research initiative that is part of the SecHuman NRW-Fortschrittskolleg funded by the North Rhine-Westphalian Ministry of Innovation, Science and Research (MIWF). Prof. Nickel met regularly with Mr. Bönninghoff to discuss the technical aspects of his current work and to explore possible future directions of his research. While Prof. Nickel was a guest at the RUB, he also participated in several meetings with the members of the Cognitive Signal Processing Research Group and attended one of the so-called “tandem-meetings” of the SecHuman project.
At the “tandem-meetings”, the project's principal investigators from various fields (including education, journalism, linguistics, media science, peace research, and social science) come together to discuss the current state and future direction of the work of two participating Ph.D. candidates. In SecHuman, a “tandem” consists of two Ph.D. candidates from different areas (e.g., linguistics and engineering) who work together on the same research project. Prof. Nickel’s continued involvement in all of these endeavors is greatly appreciated due to his experience and expertise. The collaboration will continue while Prof. Nickel is back home in the US. The next visit is planned for the summer of 2018.


Prof. Robert M. Nickel received a Dipl.-Ing. degree in electrical engineering from the RWTH Aachen, Germany, in 1994, and a Ph.D. in electrical engineering from the University of Michigan, Ann Arbor, Michigan, in 2001. During the 2001/2002 academic year he was an adjunct faculty member in the Department of Electrical Engineering and Computer Science at the University of Michigan. From 2002 until 2007 he was a faculty member at the Pennsylvania State University, University Park, Pennsylvania. Since the fall of 2007 he has been a faculty member of the Electrical and Computer Engineering Department at Bucknell University, Lewisburg, Pennsylvania. At Bucknell he was promoted to Associate Professor in 2013. During the 2010/2011 academic year he was a Marie Curie Incoming International Fellow at the Institute of Communication Acoustics, Ruhr-Universität Bochum, Germany. Prof. Nickel is the author or co-author of over 30 peer-reviewed scientific articles, mainly in the areas of speech enhancement, automatic speech recognition, speaker identification, and time-frequency analysis.


Our collaborative work focuses on the development of new algorithms for the estimation of speech feature uncertainties in the presence of acoustic background noise. Our studies in automatic speech recognition have shown that incorporating such uncertainty information can dramatically improve the performance of the respective systems. So far, most uncertainty estimation methods are based on modifications of well-developed techniques for speech enhancement. Speech enhancement and uncertainty estimation, however, are technically quite different in scope. By focusing specifically on uncertainty estimation, we aim to develop new signal models for further improvements.


Project Related Publications:

[1] “Inventory-Style Speech Enhancement with Uncertainty-of-Observation Techniques,” by R. M. Nickel, R. F. Astudillo, D. Kolossa, S. Zeiler, and R. Martin, Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Kyoto, Japan, March 25-30, 2012, pp. 4645-4648.


[2] “Inventory-Based Audio-Visual Speech Enhancement,” by D. Kolossa, R. M. Nickel, S. Zeiler, and R. Martin, Proceedings of the 13th Annual Conference of the International Speech Communication Association (INTERSPEECH), Portland, Oregon, September 9-13, 2012.


[3] “Corpus-Based Speech Enhancement with Uncertainty Modeling and Cepstral Smoothing,” by R. M. Nickel, R. F. Astudillo, D. Kolossa, and R. Martin, IEEE Transactions on Audio, Speech, and Language Processing, Vol. 21, No. 5, May 2013, pp. 983-997.


[4] “Robust Audiovisual Speech Recognition Using Noise-Adaptive Linear Discriminant Analysis,” by S. Zeiler, R. M. Nickel, N. Ma, G. J. Brown, and D. Kolossa, Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Shanghai, China, March 20-25, 2016, pp. 2797-2801.


[5] “Dynamic Stream Weighting for Turbo-Decoding-Based Audiovisual ASR,” by S. Gergen, S. Zeiler, A. H. Abdelaziz, R. M. Nickel, and D. Kolossa, Proceedings of the 17th Annual Conference of the International Speech Communication Association (INTERSPEECH), San Francisco, California, September 8-12, 2016, pp. 2135-2139.


[6] “Unsupervised Classification of Voiced Speech and Pitch Tracking Using Forward-Backward Kalman Filtering,” by B. T. Bönninghoff, R. M. Nickel, S. Zeiler, and D. Kolossa, Proceedings of the 12th Speech Communication Conference of the Information Technology Society of the Association of German Engineers (ITG-VDE Fachtagung Sprachkommunikation), Paderborn, Germany, October 5-7, 2016.


[7] “Improving Audio-Visual Speech Recognition Using Deep Neural Networks With Dynamic Stream Reliability Estimates,” by H. Meutzner, N. Ma, R. M. Nickel, C. Schymura, and D. Kolossa, Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), New Orleans, Louisiana, March 5-9, 2017.