Gregor Rozinaj - Institute of Telecommunications

Research Orientation

Speech recognition and analysis

Keywords:

Human-Computer Interaction, Speech recognition, Speech synthesis, Data mining, Microphone array, Mobile, Java

Description:

Over more than 7 years, the department has built a solid theoretical and practical background in DTW, HMM and NN techniques, as well as in the HTK and SPHINX systems. We have progressed from simple tasks, such as recognition with word models and a dictionary of several dozen words, to the currently supported vocabulary of several thousand words modeled by tied, context-dependent, speaker-independent phoneme models.
In our experiments and practical realizations we have used the MASPER or REFREC 0.96 training procedures (which rely on the HTK toolkit) to produce various kinds of models, both of speech units and of non-speech events. As speech databases we use the Slovak SPEECHDAT and MOBILDAT databases, trained and evaluated either separately or together, the latter producing hybrid models (fixed-line and mobile). We modified the standard training procedures, which improved the overall results. For the SPHINX system, both context-dependent and context-independent phoneme models were derived with the SphinxTrain procedure, modified for the Slovak language and the MOBILDAT database. With both systems we achieved accuracy similar to that reported by researchers at other well-known universities.
For practical applications we use the ATK software package and SPHINX version 3.5 or 4. The main achievement of our several years of effort was the successful construction of a recognition system capable of recognizing about 1300 words in real time. This application was subsequently incorporated into a more complex system that can serve as an information kiosk; currently 2 services are fully operational: train departure information and weather forecast. In addition, experiments with cross-language recognition were carried out, namely with Italian.
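The simplest of the tasks mentioned above, small-vocabulary recognition with whole-word models, can be illustrated with DTW template matching. The following is only a minimal sketch under stated assumptions, not the department's implementation: the function names `dtw_distance` and `recognize`, the Euclidean local distance, and the toy one-dimensional feature vectors are all hypothetical choices made for illustration.

```python
import math


def dtw_distance(a, b):
    """Accumulated dynamic time warping cost between two sequences of feature vectors.

    Assumption: Euclidean local distance and the basic step pattern
    (insertion, deletion, match) without any path constraints.
    """
    n, m = len(a), len(b)
    # cost[i][j] = best accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # stretch the template frame
                cost[i][j - 1],      # stretch the utterance frame
                cost[i - 1][j - 1],  # advance both sequences
            )
    return cost[n][m]


def recognize(utterance, templates):
    """Return the word whose stored template has the lowest DTW cost."""
    return min(templates, key=lambda word: dtw_distance(utterance, templates[word]))


# Toy example with hypothetical 1-dimensional "features".
templates = {
    "yes": [(0.0,), (1.0,), (2.0,)],
    "no": [(5.0,), (5.0,), (4.0,)],
}
utterance = [(0.0,), (0.0,), (1.0,), (2.0,)]  # a time-stretched "yes"
best_word = recognize(utterance, templates)
```

Template matching of this kind scales poorly with vocabulary size, which is one reason larger vocabularies are typically handled with phoneme-level HMMs, as described above.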
Besides speech recognition, we have successfully addressed other important analysis problems, in particular speech detection (we developed several VAD algorithms that outperform a number of well-known or experimental ones) and speaker identification. In the near future we intend to focus on the continuous speech recognition problem, which lies mostly in the application part of recognition. This would include statistical language modeling, improvements to computational efficiency, phoneme-based 2-layer recognition, etc. We would also like to increase the robustness and accuracy of the HMM models, mainly through modifications to HMM training, the incorporation of other models, and modifications to the speech feature extraction process. The hybrid HMM-NN approach is also very appealing.
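For readers unfamiliar with speech detection, the idea behind a VAD can be shown with a very simple energy-threshold baseline. This sketch is not one of the VAD algorithms developed at the department; the function names, the frame length of 160 samples, and the relative threshold of 0.1 are assumptions chosen only for illustration.

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame (a trailing partial frame is dropped)."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def energy_vad(samples, frame_len=160, threshold_ratio=0.1):
    """Mark a frame as speech when its energy exceeds a fraction of the peak frame energy.

    Assumption: a fixed relative threshold; practical detectors usually adapt
    the threshold to the noise level and smooth the decisions over time.
    """
    energies = frame_energies(samples, frame_len)
    if not energies:
        return []
    threshold = threshold_ratio * max(energies)
    return [energy > threshold for energy in energies]


# Toy signal: silence, a burst of "speech", then silence again.
signal = [0.0] * 160 + [0.5, -0.5] * 80 + [0.0] * 160
decisions = energy_vad(signal)
```

Such a baseline is easily fooled by background noise, which is the kind of weakness more elaborate VAD algorithms aim to overcome.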

Department of Telecommunications
