Development of an algorithm for recognizing speech sounds

No. 3 (51), 2019

Pages: 19-29

Language: Uzbek

Abstract

This article presents one method for recognizing the speech sounds of the Uzbek language. In the first stage, spectrogram images are generated by processing the speech signals in the frequency domain. In the second stage, the resulting spectrogram images are preprocessed and parameterized, and the speech sounds are compared with one another by correlation analysis.
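The first stage can be illustrated with a short sketch. It assumes a plain short-time Fourier transform with a Hann window; the function name `spectrogram`, the frame length, the hop size, and the 8 kHz test tone are illustrative choices, not values taken from the article.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram via a short-time Fourier transform.

    frame_len and hop are assumed values, not the article's settings.
    Returns an array of shape (frequency bins, frames) in decibels.
    """
    signal = np.asarray(signal, dtype=float)
    if signal.size < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)
    n_frames = 1 + (signal.size - frame_len) // hop
    # Cut the 1-D signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # One spectrum per frame turns the 1-D signal into a 2-D image.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return 20.0 * np.log10(magnitude + 1e-10).T

# Synthetic stand-in for a speech recording: a 1 kHz tone sampled at 8 kHz.
fs = 8000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 1000 * t)
spec = spectrogram(tone)
peak_bin = int(np.argmax(spec[:, 0]))
peak_hz = peak_bin * fs / 256
```

With these settings each frequency bin spans `fs / frame_len` Hz, so the brightest row of the image marks the dominant frequency of each frame.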

This article outlines an algorithm for recognizing speech sounds and describes the stages of generating a speech spectrogram image. The spectrogram image is the richest representation of the parameters that characterize speech, and recognizing speech by processing it can achieve high accuracy. Many current speech recognition algorithms use the speech spectrogram. An important aspect of spectrogram-based recognition is the transition from a one-dimensional signal to a two-dimensional one; during this transition, the main parameters characterizing the speech are extracted with spectral transform methods. The fundamental (pitch) frequency serves as one of the basic parameters. Of the many effective approaches to speech recognition, the most common work by processing the spectrogram image. The article describes known approaches to recognizing individual speech sounds, the properties of the speech spectrogram, and the steps for producing the image. It covers modern speech recognition algorithms, preprocessing methods for speech signals, an algorithm for generating the speech spectrogram image, and the stages of spectrogram image processing (filtering, region extraction, correlation comparison). Any utterance (word or phrase) consists of small units, phonemes (letters or combinations of letters). With this in mind, the article focuses on the most frequently used phonemes of the Uzbek language, namely its six vowels.
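The filtering and correlation comparison stages can be sketched as follows. This is a minimal illustration under stated assumptions: the filter is a simple noise-floor clip, the comparison is a Pearson correlation coefficient between equal-sized images, and the helper names (`denoise`, `correlation`, `classify`, `ridge`), the labels "a" and "o", and the synthetic images are all hypothetical. The article's actual filter, region extraction, and parameters are not given in the abstract.

```python
import numpy as np

def denoise(spec_db, floor_db=40.0):
    """Crude filtering: clip values more than floor_db below the peak."""
    spec_db = np.asarray(spec_db, dtype=float)
    return np.maximum(spec_db, spec_db.max() - floor_db)

def correlation(a, b):
    """Pearson correlation coefficient between two equal-shaped images."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    if a.shape != b.shape:
        raise ValueError("images must have the same shape")
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def classify(spec_db, templates):
    """Pick the template label with the highest correlation score."""
    query = denoise(spec_db)
    scores = {label: correlation(query, denoise(t))
              for label, t in templates.items()}
    return max(scores, key=scores.get), scores

def ridge(row, shape=(64, 20)):
    """Synthetic spectrogram-like image: one bright band of five rows."""
    img = np.full(shape, -60.0)
    img[row - 2 : row + 3, :] = 0.0
    return img

# Hypothetical reference images for two vowel classes.
templates = {"a": ridge(20), "o": ridge(40)}
rng = np.random.default_rng(0)
query = ridge(20) + rng.normal(scale=3.0, size=(64, 20))
label, scores = classify(query, templates)
```

Because the coefficient is computed over mean-centered images, it is insensitive to overall loudness, but it does require the two images to have the same size, which is one reason a region extraction step would precede the comparison.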
