Journal of Engineering and Applied Sciences

Very Deep Convolutional Neural Network for Speech Recognition Based on Words

Javier O. Pinzon, Robinson Jimenez-Moreno, Oscar Aviles, Paola Nino and Diana Ovalle
Page: 6680-6685 | Received 21 Sep 2022, Published online: 21 Sep 2022

Abstract

This study presents the implementation of two very deep convolutional neural network architectures for speech recognition based on complete words, in this case 12 specific words, in order to evaluate their performance in two types of environments: one semicontrolled and one uncontrolled. One architecture uses linear filters in frequency only, while the other uses linear filters in both frequency and time. The power spectral density, together with its first and second derivatives, is proposed as the network input in order to broaden the variety of feature maps that can be used in neural networks for speech recognition. In the real-time tests, the architecture with filters in frequency and time reached an error rate of 16.67% in a semicontrolled environment, while the frequency-only architecture obtained 41.67%. This indicates that the architecture with the lower error rate performs better for word recognition, even with small databases specialized to a particular group of people.
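
As a rough illustration of the proposed input, the following sketch stacks a log power spectrum with its first and second derivatives as three input channels. It is not the authors' pipeline: the abstract does not specify the PSD estimator, the frame settings, or the axis along which the derivatives are taken, so the function names, the 16 kHz sampling rate, the 25 ms / 10 ms framing, the windowed-periodogram estimate, and the time-axis derivatives are all assumptions made for this example.

```python
import numpy as np

def power_spectrogram(signal, frame_len=400, hop=160, n_fft=512):
    """Log power spectrum per frame (windowed periodogram, an assumed PSD estimate).

    Returns an array of shape (n_freq_bins, n_frames).
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and apply the window to each.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    spectrum = np.fft.rfft(frames, n=n_fft, axis=1)
    # Normalise by window energy; the small constant avoids log(0).
    power = np.abs(spectrum) ** 2 / np.sum(window ** 2)
    return np.log(power + 1e-10).T

def stack_with_derivatives(features):
    """Stack features with first and second differences along time (assumed axis).

    Returns an array of shape (3, n_freq_bins, n_frames).
    """
    delta = np.gradient(features, axis=1)   # first derivative over frames
    delta2 = np.gradient(delta, axis=1)     # second derivative over frames
    return np.stack([features, delta, delta2], axis=0)

# Placeholder input: one second of noise standing in for a spoken word at 16 kHz.
rng = np.random.default_rng(0)
signal = rng.standard_normal(16000)

psd = power_spectrogram(signal)
network_input = stack_with_derivatives(psd)
print(network_input.shape)
```

Arranging the three maps as channels lets a two-dimensional convolutional layer treat them like the colour planes of an image, which is one plausible reading of how the derivatives add variety to the feature maps.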

How to cite this article:

Javier O. Pinzon, Robinson Jimenez-Moreno, Oscar Aviles, Paola Nino and Diana Ovalle. Very Deep Convolutional Neural Network for Speech Recognition Based on Words.
DOI: https://doi.org/10.36478/jeasci.2018.6680.6685
URL: https://www.makhillpublications.co/view-article/1816-949x/jeasci.2018.6680.6685