Journal of Engineering and Applied Sciences

ISSN: Online 1818-7803
ISSN: Print 1816-949x

Unsupervised Speaker Retrieval and Identification in Large Scale Environment

Rami Ammar, Assef Jaffar and Kadan Aljoumaa
Page: 2457-2463 | Received 21 Sep 2022, Published online: 21 Sep 2022

Abstract

The identity vector is one of the state-of-the-art techniques for building speaker identification and retrieval systems, which are used in many crucial applications. Recently, mainly because audio content has become easy to acquire, the ability to analyze unlabeled datasets has become a vital advantage. Our contribution is to enhance the identity vector approach by initializing the universal background model (UBM) with k-means++ instead of a random initial state, since random initialization may lead to a local minimum. This enhancement increased the accuracy of the system and decreased the number of epochs needed, and thus the training time. In addition, we present a study of the effect of changing the voice information extraction and UBM parameters, and we improve the performance of the system by reducing the dimensionality of the identity vectors with a deep autoencoder. Finally, we extended the well-known "SideKit" toolkit to work on large datasets in batches. We used "VoxCeleb1", a large, free and well-known dataset that was obtained under different conditions and recorded in real-world settings.
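As an illustration of the k-means++ seeding step mentioned above, the following is a minimal sketch in plain Python. It is not the authors' implementation and does not use the SideKit API. The function name `kmeans_pp_seeds`, the toy 2-D points (stand-ins for acoustic feature frames) and the choice of three centers are all assumptions made for the example. In a full system, the chosen centers would presumably be refined and then used as the initial means of the UBM mixture components.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_seeds(points, k, rng):
    """Pick k initial centers from `points` using k-means++ seeding.

    Hypothetical helper for illustration only. The first center is drawn
    uniformly; each later center is drawn with probability proportional to
    its squared distance from the nearest center chosen so far.
    """
    centers = [rng.choice(points)]
    # d2[i] is the squared distance from points[i] to its nearest center.
    d2 = [squared_distance(p, centers[0]) for p in points]
    while len(centers) < k:
        if sum(d2) == 0:
            # Every point coincides with a center: fall back to uniform choice.
            centers.append(rng.choice(points))
        else:
            idx = rng.choices(range(len(points)), weights=d2, k=1)[0]
            centers.append(points[idx])
        # Update nearest-center distances with the newly added center.
        d2 = [min(old, squared_distance(p, centers[-1]))
              for old, p in zip(d2, points)]
    return centers


# Toy data: three loose groups of 2-D points (assumed, not from the paper).
points = [
    (0.0, 0.1), (0.2, 0.0), (0.1, 0.3),
    (5.0, 5.1), (5.2, 4.9), (4.8, 5.3),
    (10.0, 0.2), (9.8, 0.0), (10.1, -0.3),
]
seeds = kmeans_pp_seeds(points, 3, random.Random(0))
print(seeds)
```

Because far-away points are more likely to be picked, the seeds tend to be spread across the data rather than clustered together, which is the property that makes this initialization less prone to poor local minima than a purely random start.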


How to cite this article:

Rami Ammar, Assef Jaffar and Kadan Aljoumaa. Unsupervised Speaker Retrieval and Identification in Large Scale Environment.
DOI: https://doi.org/10.36478/jeasci.2020.2457.2463
URL: https://www.makhillpublications.co/view-article/1816-949x/jeasci.2020.2457.2463