Yapay sinir ağları tabanlı konuşmacı tanıma

İnal, Melih

dc.contributor.advisor	Bütün, Erhan
dc.contributor.author	İnal, Melih
dc.date.accessioned	2020-12-29T13:12:20Z
dc.date.available	2020-12-29T13:12:20Z
dc.date.submitted	2001
dc.date.issued	2018-08-06
dc.identifier.uri	https://acikbilim.yok.gov.tr/handle/20.500.12812/421202
dc.description.abstract	Keywords: Artificial Neural Networks, Supervised and Unsupervised Learning, Speaker Recognition, Text-Dependent Closed-Set Speaker Recognition, Text-Independent Open-Closed Set Speaker Recognition. Summary (translated from the Turkish abstract): This work implements several speaker recognition applications based on Artificial Neural Networks (ANN; Turkish YSA). The Multilayer Perceptron (MLP; Turkish ÇKA) and the Self-Organizing Map (SOM) are ANNs that use supervised and unsupervised learning, respectively. Both models serve as classifiers for speaker patterns. Feature extraction is an important stage of speaker recognition; here the feature vectors are extracted with several algorithms based on Linear Prediction Coding (LPC; Turkish DÖK), among which the cepstral coefficients method is the predominant one. The work covers two main areas: text-dependent closed-set speaker recognition with various MLP architectures, and text-independent open-closed set speaker recognition with SOM architectures. In the speaker identification applications, an Associative Memory Model (AMM; Turkish BBM) is intended as the decision unit at the output of the SOM networks. The first area uses a Turkish speaker set in which 10 speakers pronounce their own first and last names. Each utterance was repeated 8 times, with 5 repetitions used for training and 3 for testing. As the number of speakers and of spoken words grows, building and training an MLP classifier for each speaker takes a very long time, and the recognition performance of the system drops proportionally. A further disadvantage of the MLP classifier is that the optimum network architecture for a given problem has to be found by trial and error. In the second area, different SOM classifiers were trained and tested on the Turkish speaker set. Compared with the MLP model, SOM gave better results in every respect. 
The SOM architectures were then used, again as classifiers, on the TIMIT database. Compared with other studies that use the TIMIT database, the results obtained are as good as theirs.
dc.description.abstract	Keywords: Artificial Neural Networks, Supervised and Unsupervised Learning, Speaker Recognition, Text-Dependent Closed-Set Speaker Recognition, Text-Independent Open-Closed Set Speaker Recognition. Abstract: In this study, several speaker recognition applications based on Artificial Neural Networks (ANN) are implemented. The Multilayer Perceptron (MLP) and the Self-Organizing Map (SOM) are ANNs trained by supervised and unsupervised learning, respectively. MLP and SOM models are used as classifiers for speaker patterns. Feature extraction is an important stage in speaker recognition. In this study, various algorithms based on Linear Prediction Coding (LPC) are used to extract the feature vectors; of these, the cepstral coefficients method is the most satisfactory. The work falls into two main areas: first, text-dependent closed-set speaker recognition with various MLP architectures, and second, text-independent open-closed set speaker recognition with SOM architectures. For the speaker identification applications, an Associative Memory Model (AMM) is proposed as the decision unit at the SOM outputs. In the first area, a Turkish speaker set is used, consisting of 10 speakers uttering their first and last names. Each utterance is repeated 8 times; 5 repetitions are used for training and the remaining 3 for testing. As the number of words and speakers in the set increases, the MLP classifier takes too long to build and train, and the recognition rate drops proportionally. Another weakness of MLP recognizers is that the network architecture that is optimal for a specific problem must be found by trial and error. In the second area, different SOM architectures are used as classifiers for training and testing on the Turkish speaker set. Compared with MLP, SOM is found to be better in all aspects. SOM architectures are then used again as classifiers for the TIMIT database. 
Compared with other studies on the TIMIT database, our results are as good as theirs.	en_US
dc.language	Turkish
dc.language.iso	tr
dc.rights	info:eu-repo/semantics/openAccess
dc.rights	Attribution 4.0 United States	tr_TR
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/
dc.subject	Elektrik ve Elektronik Mühendisliği	tr_TR
dc.subject	Electrical and Electronics Engineering	en_US
dc.title	Yapay sinir ağları tabanlı konuşmacı tanıma
dc.title.alternative	Artificial neural networks based speaker recognition
dc.type	doctoralThesis
dc.date.updated	2018-08-06
dc.contributor.department	Other
dc.subject.ytm	Speaker recognition
dc.subject.ytm	Linear predictive coding
dc.subject.ytm	Artificial neural networks
dc.identifier.yokid	115381
dc.publisher.institute	Fen Bilimleri Enstitüsü
dc.publisher.university	KOCAELİ ÜNİVERSİTESİ
dc.identifier.thesisid	105929
dc.description.pages	102
dc.publisher.discipline	Other
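
The abstract names LPC-derived cepstral coefficients as the main feature set but does not say how they are computed. As an illustration only, the sketch below uses the standard textbook recursion that converts LPC predictor coefficients into cepstral coefficients. It is not taken from the thesis: the function name `lpc_to_cepstrum`, the sign convention for the predictor, and the omission of the gain term c_0 are all assumptions made for this example.

```python
def lpc_to_cepstrum(a, n_ceps):
    """Convert LPC predictor coefficients to cepstral coefficients c_1..c_n.

    Assumed convention (not from the thesis): s[n] ~ sum_k a[k-1] * s[n-k],
    i.e. the all-pole model H(z) = 1 / (1 - sum_k a_k z^-k).
    The gain-dependent term c_0 is omitted.
    """
    p = len(a)  # predictor order
    c = []
    for n in range(1, n_ceps + 1):
        # The direct term a_n exists only while n is within the predictor order.
        value = a[n - 1] if n <= p else 0.0
        # Recursive term: sum of (k/n) * c_k * a_{n-k}, keeping n-k in 1..p.
        for k in range(max(1, n - p), n):
            value += (k / n) * c[k - 1] * a[n - k - 1]
        c.append(value)
    return c


if __name__ == "__main__":
    # First-order predictor: the closed form is c_n = a^n / n.
    print(lpc_to_cepstrum([0.5], 3))
```

For a first-order predictor the recursion can be checked against the closed form c_n = a^n / n, which is why the usage example uses a single coefficient. In a recognizer of the kind the abstract describes, one such vector per speech frame would be passed to the MLP or SOM classifier.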

Files in this item

Name: yokAcikBilim_115381.pdf
Size: 4.752Mb
Format: PDF
Description: File_115381

This item appears in the following Collection(s)

TEZLER
