Speaker identification with cepstrum analysis and artificial neural network using graphical programming

Özhan, Orhan

dc.contributor.advisor	Pastacı, Halit
dc.contributor.author	Özhan, Orhan
dc.date.accessioned	2020-12-29T10:32:43Z
dc.date.available	2020-12-29T10:32:43Z
dc.date.submitted	1999
dc.date.issued	2018-08-06
dc.identifier.uri	https://acikbilim.yok.gov.tr/handle/20.500.12812/393089
dc.description.abstract	SUMMARY A single Self-Organizing Map (SOM) is widely used together with supervised training for data classification. In this work, individual SOMs were used for speaker identification without supervised training, with 98% accuracy. To realize text-independent speaker identification, individual SOMs were used in both training and identification. Twelve speakers belonging to different age, education and dialect groups were selected from the TIMIT speech database. No supervised training was used after the initial training. The fundamental period T0 and the second through twentieth cepstral coefficients obtained from voiced sounds were selected as the feature vector. Cepstral coefficients were obtained from 32 ms speech frames overlapping at 10 ms intervals. To determine the best contribution of glottal excitation to speaker identification, the fundamental period scale factor was varied during the experiments while the cepstral coefficients were left untouched. A large amount of pitch halving was encountered in female speakers. To avoid erroneous T0 values caused by pitch halving, a two-pass feature extraction method was designed. During the first pass, a new differential fundamental period extractor removed the half-pitch effect that occurs together with normal pitch. In the second pass, data considered physiologically impossible were removed by a statistical method. The glottal excitation contribution for best identification was found to be 0.02870. With this excitation, an identification score of 98.3% was obtained. Feature extraction and speaker identification were implemented using the LabVIEW graphical language. During research and development in a GUI environment, LabVIEW was found to be a more effective and valuable tool than text-based programming. Keywords: Text-independent speaker identification, SOM, graphical programming, pitch halving, glottal excitation, fundamental period extraction, cepstrum analysis
dc.description.abstract	ABSTRACT A single Self-Organizing Map (SOM) is commonly followed by supervised training for data classification purposes. In this work, individual SOMs were used for text-independent speaker identification without supervised training and reached 98% accuracy. Individual SOMs were trained and used to identify speakers independently of the speech being uttered. To realize text-independent speaker identification, twelve speakers belonging to different age, education and dialect groups were selected from the TIMIT speech database. No supervised training was used after initial SOM training. The scaled fundamental period T0 and the second through twentieth cepstral coefficients obtained from voiced speech were selected as the feature vector. Cepstral coefficients were obtained from 32 ms speech frames overlapping every 10 ms. Keeping the cepstral coefficients intact, the fundamental period scale was varied throughout the experiments to determine the best contribution of glottal excitation to speaker identification. Female speech was observed to exhibit frequent pitch halving. To avoid erroneous T0 values due to pitch halving, a two-pass feature extraction scheme was devised. During the first pass, a novel differential pitch extractor eliminated the effects of half-pitch sounds occurring with normal-pitch sounds. The second pass statistically eliminated outliers that are physiologically implausible. The glottal contribution for best identification was found to be 0.02870, for which an identification score of 98.3% was attained. Feature extraction and speaker identification were implemented using the LabVIEW graphical language. During research and development on graphical user interface platforms, LabVIEW proved more efficient and valuable than text-based programming. Keywords: Text-independent speaker identification, SOM, graphical programming, pitch halving, glottal excitation, pitch extraction, cepstrum analysis	en_US
dc.language	Turkish
dc.language.iso	tr
dc.rights	info:eu-repo/semantics/embargoedAccess
dc.rights	Attribution 4.0 United States	tr_TR
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/
dc.subject	Elektrik ve Elektronik Mühendisliği	tr_TR
dc.subject	Electrical and Electronics Engineering	en_US
dc.title	Grafik programlama kullanarak kepstrum analizi ve yapay sinir ağı ile konuşmacı tanıma
dc.title.alternative	Speaker identification with cepstrum analysis and artificial neural network using graphical programming
dc.type	doctoralThesis
dc.date.updated	2018-08-06
dc.contributor.department	Other
dc.subject.ytm	Graphic programming
dc.subject.ytm	Speech recognition
dc.subject.ytm	Cepstrum analysis
dc.subject.ytm	Artificial neural networks
dc.identifier.yokid	85071
dc.publisher.institute	Fen Bilimleri Enstitüsü
dc.publisher.university	YILDIZ TEKNİK ÜNİVERSİTESİ
dc.identifier.thesisid	85071
dc.description.pages	120
dc.publisher.discipline	Other
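
The feature vector described in the abstract (a scaled fundamental period T0 plus the second through twentieth cepstral coefficients, taken from 32 ms frames spaced 10 ms apart) can be illustrated with a minimal Python sketch. This is not the thesis's LabVIEW implementation. The function names, the Hamming window, the 70 to 400 Hz pitch search range, the plain cepstral-peak T0 estimate (used here in place of the thesis's differential pitch extractor), the zero-based coefficient indexing and the `t0_scale` parameter with its unit of seconds are all assumptions made for illustration.

```python
import numpy as np


def frame_signal(x, fs, frame_ms=32.0, hop_ms=10.0):
    """Split x into overlapping frames: 32 ms long, one every 10 ms."""
    n = int(round(fs * frame_ms / 1000.0))
    hop = int(round(fs * hop_ms / 1000.0))
    count = max(1 + (len(x) - n) // hop, 0)
    # Row k holds samples k*hop .. k*hop + n - 1.
    idx = np.arange(n)[None, :] + hop * np.arange(count)[:, None]
    return x[idx]


def real_cepstrum(frame):
    """Real cepstrum: inverse FFT of the log magnitude spectrum."""
    windowed = frame * np.hamming(len(frame))  # window choice is an assumption
    magnitude = np.abs(np.fft.fft(windowed))
    return np.fft.ifft(np.log(magnitude + 1e-12)).real


def feature_vector(frame, fs, t0_scale=1.0, f_min=70.0, f_max=400.0):
    """Return [t0_scale * T0, c2 .. c20] for one voiced frame.

    T0 (in seconds) is taken from the largest cepstral peak inside an
    assumed pitch range; the thesis uses its own two-pass extractor.
    """
    c = real_cepstrum(frame)
    lo = int(fs / f_max)
    hi = min(int(fs / f_min), len(frame) // 2)
    t0 = (lo + int(np.argmax(c[lo:hi]))) / fs
    # c[0] is counted as the first coefficient, so c[1:20] is 2nd..20th.
    return np.concatenate(([t0_scale * t0], c[1:20]))
```

Each resulting 20-element vector would then be presented to a per-speaker SOM. The abstract reports 0.02870 as the best glottal contribution, but the unit of T0 behind that figure is not stated, so `t0_scale` is left as a free parameter here.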

Files in this item

Name:: yokAcikBilim_85071.pdf
Size:: 24.81Mb
Format:: PDF
Description:: File_85071

This item appears in the following Collection(s)

THESES

Except where otherwise noted, this item's license is described as info:eu-repo/semantics/embargoedAccess