Improved microphone array design with statistical speaker identification methods

Demir, Kadir Erdem

Date

2016

Author

Demir, Kadir Erdem

Abstract

The gain of a microphone array can be increased by enlarging the array, but adding sensors to raise the gain is very costly. For this reason, even when the environment has enough space, raising the gain by increasing algorithmic complexity is preferred. In spectral array processing methods, knowing the positions of the target speaker and of the noise is a major advantage. Conventional methods try to solve this problem with non-statistical approaches. In addition, the performance of speaker identification methods drops in environments with a high noise level. In such environments, using microphone arrays improves the quality of the speech signal. For these reasons, microphone arrays and speaker identification methods complement each other. In this work, the microphone array system and the speaker identification system are designed as parts of a single system. The microphone array is used to increase the accuracy of the speaker identification system, while the results of the speaker identification system are used to increase the gain of the microphone array. For the speaker identification system, Fusion and N-Gram fundamental-frequency methods are proposed. To demonstrate the improved microphone array design, a simulation environment was developed in which speakers can be placed anywhere in the room. Experiments in the simulation environment showed that the proposed methods outperform conventional methods.

Conventional microphone array implementations aim to lock onto a source at a given location and, if required, track it. This is straightforward when the location or path of the source and of the interference is provided. It becomes a challenge to detect the intended source when multiple unknown sources exist in the same environment.

The performance of speaker identification degrades drastically when the speech signal is severely distorted by additive noise and reverberation. In such environments, microphone arrays are often utilized as a means of improving the quality of captured speech signals.

Both microphone arrays and speaker identification are mature fields. The advances of these two distinct fields can be combined into one system that maximizes gain on the intended speaker, which is the topic of this thesis. We utilize microphone array methods to improve the accuracy of speaker identification in a cocktail party environment. When the source and interferences are localized, the microphone array can be tuned to further reduce noise and increase the gain.

In this thesis we developed a robust simulation environment to demonstrate the proposed improved microphone array design with statistical speaker identification. This is an open source implementation in which users can assign speakers anywhere in the room. We proposed two features for speaker identification: a fusion-based feature and a computationally efficient N-Gram feature. We demonstrated that the proposed features and the algorithm that leverages the synergy of microphone array processing and speaker identification methods outperform conventional algorithms.
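As a rough illustration of what "locking onto a source at a given location" means for a conventional array, the sketch below implements a plain delay-and-sum beamformer in Python. It is not the thesis's algorithm or its simulation environment. The function names (`steering_delays`, `delay_and_sum`, `measured_noise_gain`), the 440 Hz test tone, the free-field propagation model without reverberation, the whole-sample delays, and the uncorrelated white sensor noise are all assumptions made for this example.

```python
import math
import random

SPEED_OF_SOUND = 343.0  # m/s; assumed room-temperature value


def steering_delays(mic_positions, source_position, sample_rate):
    """Arrival delay at each microphone, in whole samples, relative to the nearest one."""
    distances = [math.dist(mic, source_position) for mic in mic_positions]
    nearest = min(distances)
    return [round((d - nearest) / SPEED_OF_SOUND * sample_rate) for d in distances]


def delay_and_sum(channels, delays):
    """Advance each channel by its delay and average the time-aligned samples."""
    length = min(len(ch) - d for ch, d in zip(channels, delays))
    count = len(channels)
    return [
        sum(ch[n + d] for ch, d in zip(channels, delays)) / count
        for n in range(length)
    ]


def measured_noise_gain(mic_positions, source_position, sample_rate,
                        n_samples, noise_std, rng):
    """Ratio of single-microphone noise power to beamformer-output noise power."""
    delays = steering_delays(mic_positions, source_position, sample_rate)
    # Hypothetical test signal: a 440 Hz tone standing in for speech.
    clean = [math.sin(2.0 * math.pi * 440.0 * n / sample_rate)
             for n in range(n_samples)]
    # Each microphone hears a delayed copy plus independent white noise.
    channels = [
        [(clean[n - d] if n >= d else 0.0) + rng.gauss(0.0, noise_std)
         for n in range(n_samples)]
        for d in delays
    ]
    output = delay_and_sum(channels, delays)
    first = channels[0][delays[0]:]  # first microphone, aligned to the source
    noise_in = sum((first[n] - clean[n]) ** 2 for n in range(len(output)))
    noise_out = sum((output[n] - clean[n]) ** 2 for n in range(len(output)))
    return noise_in / noise_out


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical 8-element linear array with 5 cm spacing; source 2 m away.
    mics = [(0.05 * i, 0.0) for i in range(8)]
    gain = measured_noise_gain(mics, (1.0, 2.0), 16000, 4000, 0.5, rng)
    print(f"noise power reduction: {gain:.2f}x ({10 * math.log10(gain):.1f} dB)")
```

Under these idealized assumptions the noise power reduction comes out close to the number of microphones, which is why enlarging the array raises gain. The sketch also depends on knowing `source_position` in advance, the requirement the abstract identifies as the difficulty when several unknown sources share the room.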

URI

https://acikbilim.yok.gov.tr/handle/20.500.12812/93765

Collections

TEZLER

Except where otherwise noted, this item's license is described as info:eu-repo/semantics/openAccess