Parallel implementation of a VQ-based text-independent speaker identification

Soğanci, Ruhsar

Date

2004

Abstract

Automatic user identification is an indispensable part of today's computer-based applications. Passwords and keys are common solutions because they are cheap and easy to implement, but they can be forgotten, lost, or fall into the hands of unauthorized users. For this reason, attention has turned to biometric solutions, which rely on identifying and using characteristics specific to an individual. Biometric solutions cover a wide range, from iris scanning and fingerprint examination to DNA analysis and signature and keystroke scanning. Voice is a popular biometric because it can easily be captured and digitized with a microphone or a telephone. This study presents a text-independent speaker identification system, that is, one that identifies the speaker regardless of the words spoken. Mel-Frequency Cepstrum Coefficients (MFCC) are used for feature extraction, Linde-Buzo-Gray (LBG) vector quantization for modeling these features, and the Euclidean distance for measuring the similarity between two models. Comparing the meaningful characteristics of voice samples and matching similar ones requires a significant number of transformations and comparisons, so the speaker recognition process involves heavy memory usage and disk access. To distribute this load, which would otherwise fall on a single computer, a parallel text-independent speaker identification system was implemented on a cluster, and it produces results faster. When matching 100 speaker models with 16 processing elements, the parallel system achieves a speedup of about 13.8 over the serial implementation of the same method.
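The modeling and matching steps named in the abstract can be illustrated with a minimal sketch. It is not the thesis code: the function names, the additive splitting offset `eps`, the fixed iteration count, and the synthetic vectors standing in for MFCC features are all assumptions made for illustration. It trains one LBG codebook per speaker and identifies an unknown sample by the lowest average Euclidean distance to the nearest codeword.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def nearest(vec, codebook):
    """Index of the codeword closest to vec."""
    return min(range(len(codebook)), key=lambda i: sq_dist(vec, codebook[i]))


def lbg_codebook(vectors, size, eps=0.01, iters=20):
    """Train a codebook by LBG splitting (size is assumed to be a power of two)."""
    codebook = [centroid(vectors)]
    while len(codebook) < size:
        # Split every codeword into two slightly shifted copies.
        codebook = [[c + s for c in cw] for cw in codebook for s in (eps, -eps)]
        for _ in range(iters):
            # Assign each vector to its nearest codeword, then re-centre.
            cells = [[] for _ in codebook]
            for v in vectors:
                cells[nearest(v, codebook)].append(v)
            # An empty cell keeps its previous codeword.
            codebook = [centroid(cell) if cell else cw
                        for cell, cw in zip(cells, codebook)]
    return codebook


def distortion(vectors, codebook):
    """Average Euclidean distance from each vector to its nearest codeword."""
    total = sum(math.sqrt(min(sq_dist(v, cw) for cw in codebook)) for v in vectors)
    return total / len(vectors)


def identify(vectors, models):
    """Name of the speaker model with the lowest distortion for these vectors."""
    return min(models, key=lambda name: distortion(vectors, models[name]))


def synthetic_features(center, n, rng, dim=12):
    """Hypothetical stand-in for MFCC vectors: Gaussian noise around `center`."""
    return [[rng.gauss(center, 0.5) for _ in range(dim)] for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    models = {
        "speaker_a": lbg_codebook(synthetic_features(0.0, 200, rng), 4),
        "speaker_b": lbg_codebook(synthetic_features(5.0, 200, rng), 4),
    }
    print(identify(synthetic_features(5.0, 50, rng), models))
```

Each call to `distortion` depends only on the test vectors and one speaker's codebook, so the per-speaker comparisons are independent of one another. That independence is what makes this kind of matching a natural candidate for distribution across processing elements, although the sketch does not attempt to reproduce how the thesis partitions the work.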

URI

https://acikbilim.yok.gov.tr/handle/20.500.12812/77437

Collections

TEZLER

Except where otherwise noted, this item's license is described as info:eu-repo/semantics/embargoedAccess