Keyword spotting using hidden markov models

Duran, Şevket

View/Open

File_117857 (2.020Mb)

Date

2001

Author

Duran, Şevket

Metadata

Show full item record

Abstract

ÖZET SAKLI MARKOV MODELLERİ KULLANILARAK ANAHTAR KELİME YAKALAMA Anahtar kelime yakalama sisteminin amacı sürekli bir sesin içinde barınan küçük bir anahtar kelimeler gurubu ortaya çıkarmaktır. Bu sistemde önemli olan, kelime olmadığı halde hata verme oranını artırmaksızın olası en yüksek anahtar kelime bulma oranını elde etmektir. Bunun için sadece anahtar kelimeleri modelleme yapmak yeterli değildir. Anahtar kelimeleri, olmayanlardan ayırmak için, sözlük dışı kelimelerin modellemesi de gerekmektedir. Bu modelleme, yapısı ve türü itibarıyla tüm sistem performansı üzerinde büyük etkisi bulunan garbage modellemesi ile yapılmaktadır. Bu tezin konusu garbage modelleri olarak bağımsız içerikli sesbirimleri (monophone) incelemek ve sözlük dışı kelime dışlamaları için güvenilirlik oranlan bazında değişik kriterlerin performansım değerlendirmektir. Anahtar kelime yakalama ve telefon üzerinden yalıtılmış ses tanıma denemeleri için iki veritabanı oluşturuldu. Anahtar kelime bulma için en iyi performansı tek fazlı genel garbage modelleme ile birlikte tek-sesbirimsel modellerin kullanılması verdi. Güvenilirlik oranlan içinse süreleri ile ortalama sesbirim benzeşmelerinin birlikte kullanımı en iyi performansı gösterdi.

IV ABSTRACT KEYWORD SPOTTING USING HIDDEN MARKOV MODELS The aim of keyword spotting system is to detect a small set of keywords from a continuous speech. It is important to obtain the highest possible keyword detection rate without increasing the number of false insertions in this system. Modeling only keywords is not enough. To seperate keywords from non-keywords, models for out-of-vocabulary words are needed, too. Since the structure and type of garbage model has great effect on the entire system performance, out-of-vocabulary modeling is done by the use of garbage models. The subject of this MS thesis is to examine context independent phonemes as garbage models and evaluate the performance of different criteria as confidence measures for out-of-vocabulary word rejection. Two different databases are collected for keyword spotting and isolated word recognition experiments over telephone lines. For keyword spotting use of monophone models together with one-state general garbage model gives the best performance. Using average phoneme likelihoods with phoneme durations gives the best performance for confidence measures.

URI

https://acikbilim.yok.gov.tr/handle/20.500.12812/79226

Collections

TEZLER

Except where otherwise noted, this item's license is described as info:eu-repo/semantics/embargoedAccess