Keyword search by symbolic indexing

Sari, Leda

Date

2016

Author

Sari, Leda

Abstract

The aim of keyword search (KWS) is to locate written queries in large amounts of audio data, such as archived news broadcasts, audio or video lecture recordings, recorded customer call-center conversations, or conversational speech. State-of-the-art KWS systems are based on indexing automatic speech recognition (ASR) lattices. However, for languages with only a limited amount of transcribed audio, ASR performance degrades, which in turn reduces KWS performance. Another problem with ASR-based KWS systems is searching for out-of-vocabulary (OOV) keywords, which are not covered by the ASR vocabulary. A common approach is to expand the keyword with a confusion model (CM) and search for similar words along with the original. In this work, the KWS index is built from symbolic representations of the data instead of ASR lattices. These symbols are obtained by encoding the posteriorgram of the search data, which is generated from the deep neural network (DNN) output of the ASR system. Experiments on the low-resource language datasets of the IARPA Babel Program show that, when combined with an existing ASR lattice-based KWS system, the proposed system improves KWS performance as measured by term weighted value (TWV), especially for OOV queries. To handle OOV queries, a discriminative method for training the CM is also introduced that directly aims to maximize TWV on OOV queries. The influence of discriminative training is examined for both an existing ASR lattice-based system and the symbolic-index-based system under low-resource settings.
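For readers unfamiliar with the TWV metric named in the abstract, the following is a minimal sketch of how it is commonly computed in NIST-style keyword search evaluations: one minus the average, over keywords, of the miss probability plus a weighted false-alarm probability. The function name, the input layout, and the assumption that each hit has already been aligned against the reference and marked correct or not are illustrative choices, not details taken from the thesis. The weight 999.9 is the value conventionally used in those evaluations.

```python
from collections import defaultdict

BETA = 999.9  # conventional false-alarm weight in NIST-style KWS scoring


def term_weighted_value(hits, n_true, speech_seconds, threshold, beta=BETA):
    """Compute TWV at a given decision threshold.

    hits           -- iterable of (keyword, score, is_correct) tuples; the
                      alignment to the reference is assumed done upstream
    n_true         -- dict: keyword -> number of reference occurrences
    speech_seconds -- total duration of the searched audio, in seconds
    threshold      -- hits scoring below this value are discarded
    """
    n_corr = defaultdict(int)  # accepted hits that match the reference
    n_fa = defaultdict(int)    # accepted hits that are false alarms
    for keyword, score, is_correct in hits:
        if score >= threshold:
            if is_correct:
                n_corr[keyword] += 1
            else:
                n_fa[keyword] += 1

    total_cost = 0.0
    n_scored = 0
    for keyword, true_count in n_true.items():
        if true_count == 0:
            continue  # keywords that never occur have no defined miss rate
        p_miss = 1.0 - n_corr[keyword] / true_count
        # Non-target trials are approximated as one per second of speech.
        p_fa = n_fa[keyword] / (speech_seconds - true_count)
        total_cost += p_miss + beta * p_fa
        n_scored += 1

    if n_scored == 0:
        raise ValueError("no keyword has a reference occurrence")
    return 1.0 - total_cost / n_scored
```

Because the false-alarm term is scaled by such a large weight and the score is averaged per keyword rather than per occurrence, rare keywords, which include most OOV queries, affect TWV as much as frequent ones.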

URI

https://acikbilim.yok.gov.tr/handle/20.500.12812/73408

Collections

TEZLER

Except where otherwise noted, this item's license is described as info:eu-repo/semantics/openAccess