Improving speech recognition performance with machine learning using particle swarm optimization

Mahmood, Arzo Mohammed Mahmood

Date

2019

Abstract

Speech recognition is defined as the recognition of the human voice by a computer. It is a subjective phenomenon, and although many studies have been conducted in this field, many problems are still encountered. Various advances have been made, and different techniques are used for different purposes. In this thesis, Particle Swarm Optimization (PSO) is used together with three classification techniques: the k-nearest neighbors algorithm (KNN), Support Vector Machines (SVM) and Artificial Neural Networks (ANN). Linear Predictive Coding (LPC) is used to extract the features of the speech signal. PSO, a heuristic algorithm, is applied in the learning stage of the SVM, which is a key point in classification, in order to increase classification success.

The voices of different people of different ages were recorded with a good-quality microphone in a quiet, noise-free environment. The data set contains 12 speakers, each saying 5 words (Back, go, left, right and stop). The training set consists of 60 samples and the test set of 40 samples, and each word lasts 1 second. In the PSO-optimized speech recognition system, three classifiers (SVM, KNN and ANN) were used separately; in the comparison, the traditional classification is reported to have performed better than SVM.

Keywords: Speech Recognition, LPC, PSO, Sound, SVM, ANN
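The abstract does not say which quantities PSO adjusts during the SVM learning stage, so the following is only a minimal sketch of the general PSO update rule, not the thesis's implementation. The function `pso_minimize`, its parameter values (inertia 0.7, acceleration coefficients 1.5) and the objective `toy_error` are all assumptions made for illustration; `toy_error` is a hypothetical stand-in for a classifier's validation error as a function of two tunable parameters.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=200,
                 inertia=0.7, c_cognitive=1.5, c_social=1.5, seed=0):
    """Minimize `objective` over the box `bounds` = [(low, high), ...]."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    dim = len(bounds)

    # Random initial positions inside the box, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]

    # Personal bests and the swarm-wide (global) best seen so far.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Velocity = inertia + pull toward personal best
                #          + pull toward global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_cognitive * r1 * (pbest[i][d] - pos[i][d])
                             + c_social * r2 * (gbest[d] - pos[i][d]))
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_error(params):
    # Hypothetical objective with its minimum at (1, -2). In a real system
    # this would train a classifier with `params` and return its error.
    x, y = params
    return (x - 1.0) ** 2 + (y + 2.0) ** 2


best_params, best_error = pso_minimize(toy_error, [(-5.0, 5.0), (-5.0, 5.0)])
print(best_params, best_error)
```

In a setting like the one the abstract describes, `toy_error` would be replaced by a function that trains the classifier on the LPC features with the candidate parameters and returns the error on held-out samples. That pairing is an assumption here, since the abstract gives no further detail.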

URI

https://acikbilim.yok.gov.tr/handle/20.500.12812/458481

Collections

TEZLER (Theses)

Except where otherwise noted, this item's license is described as info:eu-repo/semantics/openAccess