Speech emotion recognition using auditory models

Yüncü, Enes

dc.contributor.advisor	Hacıhabiboğlu, Hüseyin
dc.contributor.advisor	Bozşahin, Hüseyin Cem
dc.contributor.author	Yüncü, Enes
dc.date.accessioned	2020-12-10T09:14:10Z
dc.date.available	2020-12-10T09:14:10Z
dc.date.submitted	2013
dc.date.issued	2018-08-06
dc.identifier.uri	https://acikbilim.yok.gov.tr/handle/20.500.12812/225494
dc.description.abstract	With the emergence of computational technology, human-computer interaction (HCI) has moved beyond simple logical calculations. Affective computing aims to improve human-computer interaction by adapting it to the needs of the user. To this end, it seeks to detect emotions by capturing visual, auditory, tactile and other biometric signals. Emotions have a great effect on how humans gain experience, interact with the outside world and make decisions. They play a role in shaping human social relations and in making important life decisions. The perception of emotions is therefore essential for high-level interaction. Each emotion has unique properties that allow us to recognize it. The acoustic signal produced for the same utterance or sentence changes primarily because of biophysical changes (such as stress-induced constriction of the larynx). The relation between differences in the acoustic signal and emotions has made speech emotion recognition a widely studied topic within affective computing. Its main purpose is to detect the emotional state in a recorded utterance using an emotion recognition algorithm. The human auditory system is a non-linear and adaptive mechanism that involves frequency-dependent filtering and simultaneous masking. While emotional speech can be recorded with a high-quality microphone and analyzed with high-resolution signal processing techniques, a human listener can use only the information that the auditory system provides. This limited access to emotional cues also reduces subjective emotion recognition accuracy. A speech emotion recognition algorithm based on a model of the human auditory system was developed, and its accuracy was evaluated in this thesis. A human hearing model based on auditory filtering was used to process clean speech signals. 
Simple features were extracted from the resulting outputs and used to train binary classifiers for emotions in seven classes (anger, fear, happiness, sadness, disgust, boredom and neutral). The trained classifiers were then used to evaluate recognition performance. Three emotional speech databases, in German, English and Polish, were tested with the proposed method, and recognition rates of up to 82% were obtained. The results of a subjective recognition test prepared with the German database were found to be comparable to those of the automatic speech emotion recognition system. Keywords: emotions, acoustics, auditory models, auditory filter bank, support vector machine
dc.description.abstract	With the advent of computational technology, human-computer interaction (HCI) has gone beyond simple logical calculations. Affective computing aims to improve human-computer interaction at the level of mental states, allowing computers to adapt their responses to human needs. To do so, it aims to recognize emotions by capturing cues from visual, auditory, tactile and other biometric signals recorded from humans. Emotions play a crucial role in modulating how humans experience and interact with the outside world and have a large effect on human decision making. They are an essential part of human social relations and play a role in important life decisions. Detecting emotions is therefore crucial in high-level interactions. Each emotion has unique properties that allow us to recognize it. The acoustic signal generated for the same utterance or sentence changes primarily because of biophysical changes (such as stress-induced constriction of the larynx) triggered by emotions. This relation between acoustic cues and emotions has made speech emotion recognition one of the trending topics in affective computing. The main purpose of a speech emotion recognition algorithm is to detect the emotional state of a speaker from recorded speech signals. The human auditory system is a non-linear and adaptive mechanism that involves frequency-dependent filtering as well as temporal and simultaneous masking. While emotion can be manifested in acoustic signals recorded with a high-quality microphone and extracted with high-resolution signal processing techniques, a human listener has access only to the cues made available by the auditory system. This limited access to emotion cues also reduces subjective emotion recognition accuracy. A speech emotion recognition algorithm based on a model of the human auditory system is developed and its accuracy is evaluated in this thesis. 
A state-of-the-art human auditory filter bank model is used to process clean speech signals. Simple features are then extracted from the output signals and used to train binary classifiers for seven classes of emotion (anger, fear, happiness, sadness, disgust, boredom and neutral). The classifiers are then tested on a validation set to assess recognition performance. Three emotional speech databases, in German, English and Polish, are used to test the proposed method, and recognition rates as high as 82% are achieved. A subjective experiment using the German emotional speech database, carried out with non-German-speaking subjects, indicates that the performance of the proposed system is comparable to human emotion recognition. Keywords: emotions, acoustic, auditory model, auditory filterbank, support vector machine	en_US
dc.language	English
dc.language.iso	en
dc.rights	info:eu-repo/semantics/openAccess
dc.rights	Attribution 4.0 United States	tr_TR
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/
dc.subject	Bilgisayar Mühendisliği Bilimleri-Bilgisayar ve Kontrol	tr_TR
dc.subject	Computer Engineering and Computer Science and Control	en_US
dc.subject	Elektrik ve Elektronik Mühendisliği	tr_TR
dc.subject	Electrical and Electronics Engineering	en_US
dc.subject	Psikoloji	tr_TR
dc.subject	Psychology	en_US
dc.title	Speech emotion recognition using auditory models
dc.title.alternative	İşitsel modelleri kullanarak otomatik duygu konuşma tanıma
dc.type	masterThesis
dc.date.updated	2018-08-06
dc.contributor.department	Bilişsel Bilim Anabilim Dalı
dc.identifier.yokid	10018059
dc.publisher.institute	Enformatik Enstitüsü
dc.publisher.university	ORTA DOĞU TEKNİK ÜNİVERSİTESİ
dc.identifier.thesisid	343114
dc.description.pages	86
dc.publisher.discipline	Diğer
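
The abstract states that binary classifiers are trained for seven emotion classes, but it does not say how their outputs are combined into one decision. The sketch below illustrates one common option, a one-vs-rest scheme in which each class has its own scorer and the highest score wins. This is an assumption made for illustration, not the thesis's confirmed method. The names `linear_scorer`, `predict_emotion` and `toy_scorers`, along with the toy weights, are hypothetical; in practice each scorer would be a trained classifier (for example a support vector machine) applied to features extracted from the auditory filter bank outputs.

```python
from typing import Callable, Dict, Sequence

# The seven emotion classes listed in the abstract.
EMOTIONS = ("anger", "fear", "happiness", "sadness",
            "disgust", "boredom", "neutral")

# A scorer maps a feature vector to a real-valued confidence for one class.
Scorer = Callable[[Sequence[float]], float]


def linear_scorer(weights: Sequence[float], bias: float) -> Scorer:
    """Build a hypothetical linear decision function w.x + b for one class."""
    def score(features: Sequence[float]) -> float:
        return sum(w * x for w, x in zip(weights, features)) + bias
    return score


def predict_emotion(features: Sequence[float],
                    scorers: Dict[str, Scorer]) -> str:
    """One-vs-rest decision: return the label whose scorer is most confident."""
    return max(scorers, key=lambda label: scorers[label](features))


# Toy scorers (assumed, untrained): class k rewards feature k and
# penalises every other feature.
toy_scorers: Dict[str, Scorer] = {
    label: linear_scorer(
        [1.0 if i == k else -1.0 for i in range(len(EMOTIONS))], 0.0
    )
    for k, label in enumerate(EMOTIONS)
}

if __name__ == "__main__":
    example = [0.2, 0.9, 0.0, 0.0, 0.0, 0.0, 0.0]
    print(predict_emotion(example, toy_scorers))
```

With real data, the toy scorers would be replaced by the decision functions of the trained binary classifiers; the combination step itself stays the same.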

Files in this item

Name:: yokAcikBilim_10018059.pdf
Size:: 919.8Kb
Format:: PDF
Description:: File_10018059

This item appears in the following Collection(s)

TEZLER

Except where otherwise noted, this item's license is described as info:eu-repo/semantics/openAccess