Bird call detection using deep learning

Yüksel, Cihan

dc.contributor.advisor	Maşazade, Engin
dc.contributor.author	Yüksel, Cihan
dc.date.accessioned	2020-12-29T06:42:15Z
dc.date.available	2020-12-29T06:42:15Z
dc.date.submitted	2020
dc.date.issued	2020-08-27
dc.identifier.uri	https://acikbilim.yok.gov.tr/handle/20.500.12812/337940
dc.description.abstract	In this thesis, we compare different deep learning methods for bird sound detection. For this purpose, our audio data sets, which contain recordings from multiple fields, are converted with digital signal processing methods into features: mel spectrogram images, mel frequency cepstral coefficients (MFCC), or gammatone frequency cepstral coefficients (GTCC). Our convolutional neural network (CNN) is built from several layer types: input, convolution, normalization, activation, pooling, fully connected, and classification layers. The grayscale mel spectrogram images are used to train the CNN under different parameter settings, such as layer sizes, number of layers, input sizes, and training options. The extracted GTCC and MFCC are used as features for recurrent neural network (RNN) based bidirectional and unidirectional long short-term memory (LSTM) networks. Both MFCC and GTCC are also used as input to a simple neural network. For both LSTM networks, we compare different numbers of LSTMs. Detection accuracy is evaluated for all methods and parameter settings using the area under the curve (AUC) of the receiver operating characteristic.	en_US
dc.language	English
dc.language.iso	en
dc.rights	info:eu-repo/semantics/embargoedAccess
dc.rights	Attribution 4.0 United States	tr_TR
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/
dc.subject	Elektrik ve Elektronik Mühendisliği	tr_TR
dc.subject	Electrical and Electronics Engineering	en_US
dc.title	Bird call detection using deep learning
dc.title.alternative	Derin öğrenmeyi kullanarak kuş ötüşü tespiti
dc.type	masterThesis
dc.date.updated	2020-08-27
dc.contributor.department	Department of Electrical and Electronics Engineering
dc.identifier.yokid	10308049
dc.publisher.institute	Graduate School of Natural and Applied Sciences
dc.publisher.university	YEDİTEPE ÜNİVERSİTESİ
dc.identifier.thesisid	632487
dc.description.pages	90
dc.publisher.discipline	Electrical and Electronics Engineering Division

Files in this item

Name: yokAcikBilim_10308049.pdf
Size: 1.589Mb
Format: PDF
Description: File_10308049

This item appears in the following Collection(s)

TEZLER

Except where otherwise noted, this item's license is described as info:eu-repo/semantics/embargoedAccess