Low-Complexity Supervised Learning for Gesture and Shape Recognition

Çelebi, Sait

dc.contributor.advisor	Arıcı, Tarık
dc.contributor.author	Çelebi, Sait
dc.date.accessioned	2021-05-08T07:33:36Z
dc.date.available	2021-05-08T07:33:36Z
dc.date.submitted	2014
dc.date.issued	2018-08-06
dc.identifier.uri	https://acikbilim.yok.gov.tr/handle/20.500.12812/631678
dc.description.abstract	Classification is the machine learning task of categorizing given samples using their features. Gesture Recognition (GR) and Shape Recognition (SR) are two examples of classification. Hand Gesture Recognition (HGR) and Optical Character Recognition (OCR) are some everyday applications in these areas. HGR is a difficult classification problem, commonly used in human-computer interaction applications, that offers a natural interface between human and computer. Because the same hand gesture can be performed at different speeds, Dynamic Time Warping (DTW) is used to find the best alignment between two time sequences. A pre-processing mechanism is often required because of differences between the reference and test samples. Several pre-processing methods are needed for gesture recognition to work well regardless of such differences. DTW computes a dissimilarity measure by trying to align the current test sample with each reference sample in turn, across all of their parts. However, not all body parts carry equal weight when recognizing a hand gesture. In this work we propose weighting the body parts by optimizing a discriminant ratio. Finally, we compare our pre-processing and weighting methods with classical DTW and the state of the art. SR is another classification problem, with a growing number of applications ranging from OCR to pedestrian detection. Decision trees are a suitable choice of classifier for SR because they are easy to implement, can be visualized, and have low computational complexity. Classification quality improves when multiple weakly correlated decision trees are used together (a random forest). In this work we use random forest classifiers with rectangle features selected at random from the images. We test our method on character recognition and gesture recognition datasets.
The results show that this method achieves accuracy close to that of the best methods known to date. It also runs much faster than they do, which makes it suitable for real-time object and shape recognition applications. We discuss how surprisingly well simple descriptors such as random rectangles perform compared with complex statistical and structural descriptors. Finally, we analyze the parameters used in our system.
dc.description.abstract	Classification is a machine learning task in which the objective is to categorize given samples according to their attributes. Gesture Recognition (GR) and Shape Recognition (SR) are two examples of classification. Everyday applications include Hand Gesture Recognition (HGR) and Optical Character Recognition (OCR). GR is a challenging classification problem, often used in human-computer interaction applications to provide a natural interface between user and computer. Since the same gesture might be performed at different speeds, Dynamic Time Warping (DTW) is needed to find the optimal alignment between two time sequences. The sequences often require pre-processing to remove variations between the reference gestures and the test gestures. We discuss a set of pre-processing methods that make the gesture recognition mechanism robust to these variations. DTW computes a dissimilarity measure by time-warping the sequences on a per-sample basis, using the distance between the current reference and test sequences. However, not all body joints involved in a gesture are equally important in computing the distance between two sequence samples. We propose a weighted DTW method that weights joints by optimizing a discriminant ratio. SR is another classification problem, with an increasing number of applications ranging from OCR to pedestrian detection. A decision tree is a good choice of classifier for shape recognition because it is easy to implement and visualize and has low computational complexity. Bagging randomized decision trees as random forests increases accuracy if the trees are weakly correlated. We propose using random rectangles in combination with random forests and test our method on OCR and GR datasets.
We show that the accuracy of our method is similar to the OCR state of the art and better than the GR state of the art, while it executes significantly faster, which makes our proposed method a good fit for real-time object/shape recognition. We then discuss how a simple feature such as a random rectangle can perform similarly to the complex statistical and structural features designed for shape recognition. Finally, we analyze the effect of our parameters.	en_US
dc.language	English
dc.language.iso	en
dc.rights	info:eu-repo/semantics/openAccess
dc.rights	Attribution 4.0 United States	tr_TR
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/
dc.subject	Bilgisayar Mühendisliği Bilimleri-Bilgisayar ve Kontrol	tr_TR
dc.subject	Computer Engineering and Computer Science and Control	en_US
dc.title	Low-Complexity Supervised Learning for Gesture and Shape Recognition
dc.title.alternative	Hareket ve Şekil Tanıma için Az Karmaşıklıklı Gözetimli Öğrenme
dc.type	masterThesis
dc.date.updated	2018-08-06
dc.contributor.department	Elektronik ve Bilgisayar Mühendisliği Ana Bilim Dalı
dc.identifier.yokid	10032701
dc.publisher.institute	Fen Bilimleri Enstitüsü
dc.publisher.university	İSTANBUL ŞEHİR ÜNİVERSİTESİ
dc.identifier.thesisid	413242
dc.description.pages	53
dc.publisher.discipline	Diğer
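
The weighted DTW idea summarized in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the names `frame_distance` and `weighted_dtw`, the frame layout (one coordinate tuple per joint), and the hand-picked weights are all assumptions made for illustration. In particular, the thesis derives joint weights by optimizing a discriminant ratio, which this sketch does not attempt.

```python
import math
from typing import Sequence

Joint = Sequence[float]   # assumed layout: one coordinate tuple per joint
Frame = Sequence[Joint]   # one frame = positions of all tracked joints


def frame_distance(a: Frame, b: Frame, weights: Sequence[float]) -> float:
    """Weighted sum of per-joint Euclidean distances between two frames."""
    return sum(w * math.dist(p, q) for w, p, q in zip(weights, a, b))


def weighted_dtw(ref: Sequence[Frame], test: Sequence[Frame],
                 weights: Sequence[float]) -> float:
    """DTW alignment cost between two non-empty gesture sequences."""
    n, m = len(ref), len(test)
    inf = float("inf")
    # cost[i][j] = best cost of aligning ref[:i] with test[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(ref[i - 1], test[j - 1], weights)
            cost[i][j] = d + min(cost[i - 1][j],      # ref advances alone
                                 cost[i][j - 1],      # test advances alone
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


# Hypothetical data: joint 0 performs the gesture, joint 1 is irrelevant.
fast = [[(0, 0), (0, 0)], [(1, 0), (0, 0)], [(2, 0), (0, 0)]]
slow = [[(0, 0), (5, 5)], [(0, 0), (5, 5)], [(1, 0), (5, 5)],
        [(1, 0), (5, 5)], [(2, 0), (5, 5)]]

# Hand-picked weights (an assumption, not learned): ignore joint 1.
print(weighted_dtw(fast, slow, (1.0, 0.0)))
# Equal weights let the irrelevant joint inflate the dissimilarity.
print(weighted_dtw(fast, slow, (1.0, 1.0)))
```

With joint 1 weighted to zero, the slower performance aligns with the faster one at no cost, while equal weights yield a positive cost caused only by the irrelevant joint. That is the motivation the abstract gives for weighting joints unequally.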

Files in this item

Name:: yokAcikBilim_10032701.pdf
Size:: 6.222Mb
Format:: PDF
Description:: File_10032701

This item appears in the following Collection(s)

TEZLER

Except where otherwise noted, this item's license is described as info:eu-repo/semantics/openAccess