Classification of beverages by quality using different algorithms

Er, Yeşim

Date

2017

Abstract

This thesis classifies beverages by quality using several algorithms. The data set was taken from the UCI (University of California, Irvine) Machine Learning Repository and consists of two separate sets of red and white wine samples of varying quality. First, the two sets were merged and the samples were classified as red or white. Next, quality classification was carried out separately on the red wine samples (6 quality levels) and the white wine samples (7 quality levels). Then, combined quality and colour classification was performed on the merged red and white samples, giving 13 classes. The classification algorithms used were Support Vector Machine, k-Nearest Neighbours and Random Forest. The classifications were repeated after applying Principal Component Analysis for dimensionality reduction, and filter-based (Information Gain, Gain Ratio) and wrapper-based methods for feature selection. Class imbalance in the data set was addressed with SMOTE (Synthetic Minority Over-sampling Technique), random over-sampling and under-sampling, and the same classifications were repeated. The performance measures used were precision, recall, F-measure and the receiver operating characteristic.
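As an illustration of three of the performance measures named in the abstract, the following is a minimal sketch of per-class precision, recall and F-measure for a multi-class problem. It is not taken from the thesis. The function names, the label format (for example "red-5") and the toy labels are assumptions made only for this example, and the macro average is one common way to summarise imbalanced classes, not necessarily the one the author used.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f_measure)} for every label seen."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted this class, but it was wrong
            fn[truth] += 1  # missed a sample of the true class

    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f_measure = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = (precision, recall, f_measure)
    return metrics


def macro_f_measure(metrics):
    """Unweighted mean of the per-class F-measures."""
    return sum(f for _, _, f in metrics.values()) / len(metrics)


# Hypothetical toy labels combining colour and quality (not thesis data).
y_true = ["red-5", "red-5", "red-6", "white-6", "white-6", "white-7"]
y_pred = ["red-5", "red-6", "red-6", "white-6", "white-7", "white-7"]

metrics = per_class_metrics(y_true, y_pred)
for label, (p, r, f) in metrics.items():
    print(f"{label}: precision={p:.2f} recall={r:.2f} F={f:.2f}")
print(f"macro F-measure: {macro_f_measure(metrics):.2f}")
```

Reporting the measures per class matters here because overall accuracy can look high on an imbalanced data set even when the rare quality levels are almost never predicted correctly.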

URI

https://acikbilim.yok.gov.tr/handle/20.500.12812/477582

Collections

TEZLER

Except where otherwise noted, this item's license is described as info:eu-repo/semantics/openAccess