Öğrencilerin PISA matematik başarılarının yordanmasında veri madenciliği yöntemlerinin karşılaştırılması

Koyuncu, İlhan

dc.contributor.advisor	Gelbal, Selahattin
dc.contributor.author	Koyuncu, İlhan
dc.date.accessioned	2020-12-29T13:51:09Z
dc.date.available	2020-12-29T13:51:09Z
dc.date.submitted	2018
dc.date.issued	2018-11-08
dc.identifier.uri	https://acikbilim.yok.gov.tr/handle/20.500.12812/434471
dc.description.abstract	The aim of this study is to examine, in terms of sample size and test-data ratio, the classification performance of Naive Bayes, nearest neighbor, artificial neural network, and logistic regression analyses in classifying students who participated in PISA (2012) by mathematics achievement on the basis of various characteristics. The population of the study is the 15-year-old students who took part in the PISA (2012) administration. The target population is the 62728 students from OECD countries who participated in the study and have no missing data on the relevant variables. From the target population, 180 files were created by sampling with replacement at sample sizes of 500 (100 datasets), 1000 (50 datasets), and 5000 (30 datasets) students. The performance of the methods was tested with 11%, 22%, 33%, 44%, and 55% of the data from each sample. The extent to which the data satisfy the assumptions of univariate and multivariate analyses was checked. For each dataset, 100 analyses were run, with the test data selected at random each time. Kappa agreement from the confusion matrix, the area under the ROC curve, and accuracy rates with their standard deviations were used as evaluation criteria, and the significance of differences was also tested statistically. According to the results, the classification performance of the methods increased with sample size, whereas increasing the test-data ratio affected the methods' performance in different ways. The Naive Bayes method performed well even in small samples, completed the analyses very quickly, and was not substantially affected by changes in the test-data ratio. Logistic regression analysis was the most effective method in large samples but performed poorly in small samples. Artificial neural networks showed a similar trend but overall performed worse than Naive Bayes and logistic regression. Under all conditions, the lowest performance was obtained with the nearest neighbor method. Overall, high accuracy values were obtained in classifying students by mathematics performance. The conclusions and recommendations section discusses the findings in detail and offers some recommendations for theory and practice.
dc.description.abstract	The purpose of this study is to examine the performance of the Naive Bayes, nearest neighbor, artificial neural network, and logistic regression methods, in terms of sample size and test-data ratio, in classifying students who participated in the PISA (2012) study according to their mathematics performance. The population is the 15-year-old students who participated in the PISA (2012) study. The target population is the 62728 students from OECD countries who participated in the study and have no missing data on the relevant variables. A total of 180 datasets were created by sampling with replacement from the target population at sample sizes of 500 (100 datasets), 1000 (50 datasets), and 5000 (30 datasets) students. The performance of each algorithm was tested using 11%, 22%, 33%, 44%, and 55% of each dataset as test data. The extent to which the data satisfy the assumptions of the univariate and multivariate analyses was checked. For each dataset, 100 analyses were performed, with the test sample randomly selected each time. Accuracy rates and their standard deviations, Kappa values, and the area under the ROC curve were used as evaluation criteria. For each dataset, differences among the methods' mean accuracy rates and their standard errors were tested statistically. According to the results, the classification performance of the methods increased as the sample size increased, whereas increasing the test-data ratio affected the methods' performance in different ways. The Naive Bayes method performed well even in small samples, completed the analyses very quickly, and was largely unaffected by changes in the test-data ratio. Logistic regression analysis was the most effective method in large samples but performed poorly in small samples. The neural network method showed a similar tendency, but its overall performance was lower than that of Naive Bayes and logistic regression. The lowest performance in all conditions was obtained with the nearest neighbor method. The conclusions and suggestions section discusses the findings in detail and offers some suggestions for theory and practice.	en_US
dc.language	Turkish
dc.language.iso	tr
dc.rights	info:eu-repo/semantics/embargoedAccess
dc.rights	Attribution 4.0 United States	tr_TR
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/
dc.subject	Eğitim ve Öğretim	tr_TR
dc.subject	Education and Training	en_US
dc.title	Öğrencilerin PISA matematik başarılarının yordanmasında veri madenciliği yöntemlerinin karşılaştırılması
dc.title.alternative	Comparison of data mining methods in predicting PISA mathematical achievements of students
dc.type	doctoralThesis
dc.date.updated	2018-11-08
dc.contributor.department	Eğitim Bilimleri Anabilim Dalı
dc.subject.ytm	Artificial neural networks
dc.subject.ytm	Logistic regression analysis
dc.subject.ytm	Mathematics teaching
dc.subject.ytm	Mathematics achievement
dc.subject.ytm	Mathematics
dc.subject.ytm	Programme for International Student Assessment
dc.subject.ytm	Performance
dc.subject.ytm	Student achievement
dc.identifier.yokid	10179868
dc.publisher.institute	Eğitim Bilimleri Enstitüsü
dc.publisher.university	HACETTEPE ÜNİVERSİTESİ
dc.identifier.thesisid	494325
dc.description.pages	163
dc.publisher.discipline	Eğitimde Ölçme ve Değerlendirme Bilim Dalı
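
The resampling design described in the abstract can be sketched in Python as follows. This is only an illustration under stated assumptions, not the procedure used in the thesis: the record does not name the software used, all function and constant names here are hypothetical, a shuffled holdout split is one plausible reading of "test data selected at random each time", and whether the 100 repetitions apply to each test-data ratio separately is not stated. Classifier fitting and the area under the ROC curve are left out; only the sampling, the split, accuracy, and Cohen's kappa are shown.

```python
import random
from collections import Counter

POPULATION_SIZE = 62728                    # target population size from the abstract
DESIGN = {500: 100, 1000: 50, 5000: 30}    # sample size -> number of datasets
TEST_RATIOS = (0.11, 0.22, 0.33, 0.44, 0.55)
N_REPEATS = 100                            # random test selections per dataset


def draw_datasets(rng):
    """Draw every dataset as row indices sampled with replacement."""
    return [
        rng.choices(range(POPULATION_SIZE), k=size)
        for size, count in DESIGN.items()
        for _ in range(count)
    ]


def random_split(rows, test_ratio, rng):
    """Assumed holdout scheme: shuffle a copy and cut off the test share."""
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]   # (train, test)


def accuracy(y_true, y_pred):
    """Share of cases whose predicted class equals the observed class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa, (p_o - p_e) / (1 - p_e); undefined when p_e equals 1."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_o - p_e) / (1 - p_e)
```

Under these assumptions, one cell of the design would call `random_split` N_REPEATS times on a dataset for a given entry of TEST_RATIOS, fit a classifier on the training part, and average `accuracy` and `cohen_kappa` over the repetitions.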

Files in this item

Name: yokAcikBilim_10179868.pdf
Size: 7.210Mb
Format: PDF
Description: File_10179868

This item appears in the following Collection(s)

TEZLER

Except where otherwise noted, this item's license is described as info:eu-repo/semantics/embargoedAccess