Genetik algoritma ve K-en yakın komşu kullanarak metin belgelerinin sınıflandırılması

Laribi, Parisa

dc.contributor.advisor	Saraçoğlu, Rıdvan
dc.contributor.author	Laribi, Parisa
dc.date.accessioned	2020-12-10T11:15:55Z
dc.date.available	2020-12-10T11:15:55Z
dc.date.submitted	2018
dc.date.issued	2018-11-29
dc.identifier.uri	https://acikbilim.yok.gov.tr/handle/20.500.12812/257906
dc.description.abstract	Metin Madenciliği büyük miktardaki metinsel verilerden, önceden bilinmeyen bilgilerin elde edilmesini amaçlayan veri madenciliğinin bir dalıdır. Sınıflandırma, kümeleme ve tahmin, Metin Madenciliğinin önemli bir parçasıdır. Başarılı bir Metin Madenciliği yine başarılı bir sınıflandırma işlemine bağlıdır. Sınıflandırma sisteminin başarısını ve verimini artırmak için genellikle boyut azaltma işlemi gerçekleştirilir. Bu çalışmada metin belgelerinin sınıflandırılmasında boyut azaltma işlemi gerçekleştirilmiştir. Bunun için iki yöntem kullanılmıştır. Bunlardan ilki özellik çıkarımı, diğeri ise özellik seçimidir. Özellik çıkarımı için Temel Bileşen Analizi yöntemi kullanılmıştır. Özellik seçiminden sonra seçilen özellikler için katsayı ile ağırlıklandırma kullanılmıştır. Özellik seçimi aşaması için ve özellik çıkarımından sonra en iyi katsayıların seçimi için Genetik Algoritma kullanılmıştır. Deneysel sonuçlara göre özellik seçimi sınıflandırma başarısını kısmen azaltmıştır. Özellik çıkarımı ve bu aşamadan sonra eklenen katsayı ağırlıklandırma işlemi sınıflandırma başarısını önemli ölçüde artırmıştır.
dc.description.abstract	Text Mining is a branch of data mining that aims to extract previously unknown information from large quantities of textual data. Classification, clustering, and estimation are important parts of Text Mining. Successful Text Mining in turn depends on a successful classification process. Dimension reduction is usually performed to improve the success and efficiency of a classification system. In this study, dimension reduction was applied to the classification of text documents using two methods: feature extraction and feature selection. Principal Component Analysis was used for feature extraction. After feature selection, the selected features were weighted with coefficients. A Genetic Algorithm was used for the feature selection phase and for selecting the best coefficients after feature extraction. According to the experimental results, feature selection partially reduced classification success, whereas feature extraction followed by coefficient weighting significantly increased it.	en_US
dc.language	Turkish
dc.language.iso	tr
dc.rights	info:eu-repo/semantics/openAccess
dc.rights	Attribution 4.0 United States	tr_TR
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/
dc.subject	Bilgisayar Mühendisliği Bilimleri-Bilgisayar ve Kontrol	tr_TR
dc.subject	Computer Engineering and Computer Science and Control	en_US
dc.title	Genetik algoritma ve K-en yakın komşu kullanarak metin belgelerinin sınıflandırılması
dc.title.alternative	Classification of text documents using genetic algorithm and K-nearest neighbors
dc.type	masterThesis
dc.date.updated	2018-11-29
dc.contributor.department	Elektrik-Elektronik Mühendisliği Anabilim Dalı
dc.subject.ytm	Genetic algorithms
dc.subject.ytm	Principal components analysis
dc.subject.ytm	Text categorization
dc.identifier.yokid	10210155
dc.publisher.institute	Fen Bilimleri Enstitüsü
dc.publisher.university	VAN YÜZÜNCÜ YIL ÜNİVERSİTESİ
dc.identifier.thesisid	520774
dc.description.pages	70
dc.publisher.discipline	Diğer

Files in this item

Name: yokAcikBilim_10210155.pdf
Size: 1.296Mb
Format: PDF
Description: File_10210155

This item appears in the following Collection(s)

TEZLER

Except where otherwise noted, this item's license is described as info:eu-repo/semantics/openAccess