Comparison of different algorithms for exploiting the hidden trends in data sources

Özsevim, Emrah

dc.contributor.advisor	Püskülcü, Halis
dc.contributor.author	Özsevim, Emrah
dc.date.accessioned	2021-05-08T08:08:10Z
dc.date.available	2021-05-08T08:08:10Z
dc.date.submitted	2003
dc.date.issued	2018-08-06
dc.identifier.uri	https://acikbilim.yok.gov.tr/handle/20.500.12812/642849
dc.description.abstract	Comparison of Algorithms Used for Revealing Hidden Relationships in Large Data Sets. ABSTRACT The growth of large-scale transactional, time-series and other kinds of databases has brought with it the development of many efficient algorithms that can cope with the intensive computation time of association rule mining. In this study, various algorithms such as Apriori, FP-tree and CHARM, used for revealing hidden trends such as frequent itemsets, frequent patterns and closed frequent itemsets, respectively, are discussed and their performances are evaluated. The performances of these algorithms were tested on different (synthetic and real) data sets and measured at various support threshold levels. The algorithms were compared in terms of data preparation, mining and total run-time performance and knowledge extraction capabilities. Apriori, the most basic algorithm of association rule mining, makes multiple passes over the database in order to find the set of frequent itemsets at each level. The FP-tree algorithm is a scalable algorithm that finds the important information regarding the complete sets of prefix paths, conditional pattern bases and frequent patterns by using an FP-tree based mining method that occupies little memory. CHARM is a brand-new algorithm that adds considerable improvements over existing association rule mining algorithms by proving that mining the set of closed frequent itemsets can be sufficient instead of mining the complete set of frequent itemsets. Based on our experimental results, we conclude that the Apriori algorithm performs well on sparse data sets. The FP-tree algorithm, although it finds fewer associations than the Apriori algorithm, is the only algorithm that makes mining possible on dense data sets even at low support levels. On the other hand, the CHARM algorithm was found suitable for extracting information about the set of closed frequent itemsets (a large part or all of the set of frequent itemsets) at low support levels on both sparse and dense data sets.
dc.description.abstract	ABSTRACT The growth of large-scale transactional databases, time-series databases and other kinds of databases has given rise to the development of several efficient algorithms that cope with the computationally expensive task of association rule mining. In this study, different algorithms (Apriori, FP-tree and CHARM) for exploiting hidden trends such as frequent itemsets, frequent patterns and closed frequent itemsets, respectively, were discussed and their performances were evaluated. The performances of the algorithms were measured at different support levels, and the algorithms were tested on different data sets (both synthetic and real). The algorithms were compared according to their data preparation performance, mining performance, run-time performance and knowledge extraction capabilities. The Apriori algorithm is the most prevalent algorithm of association rule mining; it makes multiple passes over the database, aiming at finding the set of frequent itemsets for each level. The FP-tree algorithm is a scalable algorithm that finds the crucial information regarding the complete set of prefix paths, conditional pattern bases and frequent patterns by using a compact FP-tree based mining method. CHARM is a novel algorithm that brings remarkable improvement over existing association rule mining algorithms by proving that mining the set of closed frequent itemsets is adequate, instead of mining the set of all frequent itemsets. Based on our experimental results, we conclude that the Apriori algorithm demonstrates good performance on sparse data sets. The FP-tree algorithm extracts fewer associations than Apriori; however, it is a completely feasible solution that facilitates mining dense data sets at low support levels. On the other hand, the CHARM algorithm is an appropriate algorithm for mining closed frequent itemsets (a substantial portion of frequent itemsets) on both sparse and dense data sets, even at low levels of support.	en_US
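The distinction the abstract draws between frequent itemsets and closed frequent itemsets can be illustrated with a small sketch. The code below is not taken from the thesis: it is a minimal level-wise Apriori written for illustration, the function names `apriori` and `closed_itemsets` and the toy transactions are assumptions, and the closed itemsets are obtained by filtering the full frequent set afterwards rather than by the CHARM algorithm itself.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Level-wise frequent itemset mining: one pass over the data per itemset size."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1 candidates: every single item that occurs in the data.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}  # itemset -> support count
    k = 1
    while candidates:
        # One pass over the database counts the support of each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-candidates, then prune any
        # candidate with an infrequent k-subset (the Apriori property).
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = {c for c in joined
                      if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent

def closed_itemsets(frequent):
    """Keep itemsets that have no proper superset with the same support.

    This is a post-hoc filter for illustration, not CHARM.
    """
    return {s: n for s, n in frequent.items()
            if not any(s < other and n == m for other, m in frequent.items())}

# Toy data set, invented for illustration only.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"a", "b", "c"},
]
frequent = apriori(transactions, min_support=2)
closed = closed_itemsets(frequent)
```

On this toy input every non-empty subset of {a, b, c} is frequent, while the closed set is smaller and still lets the support of each frequent itemset be recovered from its smallest closed superset. That is the sense in which mining closed itemsets can be adequate.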
dc.language	English
dc.language.iso	en
dc.rights	info:eu-repo/semantics/embargoedAccess
dc.rights	Attribution 4.0 United States	tr_TR
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/
dc.subject	Bilgisayar Mühendisliği Bilimleri-Bilgisayar ve Kontrol	tr_TR
dc.subject	Computer Engineering and Computer Science and Control	en_US
dc.title	Comparison of different algorithms for exploiting the hidden trends in data sources
dc.title.alternative	Büyük veri gruplarındaki gizli ilişkilerin ortaya çıkarılmasında kullanılan algoritmaların karşılaştırılması
dc.type	masterThesis
dc.date.updated	2018-08-06
dc.contributor.department	Other
dc.subject.ytm	Associate
dc.subject.ytm	Algorithms
dc.subject.ytm	Data mining
dc.identifier.yokid	139349
dc.publisher.institute	Mühendislik ve Fen Bilimleri Enstitüsü
dc.publisher.university	İZMİR YÜKSEK TEKNOLOJİ ENSTİTÜSÜ
dc.identifier.thesisid	134288
dc.description.pages	97
dc.publisher.discipline	Other

Files in this item

Name:: yokAcikBilim_139349.pdf
Size:: 5.434Mb
Format:: PDF
Description:: File_139349

This item appears in the following Collection(s)

THESES

Except where otherwise noted, this item's license is described as info:eu-repo/semantics/embargoedAccess