Concept based semantic web mining

Özişik, Alper

dc.contributor.advisor	Karahoca, Adem
dc.contributor.author	Özişik, Alper
dc.date.accessioned	2021-05-01T07:15:56Z
dc.date.available	2021-05-01T07:15:56Z
dc.date.submitted	2008
dc.date.issued	2018-08-06
dc.identifier.uri	https://acikbilim.yok.gov.tr/handle/20.500.12812/550783
dc.description.abstract	Şimdiki web arama teknolojileri, benzer sayfaları içerikleri ve bağlantı yapıları ile bulma konusunda iyiler. Buna rağmen, benzer sayfaları sözlük kelime ve çapraz dil karşılıklarının alakadarlıklarını bulma konusunda iyi değiller. Bu tez benzer sayfaların bulunmasına bilinen yöntemlerin kombinasyonu ile yoğunlaşıyor. Bağlantı toplama, anlamsal tanımlayıcı veri algılanması web içerik ve yapısal madenciliği için gereklidir. Bu tez, diğer web madenciliği tekniklerinden sözlük anlamları ve çapraz dildeki anlamlarını da içererek ayrılıyor. Web robotları tarafından toplanan tüm bu veriler, web madenciliği için veri tabanında dizinlenir. Dizinlenmiş veri, içindeki anlamsız kelimelerden ve yanlış yönlendirici sitelerden, mesela reklam sitelerinden, arındırılır. Temiz veri kümeleme veri madenciliği için işlenir. Bu işleme sırasında, sayfa ilişkilerine sayfa bağlantı seviye bilgisi ve içeriklerindeki kelimelerin kesişim değerlerini eklenir. Web madenciliği işlemi için, kümeleme algoritmalarının K-means ve EM metotları, hangisi daha iyi sonuç verecek diye karşılaştırıldı. Seçilen metot, kullanıcının başta seçmiş olduğu sayfa ile benzer sayfaları listeledi.
dc.description.abstract	Current web search technologies are good at finding similar pages based on their content and link structures. However, they fall short when similarity depends on the dictionary meanings of words or on their cross-linguistic equivalents. This thesis focuses on finding similar pages on the web by combining known techniques. Link gathering and semantic web metadata parsing are required for web content and structure mining. This thesis differs from other web mining methods by also taking into account the dictionary meanings of words and their cross-linguistic meanings. All of this information is collected by web crawlers and indexed in a database for web mining. The indexed data is cleaned of non-useful words and misleading web sites, such as advertisement sites. The clean data is then processed for clustering. During this processing, page relations are enriched with link distance levels and the intersection values of the words in the pages' content. For the web mining process, the K-means and EM clustering methods were compared to decide which one gives better results. The chosen method lists the pages that are similar to the page the user selected at the start of the process.	en_US
dc.language	English
dc.language.iso	en
dc.rights	info:eu-repo/semantics/openAccess
dc.rights	Attribution 4.0 United States	tr_TR
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/
dc.subject	Bilgisayar Mühendisliği Bilimleri-Bilgisayar ve Kontrol	tr_TR
dc.subject	Computer Engineering and Computer Science and Control	en_US
dc.title	Concept based semantic web mining
dc.title.alternative	Kavrama dayalı anlamsal web madenciliği
dc.type	masterThesis
dc.date.updated	2018-08-06
dc.contributor.department	Bilgisayar Mühendisliği Ana Bilim Dalı
dc.subject.ytm	Data mining
dc.subject.ytm	Clustering
dc.subject.ytm	Multilingualism
dc.subject.ytm	Dictionary
dc.identifier.yokid	319156
dc.publisher.institute	Fen Bilimleri Enstitüsü
dc.publisher.university	BAHÇEŞEHİR ÜNİVERSİTESİ
dc.identifier.thesisid	216277
dc.description.pages	42
dc.publisher.discipline	Diğer
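
The abstract describes enriching each page relation with a link distance level and a word intersection value before clustering. The following is a minimal sketch of that step, not the thesis's actual implementation: the stop-word list, the function names, and the choice of Jaccard overlap as the intersection measure are all assumptions made for illustration. The tokenizer only handles ASCII letters.

```python
import re

# Hypothetical stop-word list; the record does not say which list the thesis used.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for"}


def tokenize(text):
    """Lowercase the text, extract ASCII words, and drop stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w for w in words if w not in STOP_WORDS}


def word_intersection_value(text_a, text_b):
    """Jaccard overlap of the two word sets (an assumed measure), in [0, 1]."""
    a, b = tokenize(text_a), tokenize(text_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def relation_features(seed_text, page_text, link_distance):
    """Feature pair for one page relation: (word overlap, link distance level)."""
    return (word_intersection_value(seed_text, page_text), link_distance)


seed = "Clustering of web pages for data mining"
candidate = "Data mining and clustering in the web"
features = relation_features(seed, candidate, link_distance=2)
print(features)
```

Feature pairs of this kind could then be passed to a clustering routine such as K-means or EM, the two methods the abstract compares.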

Files in this item

Name: yokAcikBilim_319156.pdf
Size: 1.387Mb
Format: PDF
Description: File_319156

This item appears in the following Collection(s)

TEZLER

Except where otherwise noted, this item's license is described as info:eu-repo/semantics/openAccess