Optimization methods for large-scale distributed query processing on linked data

Oğuz, Damla

dc.contributor.advisor	Ergenç Bostanoğlu, Belgin
dc.contributor.advisor	Oğuz, Damla
dc.contributor.author	Oğuz, Damla
dc.date.accessioned	2020-12-29T08:42:26Z
dc.date.available	2020-12-29T08:42:26Z
dc.date.submitted	2017
dc.date.issued	2020-07-18
dc.identifier.uri	https://acikbilim.yok.gov.tr/handle/20.500.12812/369690
dc.description.abstract	Bağlı Veri sağlayıcılarının sayısı arttıkça, Veb büyük bir küresel veri uzayı haline gelmektedir. Sorgu federasyonu, bu dağıtılmış veri uzayını sorgulamak için kullanılan yaklaşımlardan biridir. Bu yaklaşımdaki sorgu eniyileme, yanıt süresini ve servis süresini en aza indirgemeyi hedeflemektedir. Yanıt süresi ilk sonuç kaydını oluşturmak için geçen zaman anlamına gelirken, servis süresi tüm sonuç kayıtlarını sağlamak için geçen zamana karşılık gelmektedir. Sorgu federasyonunda sorgu eniyileme ile ilgili çalışmaların çoğu, yürütmeden önce sorgu planları oluşturan ve istatistiklere ihtiyaç duyan statik sorgu eniyilemeye odaklanmaktadır. Bununla birlikte, Bağlı Veri ortamının öngörülemeyen veri geliş hızları ve güvenilmez istatistikler gibi çeşitli zorlukları bulunmaktadır. Sonuç olarak, statik sorgu eniyileme verimsiz yürütme planlarına neden olabilmektedir. Bu kısıtlamalar, Bağlı Veri üzerinde sorgu federasyonu için uyarlanabilir sorgu eniyileme kullanılması gerektiğini göstermektedir. Bu tezde ilk olarak SPARQL uç noktaları üzerinden gerçekleştirilen sorgu federasyonu için yanıt süresini ve servis süresini en aza indirmeyi hedefleyen bir uyarlanabilir birleştirme operatörü önerilmiştir. İkinci olarak, servis süresini daha da azaltmak amacıyla ilk öneri geliştirilmiştir. Her iki öneri de uyarlanabilir sorgu eniyileme kullanarak yürütme sırasında birleştirme yöntemini ve birleştirme sırasını değiştirebilmektedir. Önerilen operatörler, ilişkilerin farklı veri geliş hızlarıyla ve ilgili istatistiklerin eksiklikleriyle başa çıkabilmektedirler. Bu tezin performans değerlendirmesi, önerilen operatörlerin yanıt süresi ile servis süresi arasında en iyi dengeyi sağladığını göstermektedir. Temel amaç farklı veri geliş hızlarının üstesinden gelmek olsa da performans değerlendirmesi önerilerin hem sabit hem de farklı veri geliş hızlarında başarılı olduklarını ortaya koymaktadır.
dc.description.abstract	As the number of Linked Data providers increases, the Web is becoming a huge global data space. Query federation is one approach to querying this distributed data space. Query optimization in this approach aims to minimize both the response time and the completion time: response time is the time taken to produce the first result tuple, whereas completion time is the time taken to produce all result tuples. Most studies on query optimization in query federation focus on static query optimization, which generates query plans before execution and relies on statistics. However, the Linked Data environment poses several difficulties, such as unpredictable data arrival rates and unreliable statistics. As a consequence, static query optimization can produce inefficient execution plans. These constraints indicate that adaptive query optimization should be used for federated query processing on Linked Data. In this thesis, we first propose an adaptive join operator that aims to minimize the response time and the completion time of federated queries over SPARQL endpoints. Second, we extend our first proposal to further reduce the completion time. Both proposals use adaptive query optimization to change the join method and the join order during execution. The proposed operators can handle differing data arrival rates of relations and the lack of statistics about them. The performance evaluation shows that the proposed adaptive operators provide the best trade-off between response time and completion time. Although the main objective is to cope with different data arrival rates of relations, the performance evaluation reveals that the operators perform well under both fixed and different data arrival rates.	en_US
dc.language	English
dc.language.iso	en
dc.rights	info:eu-repo/semantics/openAccess
dc.rights	Attribution 4.0 United States	tr_TR
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/
dc.subject	Bilgisayar Mühendisliği Bilimleri-Bilgisayar ve Kontrol	tr_TR
dc.subject	Computer Engineering and Computer Science and Control	en_US
dc.title	Optimization methods for large-scale distributed query processing on linked data
dc.title.alternative	Büyük ölçekli dağıtık bağlı veri üzerinde sorgu işleme için eniyileme yöntemleri
dc.type	doctoralThesis
dc.date.updated	2020-07-18
dc.contributor.department	Bilgisayar Mühendisliği Anabilim Dalı
dc.identifier.yokid	10156912
dc.publisher.institute	Fen Bilimleri Enstitüsü
dc.publisher.university	EGE ÜNİVERSİTESİ
dc.identifier.thesisid	479670
dc.description.pages	118
dc.publisher.discipline	Diğer
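
The abstract distinguishes response time (time to the first result tuple) from completion time (time to all result tuples). The sketch below illustrates how the two metrics could be measured for a non-blocking join. It uses a textbook symmetric hash join as a stand-in, which is an assumption made here for illustration and is not the adaptive operator proposed in the thesis. The names `symmetric_hash_join` and `measure`, and the `(key, payload)` row shape, are hypothetical.

```python
import time
from typing import Dict, Iterable, Iterator, List, Optional, Tuple

Row = Tuple[str, str]  # hypothetical row shape: (join key, payload)


def symmetric_hash_join(
    left: Iterable[Row], right: Iterable[Row]
) -> Iterator[Tuple[str, str, str]]:
    """Textbook symmetric hash join (illustrative stand-in, not the thesis operator).

    Reads one row from each input in turn, stores it in that side's hash
    table, probes the other side's table, and yields matches immediately.
    """
    left_table: Dict[str, List[str]] = {}
    right_table: Dict[str, List[str]] = {}
    left_it, right_it = iter(left), iter(right)
    left_done = right_done = False

    while not (left_done and right_done):
        if not left_done:
            row = next(left_it, None)
            if row is None:
                left_done = True
            else:
                key, payload = row
                left_table.setdefault(key, []).append(payload)
                for other in right_table.get(key, ()):
                    yield (key, payload, other)
        if not right_done:
            row = next(right_it, None)
            if row is None:
                right_done = True
            else:
                key, payload = row
                right_table.setdefault(key, []).append(payload)
                for other in left_table.get(key, ()):
                    yield (key, other, payload)


def measure(results: Iterator) -> Tuple[Optional[float], float, int]:
    """Return (response time, completion time, tuple count) in seconds.

    Response time is None when the iterator yields no tuples.
    """
    start = time.perf_counter()
    response_time: Optional[float] = None
    count = 0
    for _ in results:
        if response_time is None:
            response_time = time.perf_counter() - start
        count += 1
    completion_time = time.perf_counter() - start
    return response_time, completion_time, count


if __name__ == "__main__":
    left_rows = [("a", "l1"), ("b", "l2")]
    right_rows = [("a", "r1"), ("a", "r2"), ("c", "r3")]
    response, completion, n = measure(symmetric_hash_join(left_rows, right_rows))
    print(f"tuples={n} response={response} completion={completion}")
```

Because the join yields matches as soon as they are found, the first tuple can be reported before either input is fully read, which is why response time and completion time can differ substantially when inputs arrive at different rates.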

Files in this item

Name: yokAcikBilim_10156912.pdf
Size: 5.111 MB
Format: PDF
Description: File_10156912

This item appears in the following Collection(s)

TEZLER

Except where otherwise noted, this item's license is described as info:eu-repo/semantics/openAccess