Load related feature engineering for query execution time prediction

Yenigün, Yalçin

dc.contributor.advisor	Özgövde, Bahri Atay
dc.contributor.author	Yenigün, Yalçin
dc.date.accessioned	2020-12-04T13:09:51Z
dc.date.available	2020-12-04T13:09:51Z
dc.date.submitted	2018
dc.date.issued	2018-12-04
dc.identifier.uri	https://acikbilim.yok.gov.tr/handle/20.500.12812/86890
dc.description.abstract	Predicting the execution time of queries is one of the hardest problems for relational databases, and accurate prediction is useful in many areas such as database administration, resource management, system performance monitoring, and query scheduling. Many query optimizers use cost-based models to predict query execution time, but the problem is more complex because the heterogeneity of database systems' hardware and software makes CPU and I/O costs very hard to measure. Relational database vendors are trying to develop self-driving database systems that automate management and performance. At this point, predicting how long database queries will take before they run is a key capability. Previous studies used synthetic data to predict query time. As a result, repeating machine learning experiments in different domains becomes almost impossible. In this paper, real-world data from a payment service provider under different loads is used, and a new feature set produced by aggregating database queries within time windows is presented. The presented feature set is compared with traditional query plan features, and the results are reported. The data was collected with a widely used data collection tool, so the machine learning experiments and the resulting models can be easily repeated in various domains.
dc.description.abstract	Prediction of query execution time is one of the most challenging issues for relational databases and is useful for database administration, resource management, system monitoring, and query scheduling. Most query optimizers use cost-based models for query execution time prediction, but the problem is more complex because the heterogeneity of database systems' hardware platforms and operating systems makes it more difficult to measure CPU and I/O costs. Relational database vendors are trying to implement autonomous databases that automate management and performance, so intelligent query execution time prediction is a key issue. Previous work mostly used synthetic data, so reproducing machine learning experiments is almost impossible for various domains. In this thesis, we use real-world data of a payment service provider with different workloads, propose new sets of features based on aggregating the database queries, and compare them with traditional query plan features. We collected data from a common machine data tool so that reproducing machine learning experiments and building models are easy for various domains.	en_US
dc.language	English
dc.language.iso	en
dc.rights	info:eu-repo/semantics/openAccess
dc.rights	Attribution 4.0 United States	tr_TR
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/
dc.subject	Bilgisayar Mühendisliği Bilimleri-Bilgisayar ve Kontrol	tr_TR
dc.subject	Computer Engineering and Computer Science and Control	en_US
dc.title	Load related feature engineering for query execution time prediction
dc.title.alternative	Sorguların çalışma süresinin tahmini için yükle ilişkili öznitelik mühendisliği
dc.type	masterThesis
dc.date.updated	2018-12-04
dc.contributor.department	Bilgisayar Mühendisliği Anabilim Dalı
dc.identifier.yokid	10205090
dc.publisher.institute	Fen Bilimleri Enstitüsü
dc.publisher.university	GALATASARAY ÜNİVERSİTESİ
dc.identifier.thesisid	521833
dc.description.pages	58
dc.publisher.discipline	Diğer

Files in this item

Name:: yokAcikBilim_10205090.pdf
Size:: 1.636Mb
Format:: PDF
Description:: File_10205090

This item appears in the following Collection(s)

TEZLER

Except where otherwise noted, this item's license is described as info:eu-repo/semantics/openAccess