Akan veri işleyen dağıtık sistemlerde dinamik ölçekleme (Dynamic scaling in distributed data stream processing systems)

Kavi, Mert

dc.contributor.advisor	Orman, Zeynep
dc.contributor.author	Kavi, Mert
dc.date.accessioned	2020-12-10T07:27:49Z
dc.date.available	2020-12-10T07:27:49Z
dc.date.submitted	2019
dc.date.issued	2020-01-20
dc.identifier.uri	https://acikbilim.yok.gov.tr/handle/20.500.12812/214926
dc.description.abstract	With advancing technology, organizations produce data at high volume and velocity. A steadily growing number of data sources generates many data streams. Being able to process and analyze this data in real time, to monitor applications continuously, and thereby to give customers personalized offers and product recommendations provides organizations with a significant competitive advantage. However, building and operating distributed systems that process streaming data is a complex and costly undertaking. Such systems must adapt to the changing rates of the data stream and scale when necessary. For this reason, using an effective automatic scaling system integrated into distributed stream processing systems is in most cases unavoidable. In recent years in particular, interest in systems that can process rapidly growing streaming data sources has increased considerably, and the literature contains many studies in this area. Most of these studies focus on system proposals for how a system operates under normal conditions rather than on adapting to changing workloads and on scalability. Few studies in the literature address scalability. In addition, the number of studies on Apache Flink is quite small. In this thesis, starting from the problems described above and these gaps in the literature, a system design that runs on Apache Flink and can adapt to changing workloads is proposed. This model, which can work in integration with big data processing systems, aims to improve system performance. Apache Flink is used both as the system on which development is carried out and as the component that receives the scaling metrics and performs the calculations. Apache Flink metrics are sent to Apache Kafka using Akka.
Apache Kafka is positioned as a message queue that collects and distributes the system metrics. From the metrics that Apache Flink provides, those leading to the most accurate result were selected and constructed. The conditions under which the system scales and its state after scaling were analyzed. The results are presented through simulation studies, which test the effectiveness of the proposed system methodology.
dc.description.abstract	With emerging technology, organizations produce data at large volume and high speed. A growing number of data sources creates many data streams. The ability to process and analyze this data in real time, to monitor applications continuously, and to give personalized offers to customers provides corporations with a large competitive advantage. Establishing large-scale distributed stream processing systems and keeping them in operation is a very complex and costly process. These systems should be capable of adapting to the varying rates of the data stream, and they must be scaled when required. It is usually inevitable to use an effective automatic scaling system that can be integrated into such systems. The recent literature contains numerous studies on this issue. Many of these studies focus on how such systems operate under normal conditions. There are few studies on scalability, and in them scaling is usually implemented with a set of resources. In addition, the number of studies on Apache Flink is quite small. In this study, based on these shortcomings, a system design that runs on Apache Flink and can adapt to changing workloads is proposed. Apache Flink is used both for system development and for calculating the scaling metrics. Scaling is performed by evaluating the calculated expected latency together with some critical metrics. This model, which can be integrated into big data processing systems, aims to improve system performance and reduce quality losses. Pre-scaling and post-scaling cases are also demonstrated by simulations to show the effectiveness of the proposed system.	en_US
dc.language	Turkish
dc.language.iso	tr
dc.rights	info:eu-repo/semantics/openAccess
dc.rights	Attribution 4.0 United States	tr_TR
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/
dc.subject	Bilgisayar Mühendisliği Bilimleri-Bilgisayar ve Kontrol	tr_TR
dc.subject	Computer Engineering and Computer Science and Control	en_US
dc.title	Akan veri işleyen dağıtık sistemlerde dinamik ölçekleme
dc.title.alternative	Dynamic scaling at distributed data stream processing systems
dc.type	masterThesis
dc.date.updated	2020-01-20
dc.contributor.department	Bilgisayar Mühendisliği Anabilim Dalı
dc.subject.ytm	Scaling
dc.subject.ytm	Distributed systems
dc.subject.ytm	Big data
dc.identifier.yokid	10251782
dc.publisher.institute	Lisansüstü Eğitim Enstitüsü
dc.publisher.university	İSTANBUL ÜNİVERSİTESİ-CERRAHPAŞA
dc.identifier.thesisid	603603
dc.description.pages	60
dc.publisher.discipline	Bilgisayar Mühendisliği Bilim Dalı

Files in this item

Name: yokAcikBilim_10251782.pdf
Size: 4.100Mb
Format: PDF
Description: File_10251782

This item appears in the following Collection(s)

TEZLER

Except where otherwise noted, this item's license is described as info:eu-repo/semantics/openAccess