Investigation of Amazon and Google for fault tolerance strategies in cloud computing services

Alraheym, Shereen

Date

2015

Abstract

Cloud computing offers on-demand information technology solutions by enabling resource sharing and use over the Internet through virtual machines; as a result, it has recently become an attractive research topic. With the rapid development of such services, the need for fault tolerance in the cloud has become one of the main issues. This need stands out because reliability and constant availability are even more critical for these new services. This thesis examines techniques and means of protecting cloud services, and the systems and enterprises of their users, from operational failures. As one of the findings shows, failures in cloud-based services are likely and should be expected, so solutions must be found. The essential features of fault tolerance strategies should ensure business continuity, prevention of financial losses, system recovery, and disaster recovery. The scope of the thesis includes examining scenarios for preventing and correcting malfunctions through redundancy, checkpointing, and replication. Infrastructure-as-a-service providers whose infrastructure tolerates faults, such as Amazon's commercial AWS and Google's commercial GCE, are examined as examples so that robust architectures with fault tolerance can be proposed for systems and enterprises. Building on these, the creation of business policies covering how to deal with potential failures, fault tolerance techniques, and the organization of teamwork is left for future work. Research addressing these limitations can be carried out through online questionnaires that collect information for narrowed scopes.

Cloud computing has recently become an attractive research topic because it offers information technology solutions as on-demand services, using virtual machines to share and consume resources over the Internet. With the rapid development of such services, fault tolerance in the cloud has become a major concern, since reliability, availability, and dependability are more critical for this new service type. This thesis investigates techniques and means of keeping cloud services, and the systems and enterprises that cloud customers run on the cloud, safe from failures. As one of its findings shows, failures in cloud-enabled services should be expected to occur and therefore must be handled. Implementing fault tolerance strategies should guarantee business continuity, avoid financial loss, recover systems from failures, and provide disaster recovery. The specific focus is to explore scenarios for avoiding or recovering from failures through redundancy, checkpointing, and replication. Commercial IaaS providers such as Amazon's AWS and Google's GCE are taken as examples of infrastructure that tolerates failures, so that a robust architecture with fault tolerance properties can be proposed for a system or enterprise. Accordingly, a general conceptual model with fault tolerance considerations is proposed. On this basis, addressing potential failures in detail, implementing fault tolerance techniques, and organizing teamwork to set up business policies are left for future work. These limitations can be addressed through online questionnaires that collect information for case studies.

URI

https://acikbilim.yok.gov.tr/handle/20.500.12812/78389

Collections

TEZLER

Except where otherwise noted, this item's license is described as info:eu-repo/semantics/openAccess