Fast and efficient model parallelism for deep convolutional neural networks

Eserol, Burak

dc.contributor.advisor	Özdal, Muhammet Mustafa
dc.contributor.advisor	Aykanat, Cevdet
dc.contributor.author	Eserol, Burak
dc.date.accessioned	2020-12-29T08:01:05Z
dc.date.available	2020-12-29T08:01:05Z
dc.date.submitted	2019
dc.date.issued	2020-03-11
dc.identifier.uri	https://acikbilim.yok.gov.tr/handle/20.500.12812/353049
dc.description.abstract	Convolutional neural networks have become very popular and successful in recent years. Their depth and number of parameters are an important factor in this success. However, it has become difficult to fit deep convolutional neural networks into the memory of a single machine, and training them takes a very long time. Two parallelization methods exist to address this problem: data parallelism and model parallelism. In data parallelism, the neural network model is copied to many different machines and the data is partitioned among them. Each replica trains on the data assigned to it and sends the model's parameters and their updates to the other replicas. This process causes a very large communication volume in data parallelism, which slows training and delays the convergence of deep neural networks. In model parallelism, a deep neural network model is partitioned across different machines and the partitions run one after another. However, an expert is needed to decide how to partition the model, and it is difficult to obtain both low communication volume and a low load imbalance ratio with existing partitioning methods. This thesis proposes a new model parallelism method, hypergraph partitioned model parallelism. The method does not require an expert for partitioning and achieves a better load imbalance ratio together with better communication volume than existing model parallelism methods. In addition, the proposed method reduces the communication volume that arises in data parallelism by 93%. Finally, the proposed method is shown to converge faster than existing parallelization methods.
dc.description.abstract	Convolutional Neural Networks (CNNs) have become very popular and successful in recent years. Increasing the depth and the number of parameters of CNNs has been crucial to this success. However, it is hard to fit deep convolutional neural networks into a single machine's memory, and training them takes a very long time. There are two parallelism methods to solve this problem: data parallelism and model parallelism. In data parallelism, the neural network model is replicated among different machines and the data is partitioned among them. Each replica trains on its share of the data and communicates parameters and their gradients with the other replicas. This process results in a huge communication volume, which slows down the training and convergence of the deep neural network. In model parallelism, a deep neural network model is partitioned among different machines and trained in a pipelined manner. However, it requires a human expert to partition the network, and it is hard to obtain both a low communication volume and a low computational load imbalance ratio by using known partitioning methods. In this thesis, a new model parallelism method called hypergraph partitioned model parallelism is proposed. It does not require a human expert to partition the network and obtains a better computational load imbalance ratio along with a better communication volume compared to the existing model parallelism techniques. In addition, the proposed method reduces the communication volume overhead in data parallelism by 93%. Finally, it is shown that distributing a deep neural network using the proposed hypergraph partitioned model parallelism, rather than the existing parallelism methods, causes the network to converge faster to the target accuracy.	en_US
dc.language	English
dc.language.iso	en
dc.rights	info:eu-repo/semantics/openAccess
dc.rights	Attribution 4.0 United States	tr_TR
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/
dc.subject	Bilgisayar Mühendisliği Bilimleri-Bilgisayar ve Kontrol	tr_TR
dc.subject	Computer Engineering and Computer Science and Control	en_US
dc.title	Fast and efficient model parallelism for deep convolutional neural networks
dc.title.alternative	Derin konvolüsyonel sinir ağları için hızlı ve verimli model paralelleştirmesi
dc.type	masterThesis
dc.date.updated	2020-03-11
dc.contributor.department	Bilgisayar Mühendisliği Anabilim Dalı
dc.subject.ytm	Deep learning
dc.subject.ytm	Artificial neural networks
dc.subject.ytm	Parallel computing
dc.identifier.yokid	10283770
dc.publisher.institute	Mühendislik ve Fen Bilimleri Enstitüsü
dc.publisher.university	İHSAN DOĞRAMACI BİLKENT ÜNİVERSİTESİ
dc.identifier.thesisid	575184
dc.description.pages	97
dc.publisher.discipline	Diğer

Files in this item

Name: yokAcikBilim_10283770.pdf
Size: 1.132 MB
Format: PDF
Description: File_10283770

This item appears in the following Collection(s)

TEZLER

Except where otherwise noted, this item's license is described as info:eu-repo/semantics/openAccess