Yazılım güvenliğinde derin öğrenme tabanlı kaynak kod analizi ve yorum önerimi

Kartal, Yusuf

dc.contributor.advisor	Özkan, Kemal
dc.contributor.author	Kartal, Yusuf
dc.date.accessioned	2023-11-10T08:37:13Z
dc.date.available	2023-11-10T08:37:13Z
dc.date.submitted	2023-11-06
dc.date.issued	2023
dc.identifier.uri	https://acikbilim.yok.gov.tr/handle/20.500.12812/758454
dc.description.abstract	Modern kod incelemesi, güvenliği sağladığı, hataları erken tespit ettiği ve kod kalitesini iyileştirdiği için yazılım geliştirme süreçlerinde kritik bir adımdır. Ancak, manuel incelemeler zaman alıcı ve güvenilmez olabilmektedir. Otomatik kod incelemesi bu sorunları çözebilir. Kod inceleme yorumlarını önermek için önerilmiş derin öğrenme yöntemleri olsa da, bunların eğitilmesi ve çalıştırılması maliyetlidir. Bunun yerine, otomatik kod incelemesi için bilgi erişim tabanlı yöntemler verimlilik, etkililik ve esneklik açısından umut verici sonuçlar sergilemektedir. Ana hedef, otomatik kod incelemede en iyi sonuçları veren vektöre dönüştürme yöntemi ile benzerlik yönteminin optimal kombinasyonunu belirlemek ve böylece bilgi erişim tabanlı yöntemlerin performansını ölçmektir. Ayrıca ön işlemlerin modellerin başarısı üzerindeki etkisini incelemek de hedefler arasında bulunmaktadır. Önceki araştırmalardan (TF-IDF ve Bag-of-Words) farklı olan birden fazla vektörleştirme yöntemi (Word2Vec, Doc2Vec ve Transformer) ve benzerlik yöntemi (Kosinüs, Öklid ve Manhattan) kaynak kod metinleri arasındaki anlamsal benzerlikleri belirlemek için çalışmaya dahil edilmiştir. BLEU, METEOR ve ROUGE-L gibi standart metrikleri kullanarak bu yöntemlerin performansı değerlendirilmiş ve modellerin çalışma süreleri de sonuçlara dahil edilmiştir. Elde edilen sonuçlara göre Transformer modeli tüm standart metriklerde ve benzerlik ölçümlerinde son çalışmalara göre daha iyi performans göstermektedir. Ayrıca tam eşleşme sağlamada %19,1'lik ve benzer öneriler sağlamada %6,2'lik bir iyileşme görülmektedir. Elde edilen bulgular, Transformer modelinin, insanlar tarafından yazılanlara çok benzeyen kod inceleme yorumları önermek için oldukça etkili ve verimli bir yaklaşım olduğunu, otomatik kod inceleme sistemleri geliştirmek için değerli bilgiler sağladığını göstermektedir.
dc.description.abstract	Modern code review is a critical step in software development because it ensures security, detects errors early, and improves code quality. However, manual reviews can be time-consuming and unreliable. Automated code review can address these issues. Although deep learning-based techniques have been proposed for recommending code review comments, they are costly to train and run. Information retrieval-based methods for automated code review instead show promising results in efficiency, effectiveness, and flexibility. The main objective is to determine the combination of vectorization method and similarity method that gives the best results in automated code review, and thereby to measure the performance of information retrieval-based methods. A further objective is to examine the effect of preprocessing on the success of the models. Unlike previous studies, which used TF-IDF and Bag-of-Words, this study includes multiple vectorization methods (Word2Vec, Doc2Vec, and Transformer) and similarity methods (Cosine, Euclidean, and Manhattan) to determine semantic similarities between source code texts. The performance of these methods was evaluated using standard metrics such as BLEU, METEOR, and ROUGE-L, and the running times of the models were also included in the results. According to the results, the Transformer model outperforms recent studies on all standard metrics and similarity measures. In addition, there is an improvement of 19.1% in providing an exact match and of 6.2% in providing similar recommendations. The findings show that the Transformer model is a highly effective and efficient approach for suggesting code review comments that closely resemble those written by humans, and that they provide valuable information for developing automated code review systems.	en_US
dc.language	Turkish
dc.language.iso	tr
dc.rights	info:eu-repo/semantics/openAccess
dc.rights	Attribution 4.0 United States	tr_TR
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/
dc.subject	Bilgisayar Mühendisliği Bilimleri-Bilgisayar ve Kontrol	tr_TR
dc.subject	Computer Engineering and Computer Science and Control	en_US
dc.title	Yazılım güvenliğinde derin öğrenme tabanlı kaynak kod analizi ve yorum önerimi
dc.title.alternative	Deep learning based source code analysis and review recommendations in software security
dc.type	doctoralThesis
dc.date.updated	2023-11-06
dc.contributor.department	Bilgisayar Mühendisliği Ana Bilim Dalı
dc.subject.ytm	Software development
dc.subject.ytm	Malware analysis
dc.identifier.yokid	10307658
dc.publisher.institute	Fen Bilimleri Enstitüsü
dc.publisher.university	ESKİŞEHİR OSMANGAZİ ÜNİVERSİTESİ
dc.identifier.thesisid	827286
dc.description.pages	108
dc.publisher.discipline	Bilgisayar Yazılımı Bilim Dalı
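
Illustrative note: the abstract describes an information retrieval-based approach in which a code change is vectorized, compared against previously reviewed code with a similarity method (Cosine, Euclidean, or Manhattan), and the comment attached to the closest match is recommended. The following is a minimal sketch of that idea, not the thesis implementation. It assumes a plain token-count vector as a stand-in for the Word2Vec, Doc2Vec, and Transformer embeddings named in the abstract, and every function name and the three-item corpus are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(code):
    # Split source text into identifiers, numbers, and single punctuation marks.
    return re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", code)


def vectorize(code, vocab):
    # Token-count vector over a fixed vocabulary (assumption: a simple
    # stand-in for a learned embedding).
    counts = Counter(tokenize(code))
    return [float(counts[token]) for token in vocab]


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # treat an empty vector as maximally dissimilar
    return 1.0 - dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan_distance(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


DISTANCES = {
    "cosine": cosine_distance,
    "euclidean": euclidean_distance,
    "manhattan": manhattan_distance,
}


def recommend_comment(query, corpus, metric="cosine"):
    # corpus is a list of (code, review_comment) pairs; return the comment
    # attached to the code snippet closest to the query under the metric.
    vocab = sorted(
        {t for code, _ in corpus for t in tokenize(code)} | set(tokenize(query))
    )
    distance = DISTANCES[metric]
    query_vec = vectorize(query, vocab)
    _, comment = min(
        corpus, key=lambda pair: distance(query_vec, vectorize(pair[0], vocab))
    )
    return comment


# Hypothetical mini-corpus of previously reviewed snippets.
CORPUS = [
    ("if (user == null) { return; }",
     "Consider logging when user is null."),
    ("for (int i = 0; i < n; i++) { sum += a[i]; }",
     "Use an enhanced for loop here."),
    ('String q = "SELECT * FROM t WHERE id=" + id;',
     "Use a parameterized query to avoid SQL injection."),
]

QUERY = 'String sql = "SELECT * FROM users WHERE id=" + userId;'

if __name__ == "__main__":
    for name in DISTANCES:
        print(name, "->", recommend_comment(QUERY, CORPUS, metric=name))
```

Swapping `vectorize` for a learned embedding leaves the retrieval step unchanged, which is one reason a study can compare vectorization methods and similarity methods independently.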

Files in this item

There are no files associated with this item.

This item appears in the following Collection(s)

TEZLER

Except where otherwise noted, this item's license is described as info:eu-repo/semantics/openAccess