Intrusion detection model based on data mining and machine learning

Kadhim, Khalid Abdulwahid Kadhim

Date

2018

Abstract

In recent years, the use of online services has grown rapidly, creating a need to protect the servers that provide these services without degrading their quality. As the techniques used by intruders have evolved, traditional network protection techniques are no longer sufficient, so more sophisticated techniques are needed to protect these networks. Data mining is one of the machine learning fields that can be used to extract relationships between packet information and the labels assigned to the packets. In this study, three data mining classification techniques, the Support Vector Machine, Random Forest, and Feed-Forward Neural Network, are evaluated for detecting anomalies in packets arriving at the network. When an intrusion is detected, the same classifiers are also used to identify the type of attack.

In binary classification, the feed-forward deep neural network classifier, with only three hidden layers of 32 neurons each, shows the best overall performance, with a prediction accuracy of 99.27% and an average prediction time of 0.7 µs per prediction. The Random Forest classifier, with 100 trees, achieves a slightly higher accuracy of 99.60% but takes an average of 8.54 µs per prediction, about twelve times longer than the deep learning model. The Support Vector Machine classifier achieves an accuracy of 98.70% with an average execution time of 218.3 µs per prediction. In multi-class classification, the deep learning model with the same hidden layers has the best prediction accuracy and time, at 90.82% and 0.89 µs per prediction; the Random Forest classifier reaches only 87.92% accuracy at an average of 17.28 µs per prediction, and the Support Vector Machine classifier reaches 70.43% accuracy at an average of 709.65 µs per prediction. These results indicate that the feed-forward deep neural network is the best choice for use in an intrusion detection system.
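
The abstract compares classifiers on two metrics: prediction accuracy and average time per prediction. The sketch below shows one way such figures could be measured. It is not the thesis's code: the `evaluate` helper, the `threshold_classifier` stand-in, and the toy samples are all hypothetical, and the timing includes the small overhead of the loop and the label comparison.

```python
import time
from typing import Callable, Sequence, Tuple


def evaluate(predict: Callable[[Sequence[float]], int],
             samples: Sequence[Sequence[float]],
             labels: Sequence[int]) -> Tuple[float, float]:
    """Return (accuracy in percent, average microseconds per prediction)."""
    if len(samples) != len(labels) or not samples:
        raise ValueError("samples and labels must be non-empty and equal in length")
    correct = 0
    start = time.perf_counter()
    for features, label in zip(samples, labels):
        # One prediction per sample, so the elapsed time divided by the
        # sample count approximates the per-prediction cost.
        if predict(features) == label:
            correct += 1
    elapsed = time.perf_counter() - start
    accuracy = 100.0 * correct / len(samples)
    avg_us = elapsed / len(samples) * 1e6  # seconds -> microseconds
    return accuracy, avg_us


def threshold_classifier(features: Sequence[float]) -> int:
    """Hypothetical stand-in for a trained model: 1 = attack, 0 = normal."""
    return 1 if sum(features) > 1.0 else 0


# Toy data (assumed for illustration, not taken from the thesis).
samples = [[0.1, 0.2], [0.9, 0.8], [0.4, 0.3], [0.7, 0.6]]
labels = [0, 1, 0, 1]

accuracy, avg_us = evaluate(threshold_classifier, samples, labels)
print(f"accuracy={accuracy:.2f}%  avg time={avg_us:.2f} us/prediction")
```

Any trained model can be passed in place of `threshold_classifier`, provided it is wrapped as a function that takes one feature vector and returns one label. Measured times depend heavily on hardware and on whether predictions are made one at a time or in batches, so figures from such a harness are comparable only within the same setup.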

URI

https://acikbilim.yok.gov.tr/handle/20.500.12812/588572

Collections

TEZLER

Except where otherwise noted, this item's license is described as info:eu-repo/semantics/openAccess