New technique for high dimensional data : robust linear regression using L1-penalized mm-estimation

S.A. Darwish, Kamal

dc.contributor.advisor	Büyüklü, Ali Hakan
dc.contributor.author	S.A. Darwish, Kamal
dc.date.accessioned	2020-12-29T09:39:04Z
dc.date.available	2020-12-29T09:39:04Z
dc.date.submitted	2015
dc.date.issued	2018-08-06
dc.identifier.uri	https://acikbilim.yok.gov.tr/handle/20.500.12812/384890
dc.description.abstract	In recent years, models in which the number of predictors (explanatory variables) p exceeds the number of observations n have become very popular in the context of big data. These datasets pose a considerable challenge for building models that predict well. In addition, the presence of a fraction of outliers and other contamination in the data makes fitting linear models difficult. In such cases, estimation methods need to be both sparse and robust. In this thesis, the MM estimator and the L1-penalized MM estimator (MM-Lasso) were used as a new estimation method. The proposed estimator takes the sparse LTS estimator as its initial estimator and penalizes M-estimators, yielding sparse model estimates with a high breakdown value and good predictions. MM-Lasso was written in the C programming language and made callable from an R package. To evaluate the proposed model, we extended the existing SimFrame R package, which provides a framework for statistical simulation studies. Three different models were developed to generate low-, moderate-, and high-dimensional data. A function for generating contaminated data was also developed for the simulation studies. In the presence of leverage points, the MM-Lasso estimator is seen to perform better than its competitors.
dc.description.abstract	Large datasets, where the number of predictors p is larger than the sample size n, have become very popular in recent years. These datasets pose great challenges for building a good linear prediction model. In addition, when a dataset contains a fraction of outliers and other contamination, linear regression becomes a difficult problem. Therefore, we need methods that are sparse and robust at the same time. In this thesis, we employed the approach of MM-estimation and proposed L1-penalized MM-estimation (MM-Lasso) as a new estimation method. The proposed estimator uses the sparse LTS estimator as the initial estimator for computing a penalized M-estimator, yielding sparse model estimates with a high breakdown value and good prediction performance. We implemented MM-Lasso in the C programming language and call it from an R package. To evaluate the proposed estimator, we extended the SimFrame R package, which is a general framework for simulation studies in statistics. We generated three data models to represent low-, moderate-, and high-dimensional data. We also implemented a function for generating the contaminated data. The simulation study shows that the MM-Lasso estimator has better prediction performance than its competitors in the presence of leverage points.	en_US
dc.language	English
dc.language.iso	en
dc.rights	info:eu-repo/semantics/openAccess
dc.rights	Attribution 4.0 United States	tr_TR
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/
dc.subject	İstatistik	tr_TR
dc.subject	Statistics	en_US
dc.title	New technique for high dimensional data : robust linear regression using L1-penalized mm-estimation
dc.title.alternative	Büyük boyutlu veriler için yeni bir teknik: L1-cezalı doğrusal robust MM-tahmincisi
dc.type	doctoralThesis
dc.date.updated	2018-08-06
dc.contributor.department	İstatistik Anabilim Dalı
dc.identifier.yokid	10080667
dc.publisher.institute	Fen Bilimleri Enstitüsü
dc.publisher.university	YILDIZ TEKNİK ÜNİVERSİTESİ
dc.identifier.thesisid	406556
dc.description.pages	118
dc.publisher.discipline	İstatistik Bilim Dalı
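
The two-stage idea described in the abstract (a robust sparse initial fit, followed by an L1-penalized M-step) can be illustrated with a minimal sketch. This is not the thesis's C implementation. Everything below is an assumption made for illustration: the initial fit is a crude trimmed lasso with concentration steps standing in for sparse LTS, the residual scale is a MAD rather than a proper M-scale, the loss is Tukey's bisquare with tuning constant c = 4.685, there is no intercept, and the contaminated data model and all function names are hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def weighted_lasso(X, y, w, lam, n_iter=100):
    """Coordinate descent for 0.5 * sum(w * (y - X b)^2) + lam * ||b||_1."""
    p = X.shape[1]
    b = np.zeros(p)
    r = y.copy()                          # current residual y - X b
    col_norm = (w[:, None] * X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            if col_norm[j] == 0.0:
                continue
            r += X[:, j] * b[j]           # drop coordinate j from the fit
            rho = np.sum(w * X[:, j] * r)
            b[j] = soft_threshold(rho, lam) / col_norm[j]
            r -= X[:, j] * b[j]           # put the updated coordinate back
    return b

def trimmed_lasso(X, y, lam, h, n_csteps=10):
    """Crude stand-in for sparse LTS: lasso on the h best-fitting rows."""
    n = X.shape[0]
    keep = np.argsort(np.abs(y - np.median(y)))[:h]   # simple starting subset
    for _ in range(n_csteps):
        w = np.zeros(n)
        w[keep] = 1.0
        b = weighted_lasso(X, y, w, lam)
        new_keep = np.argsort(np.abs(y - X @ b))[:h]  # concentration step
        if set(new_keep) == set(keep):
            break
        keep = new_keep
    return b

def mm_lasso(X, y, lam, h, c=4.685, n_irls=50, tol=1e-6):
    """Initial robust fit, fixed scale, then IRLS with bisquare weights."""
    b = trimmed_lasso(X, y, lam, h)
    r = y - X @ b
    sigma = 1.4826 * np.median(np.abs(r - np.median(r)))  # MAD scale (assumption)
    for _ in range(n_irls):
        u = (y - X @ b) / sigma
        w = np.where(np.abs(u) <= c, (1.0 - (u / c) ** 2) ** 2, 0.0)
        b_new = weighted_lasso(X, y, w, lam)
        converged = np.max(np.abs(b_new - b)) < tol
        b = b_new
        if converged:
            break
    return b

# Hypothetical data model: sparse signal plus 10% bad leverage points.
rng = np.random.default_rng(0)
n, p = 100, 20
beta = np.zeros(p)
beta[:3] = [3.0, -2.0, 1.5]
X = rng.normal(size=(n, p))
y = X @ beta + 0.5 * rng.normal(size=n)
X[:10, 0] = 20.0      # outlying value in the first predictor
y[:10] = -40.0        # response that contradicts the true model

b_plain = weighted_lasso(X, y, np.ones(n), lam=5.0)   # non-robust lasso
b_mm = mm_lasso(X, y, lam=5.0, h=75)

print("plain lasso error:", np.linalg.norm(b_plain - beta))
print("MM-type lasso error:", np.linalg.norm(b_mm - beta))
```

In this sketch the bisquare weights drop to zero for observations whose scaled residuals exceed c, so the leverage points stop influencing the penalized fit once the initial estimate has separated them from the bulk of the data.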

Files in this item

Name: yokAcikBilim_10080667.pdf
Size: 1.178Mb
Format: PDF
Description: File_10080667

This item appears in the following Collection(s)

TEZLER

Except where otherwise noted, this item's license is described as info:eu-repo/semantics/openAccess