Developing predictive models for biodiesel from algae using data in published literature

Coşgun, Ahmet

dc.contributor.advisor	Yıldırım, Ramazan
dc.contributor.advisor	Günay, Mehmet Erdem
dc.contributor.author	Coşgun, Ahmet
dc.date.accessioned	2020-12-04T10:12:10Z
dc.date.available	2020-12-04T10:12:10Z
dc.date.submitted	2018
dc.date.issued	2019-01-22
dc.identifier.uri	https://acikbilim.yok.gov.tr/handle/20.500.12812/72541
dc.description.abstract	The aim of this thesis is to develop a comprehensive database by reviewing published articles on lipid production from microalgae, to extract knowledge from this database, and to use data mining algorithms to predict the results of experiments that have not yet been performed. The database was built from 5908 data points drawn from 106 different articles and was divided into two groups according to the reported output variable. The output variables were biomass production (mg/L d) and lipid content (w/w). As a preliminary analysis, the effect of the input variables on the output variables was examined by comparing articles that dealt with the effect of the same input variable. For knowledge extraction and prediction-classification purposes, association rule mining, decision tree, and artificial neural network algorithms were applied to both datasets using the libraries and functions of MATLAB and R. Association rule mining showed that the microalgae species Chlorella, Chlorococcum, and Nannochloropsis can have high biomass production and high lipid content. Classification models were compared and evaluated by accuracy; predictive models were compared and evaluated by standard error, root mean square error, and coefficient of determination. The database was randomly divided into a training set and a test set; the training set was used to build the model, and the test set was used to find the root mean square error and the coefficient of determination. The optimum models built with the decision tree algorithm for classification resulted in 77.8% accuracy for biomass production and 62.2% for lipid content. The artificial neural network algorithm was used for predictive modeling. The standard error, root mean square error, and coefficient of determination were found to be 50, 80, and 0.7 for the biomass production model and 7, 11, and 0.3 for the lipid content model. The predictive power of the models built for lipid content was not as strong as that of the models for biomass production. The input significance analysis showed that nutritional variables were the most decisive variables for biomass production, whereas microalgae type was the most decisive variable for lipid content.
dc.description.abstract	The aim of this thesis was to develop a comprehensive database from published articles on lipid production from microalgae, and then to use this database for knowledge extraction, employing data mining algorithms to estimate the results of unperformed experiments. A total of 106 articles were used to construct the database, which contains 5908 instances. The dataset was divided into two groups with respect to the reported output variables, which were biomass production (mg/L d) and lipid content (w/w). As a preliminary analysis, the effect of each input variable was investigated by comparing the related articles. Then, for knowledge extraction and prediction-classification purposes, association rule mining, decision tree, and artificial neural network algorithms were applied to both datasets using the libraries and functions of MATLAB and R. The association rule mining algorithm was applied to all continuous and categorical variables to examine their effects on the output variable; Chlorella, Chlorococcum, and Nannochloropsis species were found to yield high biomass production and high lipid content. Models were compared and evaluated by their accuracy in classification and by their standard error, root mean square error, and r-squared values in predictive analysis. Parameter tuning was done by randomly dividing the dataset into a training set and a testing set; the training set was used to construct the model, and the testing set was used to calculate the root mean square error and the r-squared values. The optimum models constructed using the decision tree algorithm for classification gave 77.8% overall accuracy for biomass production and 62.2% for lipid content. The artificial neural network algorithm was used for predictive modeling. The absolute error, root mean square error, and r-squared values of the optimum model were 50, 80, and 0.7 for biomass production, and 7, 11, and 0.3 for lipid content. The predictive power of the models constructed for lipid content was not as strong as that of the models for biomass production. The input significance analysis showed that nutritional variables were the most decisive variables for biomass production, whereas microalgae type was the most decisive variable for lipid content.	en_US
dc.language	English
dc.language.iso	en
dc.rights	info:eu-repo/semantics/openAccess
dc.rights	Attribution 4.0 United States	tr_TR
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/
dc.subject	Kimya Mühendisliği	tr_TR
dc.subject	Chemical Engineering	en_US
dc.title	Developing predictive models for biodiesel from algae using data in published literature
dc.title.alternative	Yayınlanmış makalelerden alglerin biodizel üretimi ile ilgili öngörülü model geliştirilmesi
dc.type	masterThesis
dc.date.updated	2019-01-22
dc.contributor.department	Diğer
dc.identifier.yokid	10207898
dc.publisher.institute	Fen Bilimleri Enstitüsü
dc.publisher.university	BOĞAZİÇİ ÜNİVERSİTESİ
dc.identifier.thesisid	527128
dc.description.pages	130
dc.publisher.discipline	Kimya Mühendisliği Bilim Dalı
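
The evaluation procedure described in the abstract (a random split into training and testing sets, followed by root mean square error and r-squared on the testing set) can be sketched roughly as follows. This is only a minimal illustration in Python, not the thesis's MATLAB or R code: the data are synthetic, a least-squares line stands in for the artificial neural network, and every function name here is hypothetical.

```python
import math
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Randomly split rows into (train, test) lists (hypothetical helper)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept.

    Assumption: a simple stand-in for the neural network in the abstract.
    """
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    slope = num / den
    return slope, mean_y - slope * mean_x


# Synthetic (input, output) pairs; not data from the thesis.
rng = random.Random(42)
rows = [(float(x), 2.0 * x + 1.0 + rng.gauss(0.0, 1.0)) for x in range(50)]

# Build the model on the training set only.
train, test = train_test_split(rows, test_fraction=0.2, seed=1)
slope, intercept = fit_line([x for x, _ in train], [y for _, y in train])

# Score the model on the held-out testing set.
y_test = [y for _, y in test]
y_hat = [slope * x + intercept for x, _ in test]
print("test RMSE:", rmse(y_test, y_hat))
print("test R^2:", r_squared(y_test, y_hat))
```

Scoring on rows the model never saw during fitting is what makes the reported error an estimate of performance on unperformed experiments rather than a measure of fit to known ones.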

Files in this item

Name:: yokAcikBilim_10207898.pdf
Size:: 2.785Mb
Format:: PDF
Description:: File_10207898

This item appears in the following Collection(s)

TEZLER

Except where otherwise noted, this item's license is described as info:eu-repo/semantics/openAccess