Dementia prediction with machine learning methods
Abstract
Dementia is a syndrome, encompassing many different brain diseases, that causes a deterioration in cognitive function beyond the effects of normal aging. Impairment of memory function is one of the most widely accepted definitions of dementia. The deterioration in cognitive function is also commonly accompanied by deterioration in social behavior, motivation, or emotional control.

Dementia, for which aging is the most serious risk factor, is regarded as the greatest global challenge of the 21st century in terms of health and social care; it is becoming more widespread worldwide, and the death rates it causes are rising.

Because the initial symptoms of dementia progress gradually, the early stage is often overlooked. Neuropsychological screening tests, which are used to examine a person's cognitive, behavioral, and psychological functioning, are very important diagnostic tools, particularly for evaluating patients with mild cognitive impairment or in the early stages of dementia. Using the OASIS-3 data set from the Open Access Series of Imaging Studies (OASIS) database, this study aimed to improve the success of the Mini Mental State Examination (MMSE) neuropsychological screening test with machine learning methods and to infer the presence of dementia. The study worked with 6179 patient records, and the attributes to be used were determined. After data pre-processing methods were applied to the data set, classification was performed with the k-nearest neighbors, support vector machine, decision tree, and random forest algorithms, and the success of the models was measured with 10-fold cross-validation. In addition, to improve classification performance, the data set was divided into four groups using 65 years of age and 16 years of education as thresholds, and classification was repeated.
The success of the models was measured in each of the four groups, and their average performance was calculated.

After the classifications, the importance rankings of the attributes in the data set were calculated, and the MMSE neuropsychological test was identified as the most important attribute for classification. MMSE test scores were compared with the diagnoses in the data set under the scoring system accepted in the literature; the success of the test was measured and compared with that of the models.

At the end of the study, the accuracy, sensitivity, false negative rate, and specificity values measured for the MMSE neuropsychological test were compared with the results obtained with machine learning methods, and it was observed that the success of the neuropsychological test was improved. The k-nearest neighbors, support vector machine, and random forest algorithms performed similarly on the whole data set and improved further on the data set divided by age and education, showing that they are suitable methods for this study.

Dementia is a syndrome that includes many different brain diseases and causes a deterioration in cognitive function beyond the effects of normal aging. Deterioration of memory function is one of the most widely accepted definitions of dementia. In addition, deterioration in social behavior, motivation, or emotional control generally accompanies the impairment in cognitive function.

Although aging is the most serious risk factor for dementia, gender, ethnicity, genetic factors, and other individual diseases are also considered risk factors. Dementia is regarded as the greatest global challenge of the twenty-first century in terms of health and social care, and the death rates caused by dementia are increasing day by day.

The symptoms of dementia are considered in three stages: early, middle, and late.
The early stage is generally overlooked because the initial symptoms progress gradually.

In the clinical evaluation of dementia, the history of the disease's progression is taken first and neuropsychological screening tests are applied. In addition, laboratory tests are performed, and magnetic resonance imaging (MRI) and computed tomography are used for differential diagnosis. Neuropsychological screening tests are very important diagnostic tools for examining cognitive, behavioral, and psychological functioning and for evaluating patients with mild cognitive impairment or in the early stages of dementia; nevertheless, screening tests alone are not sufficient for a definitive diagnosis.

Machine learning is the programming of computers to optimize performance using past experience or available data. The goal is for computers to extract information from the real world and to improve their performance on certain tasks based on this new information.

Today, artificial intelligence supported systems have become widespread in the field of health and play an active role in the diagnosis and treatment of many diseases. They are also used in applications such as medical data collection, drug discovery, and robotic surgery.

In this study, the OASIS-3 data set obtained from the Open Access Series of Imaging Studies (OASIS) database was used. The data set covers 609 adults considered cognitively normal and 489 persons diagnosed at various stages of dementia, aged 42 to 99 years, and contains 6224 records of doctor's follow-up reports taken at different time intervals.

The aim of this study was to increase the success of the neuropsychological test in the data set by using machine learning methods and to determine the presence of dementia by means of the test.

Within the scope of the study, data pre-processing methods were applied to the data and the success of the models was measured with 10-fold cross-validation.
First, records with missing diagnostic information were removed from the data set, and the information that can be obtained without a medical examination was selected as attributes. The missing data in the 6179 records used in the study were filled in with the median or mean for numerical attributes and the mode for categorical attributes. The categorical attributes were then converted to numeric attributes.

Normalization methods and parameters were determined for the models, and classification was performed with the k-nearest neighbors, support vector machine, decision tree, and random forest algorithms. The accuracy, recall (sensitivity), false negative rate, and specificity of the models were then measured from the confusion matrix obtained from 10-fold cross-validation. In addition, the data set was divided into four groups based on an age threshold of 65 and an education threshold of 16 years, and the success of the models in the populations under and over 65 years of age was examined with educational status taken into account. Mini Mental State Examination (MMSE) neuropsychological screening test scores in the data set were compared with patient diagnoses under the scoring system accepted in the literature; the success of the test was measured and compared with that of the models.

After the classifications, the importance of the attributes used in the study was estimated with the random forest algorithm. The random forest algorithm can be used to estimate attribute importance as well as to make predictions.
Through calculations on the trees that make up the random forest, the importance of an attribute is determined by the increase in the model's prediction error. The MMSE test, age, and body mass index were identified as the three most important features for the study.

For the MMSE test, approximately 79% accuracy, 26% recall, a 74% false negative rate, and over 99% specificity were obtained. In the first classification applied to the data set, all models were more successful than the MMSE test in terms of accuracy, sensitivity, and false negative rate. The support vector machine algorithm had accuracy over 85%, recall over 58%, specificity over 95%, and a false negative rate of approximately 42%. The k-nearest neighbors algorithm had approximately 85% accuracy, recall above 60%, specificity above 94%, and a false negative rate of approximately 40%. The random forest algorithm had approximately 85% accuracy, approximately 58% recall, approximately 95% specificity, and a false negative rate of approximately 42%. These three methods were more successful than the decision tree algorithm.

When the average success of the models on the data set divided by age and education was examined, accuracy reached approximately 89% with k-nearest neighbors, support vector machines, and random forest, and 88% with decision trees. The k-nearest neighbors algorithm had recall over 60%, specificity over 96%, and a false negative rate of about 40%. The support vector machine algorithm had recall over 59%, specificity over 96%, and a false negative rate of approximately 41%. The random forest algorithm had recall over 58%, specificity over 96%, and a false negative rate of approximately 42%. These were again the most successful methods.

In this study, considering accuracy, sensitivity, and false negative rate, it was observed that all classifiers used improved on the success of the MMSE neuropsychological test.
In particular, the k-nearest neighbors, support vector machine, and random forest algorithms were the three most successful methods, performing close to one another. On the data set divided by age and education, the performance of the models increased further, and these three algorithms were again the most successful. These methods can therefore be used to improve the accuracy, sensitivity, and false negative rate of dementia prediction with the MMSE neuropsychological test.
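
The four measures reported in this abstract (accuracy, recall or sensitivity, false negative rate, and specificity) can all be derived from the counts of a binary confusion matrix. The following is a minimal Python sketch, not the study's own code. It assumes that dementia is the positive class; the function name `screening_metrics` and the counts passed to it are hypothetical and are not taken from the study.

```python
def screening_metrics(tp, fn, fp, tn):
    """Derive screening measures from binary confusion-matrix counts.

    Assumption: "dementia" is the positive class, so
      tp = dementia cases flagged,   fn = dementia cases missed,
      fp = healthy cases flagged,    tn = healthy cases cleared.
    """
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,          # all correct decisions
        "recall": tp / (tp + fn),               # sensitivity
        "false_negative_rate": fn / (tp + fn),  # missed dementia cases
        "specificity": tn / (tn + fp),          # healthy correctly cleared
    }


# Hypothetical counts, for illustration only.
metrics = screening_metrics(tp=26, fn=74, fp=2, tn=298)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Recall and the false negative rate share the same denominator, so they always sum to 1. This is consistent with the pairs reported above, such as 26% recall and a 74% false negative rate for the MMSE test.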