Sentiment analysis in Iraqi Arabic dialects based on distributed representations of sentences and machine learning approach

Alnawas, Anwar Adnan Mzher

dc.contributor.advisor	Arıcı, Nursal
dc.contributor.advisor	Suçin, Mehmet Hakkı
dc.contributor.author	Alnawas, Anwar Adnan Mzher
dc.date.accessioned	2020-12-10T12:52:16Z
dc.date.available	2020-12-10T12:52:16Z
dc.date.submitted	2019
dc.date.issued	2020-01-22
dc.identifier.uri	https://acikbilim.yok.gov.tr/handle/20.500.12812/296324
dc.description.abstract	Sentiment analysis is a sub-discipline of computer science situated within computational linguistics and data mining. Its aim is to extract the feelings and opinions of individuals or communities about a topic from textual documents. Sentiment analysis has become an interesting research topic in recent years, and the scientific literature contains many studies for English. For Arabic, however, few studies have been published so far. Arabic is an important language in terms of its number of speakers and its historical and religious heritage. Formal Arabic consists of Classical Arabic and Modern Standard Arabic. Classical Arabic represents the language of the Qur'an. Modern Standard Arabic is used in news bulletins and in education. Although the use of Arabic on the Internet is growing, these two varieties are not used in social networking environments. Local dialects used in everyday life are preferred instead. For this reason, sentiment analysis of dialectal Arabic text is a research topic of growing importance. In this doctoral thesis, sentiment analysis is carried out on the Iraqi dialect of Arabic. In the first stage of the study, three types of datasets were collected: classified datasets from previous studies, unclassified Iraqi Arabic dialect data, and classified Iraqi Arabic dialect data. The second stage is preprocessing. In this stage, unnecessary terms were removed from the datasets to minimize complexity and standardize the text format. In the third stage, features were extracted and a word was represented as a vector using the Doc2Vec model. In the fourth stage, the resulting vectors were used to train four machine learning algorithms in order to build a sentiment prediction model. In the fifth stage, the sentiment prediction model was evaluated. The experimental study also examined the effect of variable parameters on corpus classification performance.
dc.description.abstract	Sentiment analysis is a sub-discipline of computer science situated within computational linguistics and data mining. Its purpose is to infer the feelings and thoughts of individuals and communities about a topic from textual documents. Sentiment analysis has become an interesting research topic in recent years, and the scientific literature contains many studies on English. However, not enough studies have yet been published on Arabic. Arabic is an important language in terms of its number of speakers, its history, and its religious heritage. Formal Arabic consists of Classical Arabic and Modern Standard Arabic. Classical Arabic represents the language of the Qur'an. Modern Standard Arabic is used in news bulletins and in education. Although the use of Arabic on the Internet is increasing, these two varieties are not used in social networking environments. Local dialects used in everyday life are preferred instead. Therefore, sentiment analysis of dialectal Arabic text is an important research topic. In this doctoral dissertation, sentiment analysis is conducted on the Iraqi dialect of Arabic. In the first stage of the study, three types of data were collected: classified datasets from previous studies, unclassified Iraqi Arabic dialect data, and classified Iraqi Arabic dialect data. The second stage is preprocessing. In this stage, unnecessary terms were removed from the datasets to minimize complexity and standardize the text format. In the third stage, features were extracted to represent a word as a vector using the Doc2Vec model. In the fourth stage, four machine learning algorithms were trained on the resulting vectors to create a sentiment prediction model. In the fifth stage, the sentiment prediction model was evaluated. In addition, the experimental phase evaluated the effects of variable parameters and of the background corpora on classification performance.	en_US
dc.language	English
dc.language.iso	en
dc.rights	info:eu-repo/semantics/openAccess
dc.rights	Attribution 4.0 United States	tr_TR
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/
dc.subject	Bilgisayar Mühendisliği Bilimleri-Bilgisayar ve Kontrol	tr_TR
dc.subject	Computer Engineering and Computer Science and Control	en_US
dc.title	Sentiment analysis in Iraqi Arabic dialects based on distributed representations of sentences and machine learning approach
dc.title.alternative	Cümlelerin dağıtılmış temsilleri ve makine öğrenmesi yaklaşımına dayalı Irak lehçelerinde duygu analizi
dc.type	doctoralThesis
dc.date.updated	2020-01-22
dc.contributor.department	Bilgisayar Mühendisliği Anabilim Dalı
dc.subject.ytm	Text mining
dc.subject.ytm	Opinion mining
dc.subject.ytm	Data mining
dc.identifier.yokid	10244547
dc.publisher.institute	Fen Bilimleri Enstitüsü
dc.publisher.university	GAZİ ÜNİVERSİTESİ
dc.identifier.thesisid	560656
dc.description.pages	100
dc.publisher.discipline	Diğer

Files in this item

Name:: yokAcikBilim_10244547.pdf
Size:: 5.150Mb
Format:: PDF
Description:: File_10244547

This item appears in the following Collection(s)

TEZLER

Except where otherwise noted, this item's license is described as info:eu-repo/semantics/openAccess