Neural dependency parsing for Turkish

Tuç, Salih

Date

2020

Abstract

Dependency parsing is the task of finding the grammatical structure of a sentence by identifying syntactic and semantic relationships between words. The accuracy of current dependency parsers is still unsatisfactory because of long-term dependencies and the out-of-vocabulary (OOV) problem. These problems also apply to Turkish, whose agglutinative morphology produces a relatively higher proportion of OOV words than other languages. Recent work shows that Recurrent Neural Networks (RNNs) do not perform well on long sequences. The deep neural architecture that we propose in this thesis follows an encoder-decoder structure, with an encoder based on a Transformer Network and a decoder based on a Stack Pointer Network. Character-level word embeddings are also integrated into the model to cope with the OOV problem. The results for both Turkish and English show that the proposed model performs better on long sentences, and identifies long-term dependencies more effectively, than other neural models.

URI

https://acikbilim.yok.gov.tr/handle/20.500.12812/472871

Collections

TEZLER (Theses)

Except where otherwise noted, this item's license is described as info:eu-repo/semantics/openAccess