Turkish text to speech using children's voices syllables

Erdoğan, Yoldaş

dc.contributor.advisor	Tüfekci, Zekeriya
dc.contributor.author	Erdoğan, Yoldaş
dc.date.accessioned	2020-12-07T09:43:25Z
dc.date.available	2020-12-07T09:43:25Z
dc.date.submitted	2019
dc.date.issued	2019-10-21
dc.identifier.uri	https://acikbilim.yok.gov.tr/handle/20.500.12812/125656
dc.description.abstract	Metinden Konuşma Sentezleme (MKS) kısaca yazılı haldeki bir metnin elektronik ortama aktarılarak ses sinyallerine dönüştürülmesi demektir. Bu yazılı metin bir belge veya elektronik kitap da olabilir, bir web sayfası da olabilir. İdeal bir MKS sisteminden insanın okuyabildiği her metni doğal insan sesi gibi işleyebilmesi beklenir. Ülkemizde metinden konuşma sentezleme çalışmaları daha çok yetişkin kadın ve erkek seslerinin üretilmesine yoğunlaşmıştır. Bu tezde ise çocuk seslerinden oluşan bir ses veritabanı tasarlanmış ve sentezlenecek sesin çocuk sesi olması hedeflenmiştir. Ses sentezleme çalışmalarında doğallığa en yakın sesin, eklemeli (concatenative) ses sentezleme yöntemleri ile sağlandığı görülmüştür. Bu tez kapsamında ses verisi olarak ikili heceyi kullanan ve eklemeli sentezleme yöntemine dayanan bir metin seslendirme sistemi gerçeklenmiştir. Metinden konuşma sinyali oluşturma genel olarak iki ana bölümden oluşmaktadır. Birinci bölümde sentezlenecek metin, dil kurallarına uygun olarak normalize edilmekte ve hecelerine ayrılmaktadır. Tasarlanan sistem için bir heceleme algoritması geliştirilmiş ve girilen metnin hecelerine ayrılması sağlanmıştır. İkinci bölümde ise ses hece sinyalleri işlenerek bir araya getirilmekte ve konuşma sentezleme işlemi gerçekleştirilmektedir. Ses sinyallerinin işlenmesinde farklı teknikler bulunmakla beraber bu tez çalışmasında SOLA (Synchronous Overlap and Add) yöntemi temel alınarak ses sinyalleri uzatılmakta ve kısaltılmaktadır. Sistem, girişte aldığı metin bilgisinden heceleri oluşturur. Üçlü heceleri ikili hecelerden üretilecek şekle getirir. Daha sonra bu hecelere ait ses dosyalarını kullanarak ikili veya tekli heceleri kayıtlı oldukları dosyalardan alır ve belirli algoritmalar dahilinde birleştirir. Bu aşamada hecelerin birleştiği yerlerde seslerin türlerine göre belirlenen kurallar uygulanır ve gerçek ses dosyalarındaki doğallık elde edilmeye çalışılır. Bu doğallık gerekli yerlerde hecelerin başında ya da sonunda uzatma ve kısaltma yapılarak sağlanmaya çalışılmıştır. Sistem basit teknikler kullanıyor olmasına rağmen, seçilen eklemeli yöntem Türkçe'nin yapısına çok uygun olduğu için verimli sonuçlar üretmektedir.
dc.description.abstract	Text-to-speech (TTS) synthesis is, in short, the conversion of written text into audio signals by electronic means. The written text may be a document, an electronic book, or a web page. An ideal TTS system is expected to render any text a human can read in a natural-sounding human voice. In Turkey, text-to-speech studies have mostly focused on producing adult male and female voices. In this thesis, an audio database of children's voices was designed, with the aim of synthesizing a child's voice. Speech synthesis studies have shown that concatenative synthesis methods produce the most natural-sounding speech. Within the scope of this thesis, a TTS system was implemented that is based on concatenative synthesis and uses double syllables as its speech units. Converting text to a speech signal generally consists of two main stages. In the first stage, the text to be synthesized is normalized according to the rules of the language and divided into syllables. A syllabification algorithm was developed for the designed system to split the input text into syllables. In the second stage, the syllable audio signals are processed and joined to synthesize speech. Although various techniques exist for processing audio signals, in this thesis the signals are lengthened and shortened using a method based on SOLA (Synchronous Overlap and Add). The system generates syllables from the input text and rearranges triple syllables so that they can be produced from double syllables. It then retrieves the double or single syllables from the audio files in which they are recorded and merges them according to specific algorithms. At this stage, rules determined by the types of the sounds are applied at the points where syllables join, in an attempt to achieve the naturalness of real sound files. Where necessary, this naturalness is sought by lengthening or shortening the beginning or end of syllables. Although the system uses simple techniques, the selected concatenative method suits the structure of Turkish well and therefore produces effective results.	en_US
dc.language	English
dc.language.iso	en
dc.rights	info:eu-repo/semantics/openAccess
dc.rights	Attribution 4.0 United States	tr_TR
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/
dc.subject	Bilgisayar Mühendisliği Bilimleri-Bilgisayar ve Kontrol	tr_TR
dc.subject	Computer Engineering and Computer Science and Control	en_US
dc.subject	Elektrik ve Elektronik Mühendisliği	tr_TR
dc.subject	Electrical and Electronics Engineering	en_US
dc.title	Turkish text to speech using children's voices syllables
dc.title.alternative	Çocuk ses heceleri kullanarak türkçe metinden konuşma seslendirme
dc.type	masterThesis
dc.date.updated	2019-10-21
dc.contributor.department	Elektrik-Elektronik Mühendisliği Anabilim Dalı
dc.subject.ytm	Computer softwares
dc.subject.ytm	Speech synthesis
dc.identifier.yokid	10286343
dc.publisher.institute	Fen Bilimleri Enstitüsü
dc.publisher.university	ÇUKUROVA ÜNİVERSİTESİ
dc.identifier.thesisid	570372
dc.description.pages	146
dc.publisher.discipline	Diğer
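
The first stage described in the abstract, splitting input text into syllables, can be illustrated with a minimal Python sketch. This is not the thesis's algorithm, whose details are not given in this record. It assumes the common rule of thumb for Turkish that every syllable contains exactly one vowel and that a single consonant directly before a vowel starts that vowel's syllable. The names `VOWELS` and `syllabify` are hypothetical, and the sketch handles one word at a time without text normalization.

```python
from typing import List

# Turkish vowels in both cases. Listing uppercase forms explicitly avoids
# str.lower(), which maps "İ" to two code points and would shift indices.
VOWELS = set("aeıioöuüAEIİOÖUÜ")


def syllabify(word: str) -> List[str]:
    """Split a single Turkish word into syllables (illustrative sketch).

    Assumed rule: each syllable has exactly one vowel, and a consonant
    immediately before a vowel belongs to that vowel's syllable.
    """
    vowel_positions = [i for i, ch in enumerate(word) if ch in VOWELS]
    if len(vowel_positions) < 2:
        # Zero or one vowel: the whole word is a single unit.
        return [word]

    cuts = [0]
    for v in vowel_positions[1:]:
        # Start the next syllable at the consonant before the vowel,
        # or at the vowel itself when two vowels are adjacent.
        cuts.append(v - 1 if word[v - 1] not in VOWELS else v)
    cuts.append(len(word))

    return [word[a:b] for a, b in zip(cuts, cuts[1:])]


if __name__ == "__main__":
    for w in ["kitap", "türkçe", "saat", "sentezleme"]:
        print(w, "->", "-".join(syllabify(w)))
```

Under these assumptions, `syllabify("çocuk")` returns `['ço', 'cuk']`. A full system would also need the normalization step mentioned in the abstract (numbers, abbreviations, punctuation) before syllabification.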

Files in this item

Name: yokAcikBilim_10286343.pdf
Size: 3.769Mb
Format: PDF
Description: File_10286343

This item appears in the following Collection(s)

TEZLER

Except where otherwise noted, this item's license is described as info:eu-repo/semantics/openAccess