Segmentation, feature extraction and recognition of Ottoman script

Atici, Alper

View/Open

File_35549 (2.480Mb)

Date

1994

Author

Atici, Alper

Abstract

ÖZ (Turkish abstract, translated)
OSMANLICA METİNİN BÖLÜTLENMESİ, ÖZNİTELİK ÇIKARIMI VE TANINMASI (Segmentation, Feature Extraction and Recognition of Ottoman Text)
ATICI, Alper
M.S. Thesis, Department of Computer Engineering
Supervisor: Prof. Dr. Fatoş Yarman VURAL
September, 1994

In this thesis, an optical character recognition system for the segmentation, feature extraction and recognition of Ottoman text was developed in the C++ programming language on PC-based hardware. Following segmentation and feature extraction based on geometric and topological feature analysis, a chain code transformation is applied to the main bodies of the characters. The segments produced by the segmentation stage can be processed by different methods. The chain code sequences are classified with a Hidden Markov Model. The extracted features are then processed by exact matching against the alphabet, which is represented by a trie data structure, to recognize the characters. The nodes of the trie correspond to the distinguishing features of the characters. In the experiments with the Hidden Markov Model, the correct option appeared among the top two alternatives in 99% of cases.

Keywords: optical character recognition, segmentation, feature analysis and extraction, classification, Hidden Markov Model, trie, chain code.

Science Code: 619.02.05
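As an illustration of the chain code transformation named in the abstract, the following minimal sketch encodes an ordered pixel path as 8-direction Freeman chain codes. The thesis record does not specify the coding convention or the tracing procedure, so the direction numbering (0 = east, counted counter-clockwise), the function name `chain_code` and the sample stroke are all assumptions made for this example.

```python
# 8-direction Freeman codes. Assumed convention: 0 = east, counting
# counter-clockwise, with coordinates (x, y) and y growing upward.
FREEMAN = {
    (1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
    (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7,
}


def chain_code(points):
    """Encode an ordered list of 8-connected pixel coordinates as Freeman codes."""
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        step = (x1 - x0, y1 - y0)
        if step not in FREEMAN:
            raise ValueError(
                f"points {(x0, y0)} and {(x1, y1)} are not 8-connected neighbours"
            )
        codes.append(FREEMAN[step])
    return codes


# Hypothetical stroke: two steps right, one diagonally up-right, one up.
stroke = [(0, 0), (1, 0), (2, 0), (3, 1), (3, 2)]
print(chain_code(stroke))  # [0, 0, 1, 2]
```

A sequence of this kind is the sort of observation sequence that a Hidden Markov Model classifier could consume.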

ABSTRACT
SEGMENTATION, FEATURE EXTRACTION AND RECOGNITION OF OTTOMAN SCRIPT
ATICI, Alper
M.S. in Computer Engineering
Supervisor: Prof. Dr. Fatoş Yarman VURAL
September, 1994

In this study, an OCR system for segmentation, feature extraction and recognition of Ottoman script has been developed using the C++ programming language on a PC platform. The segmentation and feature extraction stages are based on geometrical and topological feature analysis, followed by the chain code transformation of the main strokes of the characters. The output of segmentation is a set of well-defined segments that can be fed into any classification approach. The classes of the main strokes are identified through left-right Hidden Markov Models (HMM). The final decision is made by exact matching of the features with those of the alphabet, which is represented by tries. The nodes of the tries represent distinguishing features of the characters. The correct decision appears among the top 2 alternatives produced by the Hidden Markov Models in 99% of cases.

Keywords: Optical Character Recognition (OCR), segmentation, feature analysis and extraction, classification, Hidden Markov Models (HMM), trie, chain code.

Science Code: 619.02.05
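The exact-matching step over a trie can be sketched as follows. This is only an illustration under stated assumptions: the record does not list the actual features or their order, so the class name `FeatureTrie`, the main-stroke label `"boat"` and the feature tuples (main-stroke class, dot position, dot count) are hypothetical. The three letters used share one main stroke in Arabic-based script and differ only in their dots, which makes them a convenient example of distinguishing features stored along trie paths.

```python
class FeatureTrie:
    """Trie whose edges are feature values and whose end markers hold a character label."""

    _END = None  # reserved key marking a complete feature sequence

    def __init__(self):
        self.root = {}

    def insert(self, features, label):
        """Store a character label under its ordered feature sequence."""
        node = self.root
        for feature in features:
            node = node.setdefault(feature, {})
        node[self._END] = label

    def match(self, features):
        """Return the label whose features match exactly, or None if there is none."""
        node = self.root
        for feature in features:
            if feature not in node:
                return None
            node = node[feature]
        return node.get(self._END)


# Hypothetical alphabet entries: (main-stroke class, dot position, dot count).
alphabet = FeatureTrie()
alphabet.insert(("boat", "below", 1), "beh")
alphabet.insert(("boat", "above", 2), "teh")
alphabet.insert(("boat", "above", 3), "theh")

print(alphabet.match(("boat", "above", 2)))  # teh
print(alphabet.match(("boat", "above", 4)))  # None
```

Because matching is exact, a feature sequence that is only a prefix of a stored entry, or that differs in any single feature, yields no character.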

URI

https://acikbilim.yok.gov.tr/handle/20.500.12812/265457

Collections

TEZLER

Except where otherwise noted, this item's license is described as info:eu-repo/semantics/embargoedAccess