Word-based compression in full-text retrieval systems

Selçuk, Ali Aydin

Date

1995

Author

Selçuk, Ali Aydin

Abstract

ÖZET: WORD-BASED COMPRESSION IN FULL-TEXT RETRIEVAL SYSTEMS. Ali Aydın Selçuk, Department of Industrial Engineering, M.S. Thesis supervisor: Prof. Dr. M. Akif Eyler. May 1995. The large space requirements of full-text retrieval systems can be reduced substantially by data compression. This study examines the problem of compressing the text database of a full-text retrieval system and compares the performance of different coding techniques for compressing the main text. The experiments show that, among the methods implemented, the best compression is achieved by statistical techniques such as Huffman coding and arithmetic coding. Key words: Full-text retrieval, Data compression, Text compression, Word-based modeling

ABSTRACT: WORD-BASED COMPRESSION IN FULL-TEXT RETRIEVAL SYSTEMS. Ali Aydın Selçuk, M.S. in Industrial Engineering. Supervisor: Prof. M. Akif Eyler. May 1995. The large space requirement of a full-text retrieval system can be reduced significantly by data compression. This study addresses the problem of compressing the main text of a full-text retrieval system and compares the performance of several coding techniques for compressing the text database. Experiments show that statistical techniques, such as arithmetic coding and Huffman coding, give the best compression among the implemented methods, and that with a semi-static word-based model the space needed to store English text is less than one third of the original requirement. Key words: Full-text retrieval, Data compression, Text compression, Word-based model
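
The abstract names a semi-static word-based model combined with statistical coding but gives no implementation details. As a rough illustration only, the Python sketch below shows one common way such a scheme can be set up: a first pass counts word and separator tokens, a Huffman code is built from those counts, and a second pass sums the code lengths. The function names (`tokenize`, `huffman_code_lengths`, `estimate_compressed_bits`) and the tokenization rule are assumptions made for this example, not the thesis's method, and the estimate ignores the space needed to store the vocabulary itself.

```python
import heapq
import re
from collections import Counter


def tokenize(text):
    # Assumed tokenization: alternating runs of word and non-word characters,
    # so joining the tokens reproduces the original text exactly.
    return re.findall(r"\w+|\W+", text)


def huffman_code_lengths(freqs):
    # Return {symbol: code length in bits} for a Huffman code over freqs.
    if len(freqs) == 1:
        # A single symbol still needs one bit per occurrence.
        return {sym: 1 for sym in freqs}
    # Heap entries: (weight, unique tie-breaker, symbols in this subtree).
    heap = [(w, i, [s]) for i, (s, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freqs, 0)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, syms1 = heapq.heappop(heap)
        w2, _, syms2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        for s in syms1 + syms2:
            lengths[s] += 1
        heapq.heappush(heap, (w1 + w2, next_id, syms1 + syms2))
        next_id += 1
    return lengths


def estimate_compressed_bits(text):
    tokens = tokenize(text)
    freqs = Counter(tokens)                 # pass 1: gather token statistics
    lengths = huffman_code_lengths(freqs)   # fixed code built before encoding
    return sum(lengths[t] for t in tokens)  # pass 2: cost of coding each token
```

Calling `estimate_compressed_bits(text)` and comparing the result with `8 * len(text.encode("utf-8"))` gives a quick feel for the ratio on a sample text. The model is "semi-static" in the sense that the statistics are collected once in advance and the code does not change during encoding.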

URI

https://acikbilim.yok.gov.tr/handle/20.500.12812/37322

Collections

TEZLER

Except where otherwise noted, this item's license is described as info:eu-repo/semantics/openAccess