Fovea based coding for video streaming

Dikici, Çağatay

File_169681 (4.192Mb)

Date

2004

Author

Dikici, Çağatay

Abstract

ÖZET (Turkish abstract). ATTENTION-BASED VIDEO CODING AND TRANSMISSION. Attentive robots, inspired by human-like vision, must have a fovea-periphery distinction and the ability to focus through rapid eye movements (saccades). As a result, each incoming frame in the image sequence is nonuniformly sampled, and consecutive frames contain temporal redundancy. In this thesis, we propose a new video coding and streaming algorithm for low-bandwidth networks that exploits these two characteristics. Our experimental results demonstrate the improvement in video streaming applications such as remote robot access. In addition, an attention-based framework is presented that uses attention criteria to focus on the most interesting region of the scene. This attention function can vary depending on the set of visual primitives used. In this work, we show the usability of Cartesian and non-Cartesian filters for human face detection. Because our algorithm uses the Gaussian-like resolution of the human visual system and integrates very easily with standard coding methods, it can also be used in applications such as video broadcasting from mobile phones.

ABSTRACT. FOVEA BASED CODING FOR VIDEO STREAMING. Attentive robots, inspired by human-like vision, are required to have visual systems with fovea-periphery distinction and saccadic motion capability. Thus, each frame in the incoming image sequence has nonuniform sampling, and consecutive saccadic images have temporal redundancy. In this thesis, we propose a novel video coding and streaming algorithm for low-bandwidth networks that exploits these two features simultaneously. Our experimental results reveal improved video streaming in applications like robotic teleoperation. Furthermore, we present a complete framework for foveating to the most interesting region of the scene using attention criteria. The construction of the attention function can vary depending on the set of visual primitives used. In our case, we show the feasibility of using Cartesian and non-Cartesian filters for the case of human-face videos. Since the algorithm is predicated on the Gaussian-like resolution of the human visual system and is extremely simple to integrate with standard coding schemes, it can also be used in applications such as cellular phones with video.

URI

https://acikbilim.yok.gov.tr/handle/20.500.12812/77425

Collections

TEZLER

Except where otherwise noted, this item's license is described as info:eu-repo/semantics/embargoedAccess