Computer models of dynamic visual attention
Author(s)
Bur, Alexandre
Editor(s)
Hügli, Heinz
Publication date
2009
Keywords
Abstract
To perceive the environment efficiently, the human visual system selects salient targets. These targets are explored successively by means of saccadic eye movements, which shift the fovea onto the currently fixated target. This mechanism, known as selective attention, can be seen as a preprocessing step that reduces the amount of information the brain must process later. The modeling of visual attention (VA) is an active topic in computer vision. While most investigations concentrate on static computer models, i.e. computer systems that select salient targets from a still image, only a few recent works deal with dynamic computer models, i.e. computer systems that select salient targets from video streams. Such a paradigm is an attractive way to reduce complexity in computer vision applications, and extending it to video sequences opens promising perspectives. Given the importance of video sequences today, the application potential is considerable and covers domains such as video compression, video quality assessment, mobile robot navigation, monitoring, and surveillance. The purpose of such a model is to select potential regions of interest automatically throughout the duration of the sequence. The selection process relies on motion contrasts as well as static feature contrasts. It encompasses the extraction of features from the video sequence and their integration in a competitive way to define the resulting saliency map. This scalar map indicates salient locations in the form of a saliency distribution. Finally, the most salient regions of interest are selected from the saliency map by a process based on a neural network. This thesis investigates the design of dynamic computer models of visual attention along three main axes: (i) the static model, (ii) the motion model, and (iii) the map integration scheme that fuses the static and motion channels.
First, the static model relies extensively on previous works that have reported impressive findings on biological and artificial visual attention. The proposed static model shares similar concepts, with some improvements to the feature integration strategies. Second, the design of the motion model is discussed. Research in neuroscience provides plausible hypotheses about motion analysis in the human brain, and these mechanisms form the core of the computer model. We present several computer models that highlight motion contrasts of different natures. Two novel approaches are proposed: the vector model, which highlights relative motion contrast, and the phase & magnitude model, which decouples phase and magnitude contrasts. Third, the integration of the static and motion models is discussed. Several motion integration strategies are presented, including the novel motion priority scheme as an alternative to the classical competitive scheme. Finally, psychophysical experiments are used to evaluate the performance of the proposed models. The experimental setup consists of showing a set of video sequences to a population of human subjects while an eye tracker records their eye movement patterns. The experimental data are then compared with the predictions of the computer models for qualitative and quantitative evaluation. The set of video sequences covers various categories (synthetic and real scenes, fixed and moving backgrounds) and shows the advantages and drawbacks of each model.
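As a rough illustration of the pipeline outlined in the abstract (fusing a static and a motion map into a saliency map, then selecting the most salient regions), the following minimal Python sketch may help. It does not reproduce the thesis's models: the weighted-sum fusion, the parameter `w_motion`, the function names, and the toy maps are all assumptions made for illustration, and the neural-network-based selection is replaced here by a simple iterative maximum search with suppression of already selected neighbourhoods.

```python
import numpy as np

def normalize(m):
    """Rescale a map to [0, 1]; a constant map becomes all zeros."""
    m = np.asarray(m, dtype=float)
    span = m.max() - m.min()
    if span == 0:
        return np.zeros_like(m)
    return (m - m.min()) / span

def fuse(static_map, motion_map, w_motion=0.5):
    """Hypothetical fusion: weighted sum of the two normalized maps."""
    return (1.0 - w_motion) * normalize(static_map) + w_motion * normalize(motion_map)

def select_regions(saliency, k=2, radius=1):
    """Pick k maxima in turn, suppressing a square neighbourhood after each pick.

    This is a stand-in for the neural-network selection mentioned in the
    abstract, not the thesis's actual mechanism.
    """
    s = np.array(saliency, dtype=float)  # work on a copy
    picks = []
    for _ in range(k):
        r, c = np.unravel_index(np.argmax(s), s.shape)
        picks.append((int(r), int(c)))
        r0, r1 = max(r - radius, 0), r + radius + 1
        c0, c1 = max(c - radius, 0), c + radius + 1
        s[r0:r1, c0:c1] = -np.inf  # exclude this neighbourhood from later picks
    return picks

# Toy input: one static peak and one stronger motion peak (made-up data).
static_map = np.zeros((5, 5))
static_map[1, 1] = 1.0
motion_map = np.zeros((5, 5))
motion_map[3, 3] = 2.0

saliency = fuse(static_map, motion_map, w_motion=0.6)
picks = select_regions(saliency, k=2, radius=1)
print(picks)
```

With `w_motion=0.6` the motion peak is selected first and the static peak second; lowering `w_motion` below 0.5 reverses that order, which gives a feel for how the choice of integration scheme shapes the predicted sequence of regions of interest.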
Notes
Doctoral thesis: Université de Neuchâtel, 2009; Th. 2080
Identifiers
Publication type
doctoral thesis