Approximate optimal control model for visual search tasks

Acharya, Aditya (2019). Approximate optimal control model for visual search tasks. University of Birmingham. Ph.D.

Acharya2019PhD.pdf
Text - Accepted Version
Available under License All rights reserved.

Abstract

Visual search is a cognitive process that uses eye movements to bring the relatively high-acuity fovea to bear on areas of interest, aiding navigation of and interaction with the environment. This thesis explores a novel hypothesis: that human visual search behaviour emerges as an adaptation to the underlying human information processing constraints, task utility and ecology. A new computational model, the Computationally Rational Visual Search (CRVS) model, is also presented; it provides a mathematical formulation of the hypothesis. Through the model, we ask what mechanism and strategy a rational agent would use to move its gaze, and when it should stop searching.

The CRVS model formulates the hypothesis as a Partially Observable Markov Decision Process (POMDP). The POMDP provides a mathematical framework for modelling visual search as an optimal adaptation to both top-down and bottom-up mechanisms. Specifically, the agent can only partially observe the environment because of the bounds imposed by the human visual system. The agent learns to make decisions based on the partial information it has obtained and a feedback signal. The POMDP formulation is very general and can be applied to a range of problems. However, finding an optimal solution to a POMDP is computationally expensive. In this thesis, we use machine learning to find an approximately optimal solution: specifically, a deep reinforcement learning algorithm (Asynchronous Advantage Actor-Critic).
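
As an illustration of the POMDP framing only, the following minimal Python sketch sets up a toy one-dimensional search task. It is not the CRVS model: the class name `ToySearchPOMDP`, the linear eccentricity-noise model, the parameter values and the threshold stopping rule are all assumptions made for this example, and the hand-written policy stands in for the policy that the thesis learns with Asynchronous Advantage Actor-Critic.

```python
import math
import random


class ToySearchPOMDP:
    """Hypothetical 1-D search task; an illustration, not the thesis's CRVS model."""

    def __init__(self, n_locations=7, base_noise=0.3, noise_slope=0.5, seed=0):
        self.n = n_locations
        self.base_noise = base_noise    # assumed observation noise at the fovea
        self.noise_slope = noise_slope  # assumed noise growth per unit of eccentricity
        self.rng = random.Random(seed)
        self.target = self.rng.randrange(self.n)  # hidden state: target location
        self.belief = [1.0 / self.n] * self.n     # uniform prior over locations

    def noise_sd(self, fixation, location):
        # Acuity falls off with distance from the fixated location.
        return self.base_noise + self.noise_slope * abs(fixation - location)

    def observe(self, fixation):
        # Partial observation: one noisy reading per location,
        # with mean 1 at the target and mean 0 elsewhere.
        return [
            self.rng.gauss(1.0 if loc == self.target else 0.0,
                           self.noise_sd(fixation, loc))
            for loc in range(self.n)
        ]

    def update_belief(self, fixation, obs):
        # Bayes rule over the hypotheses "the target is at location h".
        # The Gaussian normalising constants are identical across hypotheses,
        # so only the squared-error terms are needed.
        log_post = []
        for h in range(self.n):
            lp = math.log(max(self.belief[h], 1e-300))
            for loc, y in enumerate(obs):
                mean = 1.0 if loc == h else 0.0
                lp -= 0.5 * ((y - mean) / self.noise_sd(fixation, loc)) ** 2
            log_post.append(lp)
        peak = max(log_post)
        weights = [math.exp(lp - peak) for lp in log_post]
        total = sum(weights)
        self.belief = [w / total for w in weights]


def run_episode(env, threshold=0.95, max_fixations=50):
    """Hand-written stand-in policy: fixate the most probable location and
    stop once the belief in it reaches the (assumed) threshold."""
    fixation = env.n // 2
    best = fixation
    for step in range(1, max_fixations + 1):
        obs = env.observe(fixation)
        env.update_belief(fixation, obs)
        best = max(range(env.n), key=lambda i: env.belief[i])
        if env.belief[best] >= threshold:  # "when to stop"
            return best, step
        fixation = best                    # "where to fixate next"
    return best, max_fixations


if __name__ == "__main__":
    env = ToySearchPOMDP(seed=1)
    guess, fixations = run_episode(env)
    print(f"target={env.target} guess={guess} fixations={fixations}")
```

The sketch separates the three ingredients named above: a hidden state (the target location), observations degraded by the limits of the visual system, and a decision rule that acts on the resulting belief. In a reinforcement learning setting the threshold rule would be replaced by a learned policy, and a feedback signal (for example, a reward for correct responses and a cost per fixation) would drive the learning.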

The thesis addresses the questions of where to fixate next and when to stop searching using three different visual search tasks. In Chapter 4, we investigate computationally rational strategies for when to stop searching, using a real-world search task involving images on a web page. In Chapter 5, we investigate computationally rational strategies for where to look next when guided by low-level feature cues such as colour, shape and size. Finally, in Chapter 6, we combine the approximately optimal strategies learned in the previous chapters for a conjunctive visual search task (the Distractor-Ratio task), in which the model must answer both the when-to-stop and the where-to-search questions.

The results show that visual search strategies can be explained as an approximately optimal adaptation to information processing constraints and to the utility and ecology of the task.

Type of Work:

Thesis (Doctorates > Ph.D.)

Award Type:

Doctorates > Ph.D.

Supervisor(s):