Perception, Memory, and the Field of View Problem

Abstract

Robust control of a vision-based agent requires tight coupling between sensing and action. For mobile robots performing visually guided navigation, this means closed-loop control of motion with respect to sensed features, landmarks, or other relevant parts of the visible environment. Real vision sensors have limited fields of view. This makes true closed-loop control with respect to an arbitrary set of landmarks impossible with practical vision systems, since only a fraction of the environment can be seen at any one time. My dissertation describes a solution to the field of view problem for vision-based agents lacking omnidirectional sensors. I propose a unified object memory system which integrates short-term working memory of the local visual space with immediate object perceptions from the real-time image stream. Short-term memory includes a model of agent and object dynamics and a recursive position estimator. The significance of unifying short-term memory and direct perception is twofold. First, since the positions and properties of all relevant objects are either directly sensed or estimated from recent experience, navigation control laws can be written as closed-loop controllers operating without regard to the agent’s current field of view. Second, given an estimator such as the Kalman filter, which includes an estimate of state uncertainty, an independent investigatory action scheduler can dynamically optimize shifts of camera field of view.

In existing closed-loop visual control (or visual servoing) systems, the agent’s plan contains explicit instructions for control of all actuators. The camera platform’s degrees of freedom are used to maintain visual lock on a target, and locomotor degrees of freedom are used to cause the robot to follow a particular path or perform an action based on its sensory input. Envisioned as a closed-loop control system, this arrangement has a real-time stream of pixels as its input, a vector of motor controls as its output, and a set of vision and control algorithms (the plan), plus state and numerous transformation algorithms, in between. If camera field-of-view changes are required, they must be explicitly programmed as part of the plan. This can be awkward or impossible when multiple goals must be pursued simultaneously.
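
The abstract gives no implementation details, so the following is only a minimal sketch of how an object memory with a recursive estimator and an uncertainty-driven scheduler might fit together. It assumes a deliberately simplified model in which each object is a single scalar bearing tracked by a one-dimensional Kalman filter. All names (`TrackedObject`, `ObjectMemory`, `most_uncertain`) and the noise values are hypothetical and are not taken from the dissertation.

```python
from dataclasses import dataclass


@dataclass
class TrackedObject:
    name: str
    bearing: float   # estimated bearing relative to the agent, in radians
    variance: float  # uncertainty of that estimate


class ObjectMemory:
    """Hypothetical short-term memory: one scalar Kalman filter per object."""

    def __init__(self, process_noise=0.01, measurement_noise=0.05):
        self.q = process_noise      # assumed growth in variance per time step
        self.r = measurement_noise  # assumed variance of a camera measurement
        self.objects = {}

    def add(self, name, bearing, variance=1.0):
        self.objects[name] = TrackedObject(name, bearing, variance)

    def predict(self, rotation=0.0):
        """Time update: apply the agent's rotation and grow every uncertainty."""
        for obj in self.objects.values():
            obj.bearing -= rotation
            obj.variance += self.q

    def observe(self, name, measured_bearing):
        """Measurement update for an object currently inside the field of view."""
        obj = self.objects[name]
        gain = obj.variance / (obj.variance + self.r)
        obj.bearing += gain * (measured_bearing - obj.bearing)
        obj.variance *= 1.0 - gain

    def most_uncertain(self):
        """Scheduler heuristic: the object whose estimate has degraded most."""
        return max(self.objects.values(), key=lambda o: o.variance).name


# Usage: the camera keeps seeing "door" while "cone" stays out of view.
memory = ObjectMemory()
memory.add("door", bearing=0.2)
memory.add("cone", bearing=-1.0)
for _ in range(5):
    memory.predict()
    memory.observe("door", 0.2)
next_target = memory.most_uncertain()
```

In this sketch a controller can read any object's `bearing` whether or not the object is currently visible, while the scheduler uses `variance` alone to decide where the camera should look next. Picking the largest variance is one simple heuristic, not necessarily the optimization the dissertation uses.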

Cite

Text

Gribble. "Perception, Memory, and the Field of View Problem." AAAI Conference on Artificial Intelligence, 1998.

Markdown

[Gribble. "Perception, Memory, and the Field of View Problem." AAAI Conference on Artificial Intelligence, 1998.](https://mlanthology.org/aaai/1998/gribble1998aaai-perception/)

BibTeX

@inproceedings{gribble1998aaai-perception,
  title     = {{Perception, Memory, and the Field of View Problem}},
  author    = {Gribble, William S.},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {1998},
  pages     = {1173},
  url       = {https://mlanthology.org/aaai/1998/gribble1998aaai-perception/}
}