AI Seminar

Learning and Searching Methods for Robust, Real-Time Visual Odometry

Andrew Richardson

Accurate position estimation provides a critical foundation for mobile robot perception and control. Although the problem is well studied, it remains difficult to provide timely, precise, and robust position estimates for applications that operate in uncontrolled environments, such as robotic exploration and automated driving. Continuous, high-rate egomotion estimation is possible using cameras and visual odometry, which tracks the movement of sparse scene content known as image keypoints or features. However, high update rates, often 30 Hz or greater, leave little computation time per cycle, while variability in scene content stresses robustness. Together, these constraints make an accurate and robust visual odometry system difficult to implement.

This thesis investigates fundamental improvements throughout all stages of a visual odometry system and makes three primary contributions. The first contribution is a machine learning method for feature detector design. This method considers end-to-end motion estimation accuracy during learning. Consequently, accuracy and robustness are improved across multiple challenging datasets in comparison to state-of-the-art alternatives. The second contribution is a proposed feature descriptor, TailoredBRIEF, that builds upon recent advances in fast, low-memory descriptor extraction and matching. This thesis proposes an in-situ descriptor learning method that improves feature matching accuracy by efficiently customizing descriptor structures on a per-feature basis. Further, a common asymmetry in vision system design between reference and query images is described and exploited, enabling approaches that would otherwise exceed runtime constraints. The final contribution is a new algorithm for visual motion estimation: Perspective Alignment Search (PAS). Many vision systems depend on the unique appearance of features during matching, despite an abundance of non-unique features in otherwise barren environments. PAS is a search-based method that uses features lacking unique appearance through descriptorless matching. This method simplifies visual odometry pipelines by defining a single method that subsumes feature matching, outlier rejection, and motion estimation.

Throughout this work, evaluations of the proposed methods and systems are carried out on ground-truthed datasets, often generated with custom experimental platforms in challenging environments. Particular focus is placed on preserving runtimes compatible with real-time operation, as is necessary for deployment in the field.

Sponsored by

Professor Edwin Olson