My research interests include machine vision & learning. More specifically, I am interested in learning with ambiguity (multiple instance learning), learning distance metrics, part-based object detection & recognition, local feature description, and online/adaptive learning for tracking.
I recently gave a Google Tech Talk summarizing the last three years of my work. It is a good overview of many of the projects below. The slides are available here.
Tracking with Online Multiple Instance Learning (MILTRACK)
In this project, we address the problem of learning an adaptive appearance model for object tracking. In particular, a class of tracking techniques called "tracking by detection" has been shown to give promising results at real-time speeds. These methods train a discriminative classifier in an online manner to separate the object from the background. This classifier bootstraps itself by using the current tracker state to extract positive and negative examples from the current frame. Slight inaccuracies in the tracker can therefore lead to incorrectly labeled training examples, which degrade the classifier and can cause further drift. We show that using Multiple Instance Learning (MIL) instead of traditional supervised learning avoids these problems, and can therefore lead to a more robust tracker with fewer parameter tweaks. We present a novel online MIL algorithm for object tracking that achieves superior results with real-time performance.
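To make the labeling idea concrete, here is a minimal Python sketch of how a MIL tracker might group candidate image locations into a bag and score that bag. It is an illustration under stated assumptions, not the project's implementation: the function names (`crop_positions`, `bag_probability`, `bag_log_likelihood`) are hypothetical, and the noisy-OR bag model is one common choice in the MIL literature that the text above does not itself specify.

```python
import math


def crop_positions(center, radius, step=1):
    """Return integer (x, y) positions within `radius` pixels of `center`.

    Hypothetical helper: all patches cropped at these positions would go
    into ONE positive bag, so the tracker need not be pixel-accurate.
    """
    cx, cy = center
    positions = []
    for dx in range(-radius, radius + 1, step):
        for dy in range(-radius, radius + 1, step):
            if dx * dx + dy * dy <= radius * radius:
                positions.append((cx + dx, cy + dy))
    return positions


def bag_probability(instance_probs):
    """Noisy-OR (assumed model): a bag is positive if at least one
    of its instances is positive."""
    prob_all_negative = 1.0
    for p in instance_probs:
        prob_all_negative *= 1.0 - p
    return 1.0 - prob_all_negative


def bag_log_likelihood(bags):
    """Log-likelihood over bags, where each bag is (label, instance_probs)
    and label is 1 (contains the object) or 0 (background only)."""
    eps = 1e-12  # keeps log() finite when a probability hits 0 or 1
    total = 0.0
    for label, probs in bags:
        p = min(max(bag_probability(probs), eps), 1.0 - eps)
        total += label * math.log(p) + (1 - label) * math.log(1.0 - p)
    return total
```

Because the label is attached to the bag rather than to each patch, a positive bag can still score highly when only one patch inside it is well aligned with the object. This is how the sketch tolerates small tracker errors.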
This project focuses on visual learning with ambiguity. In particular, we have applied and extended the Multiple Instance Learning (MIL) paradigm to challenging computer vision problems. We propose two novel learning frameworks: Multiple Component Learning (MCL) for part-based object detection, and Multiple Pose Learning (MPL) for simultaneously clustering data and training discriminative classifiers for each cluster. The latter is a type of alignment that is complementary to Multiple Instance Learning.
In this project, we trained a distance function for local region matching. We show that when the application is relatively constrained, a supervised learning approach produces better results than a generic system that was tuned by hand.