This project focuses on visual learning with ambiguity. In particular, we have applied and extended the Multiple Instance Learning (MIL) paradigm to challenging computer vision problems. We propose two novel learning frameworks: Multiple Component Learning (MCL) for part-based object detection, and Multiple Pose Learning (MPL) for simultaneously clustering data and training discriminative classifiers for each cluster. The latter is a type of alignment that is complementary to Multiple Instance Learning.

Multiple Component Learning

Object detection is one of the key problems in computer vision. In the last decade, discriminative learning approaches have proven effective in detecting rigid objects, achieving very low false positive rates. The field has also seen a resurgence of part-based recognition methods, with impressive results on highly articulated, diverse object categories. In this paper we propose a discriminative learning approach for detection that is inspired by part-based recognition approaches. Our method, Multiple Component Learning (MCL), automatically learns individual component classifiers and combines these into an overall classifier. Unlike previous methods, which rely on either fairly restricted part models or labeled part data, MCL learns powerful component classifiers in a weakly supervised manner, where object labels are provided but part labels are not. The basis of MCL lies in learning a set classifier; we achieve this by combining boosting with weakly supervised learning, specifically the Multiple Instance Learning (MIL) framework. MCL is general, and we demonstrate results on a range of data from computer audition and computer vision. In particular, MCL outperforms all existing methods on the challenging INRIA pedestrian detection dataset, and unlike methods that are not part-based, MCL is quite robust to occlusions.
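To make the weak supervision concrete, here is a minimal sketch of the noisy-OR bag model that is commonly used when boosting is combined with MIL. It is an illustration only, not necessarily the exact formulation used in MCL: the function names, the sigmoid link, and the toy scores are all assumptions. Each training image is treated as a "bag" of candidate patches; only the bag carries a label, and the bag is considered positive if at least one patch is.

```python
import math

def sigmoid(score):
    """Numerically stable logistic function."""
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)

def bag_probability(instance_scores):
    """Noisy-OR: probability that at least one instance in the bag is positive."""
    prob_all_negative = 1.0
    for score in instance_scores:
        prob_all_negative *= 1.0 - sigmoid(score)
    return 1.0 - prob_all_negative

def bag_log_likelihood(bags, labels):
    """Log-likelihood of bag labels (1 = object present, 0 = absent).

    Only bag-level labels are needed; instance (part) labels are never used.
    """
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for scores, label in zip(bags, labels):
        p = bag_probability(scores)
        total += label * math.log(p + eps) + (1 - label) * math.log(1.0 - p + eps)
    return total

# Hypothetical classifier scores for the patches of two images.
bags = [[6.0, -5.0, -4.0],   # one patch fires strongly
        [-5.0, -5.0, -6.0]]  # no patch fires
print(bag_probability(bags[0]), bag_probability(bags[1]))
```

A boosting procedure built on this model would add weak classifiers that increase `bag_log_likelihood`, so a single confident patch is enough to explain a positive image.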


Figure 1: Response of the first 5 learned component classifiers on randomly selected INRIA pedestrian test images. At most one box is displayed per component after non-maximal suppression and thresholding. Three components correspond to semantically meaningful parts (head: magenta, left foot: red, right foot: yellow); two correspond to the region between the legs. The components were learned with no component labels provided during training.
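The display rule in the caption (at most one box per component, after suppression and thresholding) can be approximated as follows. This is a simplified sketch, not the procedure used to produce the figure: it keeps only the single highest-scoring box per component, which is the limiting case of non-maximal suppression, and the tuple layout, threshold, and scores are hypothetical.

```python
def top_box_per_component(detections, threshold):
    """Keep the best-scoring box for each component, if it passes the threshold.

    detections: iterable of (component_id, score, box) tuples,
                where box is (x, y, width, height).
    Returns a dict mapping component_id -> (score, box).
    """
    best = {}
    for component_id, score, box in detections:
        if score < threshold:
            continue  # thresholding: weak responses are never displayed
        if component_id not in best or score > best[component_id][0]:
            best[component_id] = (score, box)
    return best

# Hypothetical responses of three component classifiers on one image.
detections = [
    (0, 0.9, (10, 10, 20, 20)),
    (0, 0.7, (12, 11, 20, 20)),  # suppressed by the stronger box above
    (1, 0.2, (40, 80, 20, 20)),  # below threshold, component not shown
    (2, 0.6, (30, 90, 20, 20)),
]
shown = top_box_per_component(detections, threshold=0.5)
print(sorted(shown))
```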

Multiple Pose Learning

In object recognition in general and in face detection in particular, data alignment is necessary to achieve good classification results with certain statistical learning approaches such as Viola-Jones. Data can be aligned in one of two ways: (1) by separating the data into coherent groups and training separate classifiers for each; (2) by adjusting training samples so they lie in correspondence. If done manually, both procedures are labor intensive and can significantly add to the cost of labeling. In this paper we present a unified boosting framework for simultaneous learning and alignment. We present a novel boosting algorithm for Multiple Pose Learning (MPL), where the goal is to simultaneously split data into groups and train classifiers for each. We also review Multiple Instance Learning (MIL), and in particular MIL-Boost, and describe how to use it to simultaneously train a classifier and bring data into correspondence. We show results on variations of LFW and MNIST, demonstrating the potential of these approaches.


Figure 2: We present two strategies for simultaneous learning and alignment. Data can be aligned by: (1) separating the data into coherent groups and training separate classifiers for each (MPL); (2) adjusting training samples so they lie in correspondence (MIL).
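The two strategies can be contrasted at prediction time with a small sketch. It assumes a noisy-OR combination in both cases, which is one common way to express "at least one" and may differ from the exact models in the paper; the function names and the toy 1-D classifiers are hypothetical. In the MPL-style case, one sample is scored by several classifiers and the best one defines its group. In the MIL-style case, one classifier scores several transformed copies of a sample and the best copy defines its alignment.

```python
import math

def sigmoid(score):
    """Numerically stable logistic function."""
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)

def noisy_or(probabilities):
    """Probability that at least one of several independent events occurs."""
    prob_none = 1.0
    for p in probabilities:
        prob_none *= 1.0 - p
    return 1.0 - prob_none

def argmax(values):
    return max(range(len(values)), key=values.__getitem__)

def mpl_predict(sample, classifiers):
    """Grouping: several classifiers, one sample. Returns (probability, group index)."""
    probs = [sigmoid(h(sample)) for h in classifiers]
    return noisy_or(probs), argmax(probs)

def mil_predict(bag, classifier):
    """Alignment: one classifier, a bag of transformed copies of a sample.
    Returns (probability, index of the best-aligned copy)."""
    probs = [sigmoid(classifier(x)) for x in bag]
    return noisy_or(probs), argmax(probs)

# Toy 1-D "pose" classifiers (hypothetical): each fires near its own center.
h_left = lambda x: 5.0 - 10.0 * abs(x + 2.0)
h_right = lambda x: 5.0 - 10.0 * abs(x - 2.0)

p_mpl, group = mpl_predict(2.1, [h_left, h_right])
p_mil, aligned = mil_predict([1.0, 2.0, 3.0], h_right)  # shifted copies of one sample
print(group, aligned)
```

During training, the same "at least one" structure lets the grouping or the alignment be inferred jointly with the classifiers instead of being labeled by hand.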

Related Publications

Multiple Component Learning for Object Detection

Piotr Dollár, Boris Babenko, Serge Belongie, Pietro Perona, Zhuowen Tu

ECCV 2008, Marseille, France.


Simultaneous Learning and Alignment: Multi-Instance and Multi-Pose Learning

Boris Babenko, Piotr Dollár, Zhuowen Tu, Serge Belongie

ECCV 2008: Faces in Real-Life Images, Marseille, France.

* Earlier version appeared as Technical Report CS2008, UCSD (2008)


Multiple Instance Learning with Query Bags

Boris Babenko, Piotr Dollár, Serge Belongie

UCSD CSE Tech Report CS2009-0949, September 2009


Multiple Instance Learning: Algorithms and Applications

Boris Babenko

Research Exam, 2008, UCSD Computer Science and Engineering Department



Code

Coming soon...



Copyright Boris Babenko 2008