This project focuses on visual learning with ambiguity. In particular, we have applied and extended the Multiple Instance Learning (MIL) paradigm to challenging computer vision problems. We propose two novel learning frameworks: Multiple Component Learning (MCL) for part-based object detection, and Multiple Pose Learning (MPL) for simultaneously clustering data and training discriminative classifiers for each cluster. The latter is a type of alignment that is complementary to Multiple Instance Learning.
Multiple Component Learning
Object detection is one of the key problems in computer vision.
In the last decade, discriminative learning approaches have proven
effective in detecting rigid objects, achieving very low false positive
rates. The field has also seen a resurgence of part-based recognition
methods, with impressive results on highly articulated, diverse object
categories. In this paper we propose a discriminative learning approach
for detection that is inspired by part-based recognition approaches. Our
method, Multiple Component Learning (MCL), automatically learns individual
component classifiers and combines these into an overall classifier.
Unlike previous methods, which rely on either fairly restricted part models
or labeled part data, MCL learns powerful component classifiers in a
weakly supervised manner, where object labels are provided but part labels
are not. The basis of MCL lies in learning a set classifier; we achieve
this by combining boosting with weakly supervised learning, specifically
the Multiple Instance Learning framework (MIL). MCL is general, and
we demonstrate results on a range of data from computer audition and
computer vision. In particular, MCL outperforms all existing methods on
the challenging INRIA pedestrian detection dataset, and unlike methods
that are not part-based, MCL is quite robust to occlusions.
Figure 1: Response of the first five learned component classifiers on randomly selected INRIA
pedestrian test images. At most one box is displayed per component
after non-maximal suppression and thresholding. Three components correspond
to semantically meaningful parts (head-magenta, left foot-red, right foot-yellow); two correspond
to the region between the legs. The components were learned with no
component labels provided during training.
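To make the "set classifier" idea concrete, here is a minimal sketch of how a MIL-style model can score a bag of instances, for example candidate patches from one image. It assumes a noisy-OR combination of per-instance probabilities, a common choice in boosting-based MIL; the function names and the noisy-OR model are illustrative assumptions, not necessarily the exact formulation used in MCL.

```python
import math


def sigmoid(score):
    """Map a real-valued instance score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-score))


def bag_probability(instance_scores):
    """Noisy-OR: a bag is positive if at least one instance is positive.

    `instance_scores` holds the classifier scores of all instances in one
    bag (hypothetical input; e.g. one score per candidate patch).
    """
    p_all_negative = 1.0
    for score in instance_scores:
        p_all_negative *= 1.0 - sigmoid(score)
    return 1.0 - p_all_negative


def bag_log_likelihood(bags, labels):
    """Log-likelihood of bag labels (1 = positive, 0 = negative).

    Only bag labels are needed, which is what makes the supervision weak:
    no instance (part) labels appear anywhere.
    """
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for scores, label in zip(bags, labels):
        p = bag_probability(scores)
        total += math.log(p + eps) if label == 1 else math.log(1.0 - p + eps)
    return total
```

A boosting procedure could then add weak classifiers that increase this bag-level likelihood, so that a single confident instance is enough to explain a positive bag.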
Multiple Pose Learning
In object recognition in general and in face detection in particular,
data alignment is necessary to achieve good classification results
with certain statistical learning approaches such as Viola-Jones. Data can
be aligned in one of two ways: (1) by separating the data into coherent
groups and training separate classifiers for each; (2) by adjusting training
samples so they lie in correspondence. If done manually, both procedures
are labor intensive and can significantly add to the cost of labeling. In this
paper we present a unified boosting framework for simultaneous learning
and alignment. We present a novel boosting algorithm for Multiple
Pose Learning (MPL), where the goal is to simultaneously split data into
groups and train classifiers for each. We also review Multiple Instance
Learning (MIL), and in particular MIL-Boost, and describe how to use it
to simultaneously train a classifier and bring data into correspondence.
We show results on variations of LFW and MNIST, demonstrating the
potential of these approaches.
Figure 2: We present two strategies for simultaneous learning and alignment. Data can be
aligned by: (1) separating the data into coherent groups and training separate classifiers
for each (MPL); (2) adjusting training samples so they lie in correspondence (MIL).
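The two strategies can be contrasted at prediction time with a small sketch. It assumes hard max operations and a decision threshold of zero; training would typically need a differentiable approximation of the max. All names here are hypothetical and chosen for illustration only.

```python
def mpl_predict(classifiers, example):
    """MPL-style prediction: several classifiers, one example.

    The example is positive if any group classifier fires; the index of the
    strongest classifier doubles as the example's group assignment.
    """
    scores = [h(example) for h in classifiers]
    best_group = max(range(len(scores)), key=scores.__getitem__)
    return scores[best_group] > 0.0, best_group


def mil_predict(classifier, bag):
    """MIL-style prediction: one classifier, several candidate instances.

    The bag is positive if any instance fires; the index of the strongest
    instance picks out the candidate that is best aligned with the model.
    """
    scores = [classifier(instance) for instance in bag]
    best_instance = max(range(len(scores)), key=scores.__getitem__)
    return scores[best_instance] > 0.0, best_instance


# Toy 1-D example: two "poses" of a positive class, far left and far right.
def h_left(x):
    return -x - 1.0  # fires for x < -1


def h_right(x):
    return x - 1.0  # fires for x > 1


is_positive, group = mpl_predict([h_left, h_right], 3.0)
aligned_positive, instance = mil_predict(h_right, [0.0, 0.5, 4.0])
```

In this toy setup the max runs over classifiers in MPL and over instances in MIL, which is one way to see why the two approaches are complementary.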
Related Publications
Multiple Component Learning for Object Detection
Piotr Dollár, Boris Babenko, Serge Belongie, Pietro Perona, Zhuowen Tu
ECCV 2008, Marseille, France.
[pdf]
[bibtex]
[poster]
|
Simultaneous Learning and Alignment: Multi-Instance and Multi-Pose Learning
Boris Babenko, Piotr Dollár, Zhuowen Tu, Serge Belongie
ECCV 2008: Faces in Real-Life Images, Marseille, France.
* Earlier version appeared as Technical Report CS2008, UCSD (2008)
[pdf]
[bibtex]
[video]
[poster]
|
Multiple Instance Learning with Query Bags
Boris Babenko, Piotr Dollár, Serge Belongie
UCSD CSE Tech Report CS2009-0949, September 2009
[pdf]
|
Multiple Instance Learning: Algorithms and Applications
Boris Babenko
Research Exam, 2008, UCSD Computer Science and Engineering Department
[pdf]
|
Code
Coming soon...