research

Object-Centered Visual Recognition

Papers

P. Dollár, Z. Tu, P. Perona and S. Belongie
Integral Channel Features
BMVC 2009, London, England. [pdf | poster | abstract | bibtex | addendum]

P. Dollár, S. Belongie and P. Perona
The Fastest Pedestrian Detector in the West
BMVC 2010, Aberystwyth, UK. [pdf | poster | abstract | bibtex]

P. Dollár, P. Welinder and P. Perona
Cascaded Pose Regression
CVPR 2010, San Francisco, CA. [pdf | poster | bibtex]

P. Dollár, R. Appel and W. Kienzle
Crosstalk Cascades for Frame-Rate Pedestrian Detection
ECCV 2012, Florence, Italy. [pdf | poster | bibtex]

Full benchmark results can be found at the Caltech Pedestrian Dataset.
I also gave a talk on object-centered visual recognition at MSR.

Code

Highly optimized code for computing multi-scale channel pyramids is now available as part of my Matlab toolbox. The feature computation runs at 30-100 fps on VGA images and gives state-of-the-art results, as described in our ECCV 2012 paper.

Detection binaries for the BMVC 2009/2010 papers are also available (see detect.m for usage). Note: the code associated with our ECCV 2012 paper (see above) is much faster, but does not yet include object detection code.

If you are interested in obtaining our CPR Matlab code, please contact us. The CPR code requires my basic toolbox. Updated Aug. 06, 2012; see readme.

Caltech Pedestrian Benchmark

Papers

P. Dollár, C. Wojek, B. Schiele and P. Perona
Pedestrian Detection: A Benchmark
CVPR 2009, Miami, Florida. [pdf | bibtex]

P. Dollár, C. Wojek, B. Schiele and P. Perona
Pedestrian Detection: An Evaluation of the State of the Art
PAMI, 2011. [pdf | bibtex]

I gave a talk on pedestrian detection at MSR in June 2010.

Dataset

The dataset, evaluation code and up-to-date results can be found on the Caltech Pedestrian Detection Benchmark project website.

Multiple Instance & Multiple Component Learning

Papers

P. Dollár, B. Babenko, S. Belongie, P. Perona and Z. Tu
Multiple Component Learning for Object Detection
ECCV 2008, Marseille, France. [pdf | bibtex]

B. Babenko, P. Dollár, Z. Tu and S. Belongie
Simultaneous Learning and Alignment: Multi-Instance and Multi-Pose Learning
ECCV 2008: Faces in Real-Life Images, Marseille, France. [pdf | bibtex]

B. Babenko, P. Dollár, and S. Belongie
Multiple Instance Learning with Query Bags
UCSD-TR 2009, CS2009-0949. [pdf | bibtex]

B. Babenko, N. Verma, P. Dollár, and S. Belongie
Multiple Instance Learning with Manifold Bags
ICML 2011, Bellevue, Washington. [pdf | bibtex]

Here are the posters for MCL (ECCV08) and for the associated workshop paper (ECCV08-WK), which should serve as an in-depth reference to Multiple Instance Learning (MIL).

Non-Isometric Manifold Learning

Papers

Piotr Dollár, Vincent Rabaud and Serge Belongie
Learning to Traverse Image Manifolds
NIPS 19, 2006, Vancouver, B.C., Canada. [pdf | bibtex]

Longer version of NIPS work [extra 4 page appendix]:
Piotr Dollár, Vincent Rabaud and Serge Belongie
Learning to Traverse Image Manifolds
UCSD-TR 2007, CS2007-0876. [pdf | bibtex]

Piotr Dollár, Vincent Rabaud and Serge Belongie
Non-Isometric Manifold Learning: Analysis and an Algorithm
ICML 2007, Corvallis, Oregon. [pdf | bibtex]

D.S. Touretzky, A.S. Gupta, M.C. Fuhs, P. Dollár, A.P. Maurer, B.L. McNaughton
Reconstructing the Topologies of Hippocampal Cognitive Maps
SFN 2007, San Diego, CA. [abstract]

The following posters (NIPS06 and ICML07) provide a good introduction to our work. Also, here are the slides from our ICML talk and a video of the talk itself.

Code

If you are interested in obtaining our LSML Matlab code, please contact us. The LSML code requires my basic toolbox. Updated as of Mar. 06, 2009; see readme.

Boundary & Feature Learning


The goal is simple: to learn edges and object boundaries from human-labeled images while making few modeling assumptions. Some example training and testing images are given for a number of domains. We've extended these ideas to other domains, including feature learning and brain segmentation.

Papers

Piotr Dollár, Zhuowen Tu and Serge Belongie
Supervised Learning of Edges and Object Boundaries
CVPR 2006, New York, New York. [pdf | bibtex]

Piotr Dollár, Zhuowen Tu, Hai Tao and Serge Belongie
Feature Mining for Image Classification
CVPR 2007, Minneapolis, Minnesota. [pdf | bibtex]

Boris Babenko, Piotr Dollár, and Serge Belongie
Task Specific Local Region Matching
ICCV 2007, Rio de Janeiro, Brazil. [pdf | bibtex]

Z. Tu, K.L. Narr, P. Dollár, I. Dinov, P.M. Thompson, and A.W. Toga
Brain Anatomical Structure Segmentation by Hybrid Discriminative/Generative Models
TMI, 2008. [pdf | bibtex]

Here is a poster and some slides related to this work.

Code

Executables for the BEL edge detection are available.

Behavior Recognition & Animal Behavior

This work originally had close ties to the Smart Vivarium, a project aiming to automate the monitoring of animal health and welfare. The specific problems we worked on included behavior recognition, tracking, abnormal activity detection, and large-scale deployment. More recently we have continued applying our ideas through a number of collaborations at Caltech (see below).

Papers

Piotr Dollár, Vincent Rabaud, Garrison Cottrell and Serge Belongie
Behavior Recognition via Sparse Spatio-Temporal Features
ICCV VS-PETS 2005, Beijing, China. [pdf | bibtex]

Serge Belongie, Kristin Branson, Piotr Dollár, and Vincent Rabaud
Monitoring Animal Behavior in the Smart Vivarium
Measuring Behavior 2005, Wageningen, The Netherlands. [pdf | bibtex]

C. Hsu, P. Dollár, D. Chang, and A. Steele
Daily timed sexual interaction induces moderate anticipatory activity in mice
PLoS ONE 2010. [pdf | bibtex]

D. Lin, M. Boyle, P. Dollár, H. Lee, P. Perona, E. Lein, D. Anderson
Functional identification of an aggression locus in the mouse hypothalamus
Nature, 2011. [link | pdf | bibtex]

X.P. Burgos-Artizzu, P. Dollár, D. Lin, D.J. Anderson and P. Perona
Social Behavior Recognition in Continuous Videos
CVPR, 2012. [pdf | bibtex]

The following poster provides a good introduction to our ICCV05 work.

Caltech Resident-Intruder Mouse Dataset (CRIM13)

The Caltech Resident-Intruder Mouse dataset (CRIM13) consists of 237x2 videos (recorded with synchronized top and side views) of pairs of mice engaging in social behavior, catalogued into thirteen different actions. Each video lasts ~10 min, for a total of 88 hours of video and 8 million frames. A team of behavior experts annotated each video on a frame-by-frame basis for a state-of-the-art study of the neurophysiological mechanisms involved in aggression and courtship in mice. The dataset is available at the Caltech Resident-Intruder Mouse dataset project website.

Mouse Behavior & Facial Expression Datasets (2005)

The datasets, as described in Dollár et al. 2005, are available for download as a number of zip files. The videos are encoded using the DivX codec. Please cite the ICCV 05 paper if you use this data in a publication.

Mouse Behavior [7 parts]: set00 | set01 | set02 | set03 | set04 | set05 | set06

Facial Expressions [4 parts]: set00 | set01 | set02 | set03

Code

If you are interested in obtaining our code, please contact us. The documentation is available here; the code requires my basic toolbox, version 1.03. Note: the code is NOT compatible with the most recent version of my toolbox; you must use the older version.