Object Detection


P. Dollár, Z. Tu, P. Perona and S. Belongie
Integral Channel Features
BMVC 2009, London, England. [pdf | poster | abstract | bibtex | addendum]

P. Dollár, S. Belongie and P. Perona
The Fastest Pedestrian Detector in the West
BMVC 2010, Aberystwyth, UK. [pdf | poster | abstract | bibtex]

P. Dollár, R. Appel and W. Kienzle
Crosstalk Cascades for Frame-Rate Pedestrian Detection
ECCV 2012, Florence, Italy. [pdf | poster | bibtex]

D. Park, C. Zitnick, D. Ramanan and P. Dollár
Exploring Weak Stabilization for Motion Feature Extraction
CVPR 2013, Portland, Oregon. [pdf | bibtex]

P. Dollár, R. Appel, S. Belongie and P. Perona
Fast Feature Pyramids for Object Detection
PAMI 2014. [pdf | bibtex]

C. L. Zitnick and P. Dollár
Edge Boxes: Locating Object Proposals from Edges
ECCV 2014, Zurich. [pdf | bibtex | slides | code]

Full benchmark results can be found at the Caltech Pedestrian Dataset.
I also gave a talk on object-centered visual recognition at MSR.


Highly optimized code for pedestrian detection is now available as part of my Matlab toolbox (see the channels/ and detector/ directories). The pre-trained pedestrian detector runs at ~30 fps on VGA images and gives state-of-the-art results.

Pose Estimation


P. Dollár, P. Welinder and P. Perona
Cascaded Pose Regression
CVPR 2010, San Francisco, CA. [pdf | poster | bibtex]

X. P. Burgos-Artizzu, D. Hall, P. Perona and P. Dollár
Merging Pose Estimates Across Space and Time
BMVC 2013, Bristol, UK. [pdf | abstract | appendix | bibtex]

X. P. Burgos-Artizzu, P. Perona and P. Dollár
Robust Face Landmark Estimation Under Occlusion
ICCV 2013, Sydney, Australia. [pdf | appendix | bibtex]

B. Hariharan, C. Zitnick and P. Dollár
Detecting Objects using Deformation Dictionaries
CVPR 2014, Columbus, Ohio. [pdf | poster | spotlight | bibtex]


CPR Matlab code is now available. The CPR code requires my basic toolbox. Updated Aug. 06, 2012, see readme.

Code and data for our BMVC 2013 paper on merging pose estimates across space and time are available on the project website.

Caltech Pedestrian Benchmark



P. Dollár, C. Wojek, B. Schiele and P. Perona
Pedestrian Detection: A Benchmark
CVPR 2009, Miami, Florida. [pdf | bibtex]

P. Dollár, C. Wojek, B. Schiele and P. Perona
Pedestrian Detection: An Evaluation of the State of the Art
PAMI 2012. [pdf | bibtex]

I gave a talk on pedestrian detection at MSR in June 2010.


The dataset, evaluation code and up-to-date results can be found on the Caltech Pedestrian Detection Benchmark project website.

Multiple Instance & Multiple Component Learning


P. Dollár, B. Babenko, S. Belongie, P. Perona and Z. Tu
Multiple Component Learning for Object Detection
ECCV 2008, Marseille, France. [pdf | bibtex]

B. Babenko, P. Dollár, Z. Tu and S. Belongie
Simultaneous Learning and Alignment: Multi-Instance and Multi-Pose Learning
ECCV 2008: Faces in Real-Life Images, Marseille, France. [pdf | bibtex]

B. Babenko, P. Dollár and S. Belongie
Multiple Instance Learning with Query Bags
UCSD-TR 2009, CS2009-0949. [pdf | bibtex]

B. Babenko, N. Verma, P. Dollár and S. Belongie
Multiple Instance Learning with Manifold Bags
ICML 2011, Bellevue, Washington. [pdf | bibtex]

R. Appel, T. Fuchs, P. Dollár and P. Perona
Quickly Boosting Decision Trees – Pruning Underachieving Features Early
ICML 2013, Atlanta, GA. [pdf | bibtex]

Here is the poster for MCL (ECCV08) and the poster for the associated workshop paper (ECCV08-WK), which should serve as an in-depth reference to Multiple Instance Learning (MIL).

Non-Isometric Manifold Learning


Piotr Dollár, Vincent Rabaud and Serge Belongie
Learning to Traverse Image Manifolds
NIPS 2006, Vancouver, B.C., Canada. [pdf | bibtex]

Longer version of NIPS work [extra 4 page appendix]:
Piotr Dollár, Vincent Rabaud and Serge Belongie
Learning to Traverse Image Manifolds
UCSD-TR 2007, CS2007-0876. [pdf | bibtex]

Piotr Dollár, Vincent Rabaud and Serge Belongie
Non-Isometric Manifold Learning: Analysis and an Algorithm
ICML 2007, Corvallis, Oregon. [pdf | bibtex]

D.S. Touretzky, A.S. Gupta, M.C. Fuhs, P. Dollár, A.P. Maurer and B.L. McNaughton
Reconstructing the Topologies of Hippocampal Cognitive Maps
SFN 2007, San Diego, CA. [abstract]

The posters (NIPS06 and ICML07) provide a good introduction to our work. The slides from our ICML talk and a video of the talk itself are also available.


LSML Matlab code is now available. The LSML code requires my basic toolbox. Updated as of Mar. 06, 2009, see readme.

Boundary & Feature Learning

The goal is simple: to learn edges and object boundaries from human-labeled images while making few modeling assumptions. Example training and testing images are given for a number of domains. We've extended these ideas to other domains, including feature learning and brain segmentation.


Piotr Dollár, Zhuowen Tu and Serge Belongie
Supervised Learning of Edges and Object Boundaries
CVPR 2006, New York, New York. [pdf | bibtex | poster | slides]

Piotr Dollár, Zhuowen Tu, Hai Tao and Serge Belongie
Feature Mining for Image Classification
CVPR 2007, Minneapolis, Minnesota. [pdf | bibtex]

Boris Babenko, Piotr Dollár and Serge Belongie
Task Specific Local Region Matching
ICCV 2007, Rio de Janeiro, Brazil. [pdf | bibtex]

Z. Tu, K.L. Narr, P. Dollár, I. Dinov, P.M. Thompson and A.W. Toga
Brain Anatomical Structure Segmentation by Hybrid Discriminative/Generative Models
TMI 2008. [pdf | bibtex]

J. Lim, C. Zitnick and P. Dollár
Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection
CVPR 2013, Portland, Oregon. [pdf | bibtex]

P. Dollár and C. Zitnick
Structured Forests for Fast Edge Detection
ICCV 2013, Sydney, Australia. [pdf | bibtex | appendix | talk | slides]

P. Dollár and C. Zitnick
Fast Edge Detection Using Structured Forests
arXiv 2014. [pdf]


Full source code for our ICCV 2013 and arXiv 2014 Structured Edge Detector is now available (version 3.0). The detector is very fast and achieves top accuracy on the BSDS500 Segmentation dataset. It can be used as input to any algorithm requiring high-quality edge maps, so give it a try. Additional models, including NYUD depth models and a higher-accuracy BSDS model, are also available. Full source code for Sketch Tokens and the associated boundary detector is likewise available (although the structured edge detector is superior for edge detection). Finally, executables for our older BEL edge detector are also available, but these are now obsolete.

Behavior Recognition & Animal Behavior

This work originally had close ties to the Smart Vivarium, a project aiming to automate the monitoring of animal health and welfare. The specific problems we worked on included behavior recognition, tracking, abnormal activity detection, and large-scale deployment. More recently we have continued to apply our ideas through a number of collaborations at Caltech (see below).


Piotr Dollár, Vincent Rabaud, Garrison Cottrell and Serge Belongie
Behavior Recognition via Sparse Spatio-Temporal Features
ICCV VS-PETS 2005, Beijing, China. [pdf | bibtex]

Serge Belongie, Kristin Branson, Piotr Dollár and Vincent Rabaud
Monitoring Animal Behavior in the Smart Vivarium
Measuring Behavior 2005, Wageningen, The Netherlands. [pdf | bibtex | poster]

C. Hsu, P. Dollár, D. Chang and A. Steele
Daily timed sexual interaction induces moderate anticipatory activity in mice
PLoS ONE 2010. [pdf | bibtex]

D. Lin, M. Boyle, P. Dollár, H. Lee, P. Perona, E. Lein and D. Anderson
Functional identification of an aggression locus in the mouse hypothalamus
Nature 2011. [link | pdf | bibtex]

X.P. Burgos-Artizzu, P. Dollár, D. Lin, D.J. Anderson and P. Perona
Social Behavior Recognition in Continuous Videos
CVPR 2012. [pdf | bibtex]

A. Falkner, P. Dollár, P. Perona, D. Anderson and D. Lin
Decoding ventromedial hypothalamic neural activity during male mouse aggression
Neuroscience 2014. [link | pdf | bibtex]

Caltech Resident-Intruder Mouse Dataset (CRIM13)

The Caltech Resident-Intruder Mouse dataset (CRIM13) consists of 237x2 videos (recorded with synchronized top and side views) of pairs of mice engaging in social behavior, catalogued into thirteen different actions. Each video lasts ~10 min, for a total of 88 hours of video and 8 million frames. A team of behavior experts annotated each video on a frame-by-frame basis for a state-of-the-art study of the neurophysiological mechanisms involved in aggression and courtship in mice. The dataset is available at the Caltech Resident-Intruder Mouse dataset project website.
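
As a rough sanity check on the dataset size, the short Python sketch below derives the mean video length and the implied frame rate from the totals quoted above. The function name dataset_stats is made up for illustration, and the inputs are the rounded figures from the description, so the outputs are approximate and should not be read as official dataset specifications.

```python
def dataset_stats(num_videos, total_hours, total_frames):
    """Return (mean minutes per video, implied frames per second).

    Hypothetical helper for illustration; inputs are rounded totals,
    so the derived values are only approximate.
    """
    total_seconds = total_hours * 3600
    mean_minutes = total_hours * 60 / num_videos
    implied_fps = total_frames / total_seconds
    return mean_minutes, implied_fps


# Figures quoted in the dataset description (all rounded).
num_videos = 237 * 2        # 237 sessions, each with a top and a side view
total_hours = 88
total_frames = 8_000_000

mean_minutes, implied_fps = dataset_stats(num_videos, total_hours, total_frames)
print(f"{num_videos} videos, ~{mean_minutes:.1f} min each, ~{implied_fps:.1f} fps")
```

The derived mean of roughly 11 minutes per video is consistent with the "~10 min" figure once rounding of the totals is taken into account.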

Mouse Behavior & Facial Expression Datasets (2005)

The datasets, as described in Dollár et al. 2005, are available for download as a number of zip files. The videos are encoded using the DivX codec. Please cite the ICCV 05 paper if you use this data in a publication.

Mouse Behavior [7 parts]: set00 | set01 | set02 | set03 | set04 | set05 | set06

Facial Expressions [4 parts]: set00 | set01 | set02 | set03


Cuboids code is now available. The documentation is available here. The code requires version 1.03 of my basic toolbox. Note: the code is NOT compatible with the most recent version of my toolbox; you must use the older version.