The distribution contains the object detection and model learning code.
It also contains models trained on the PASCAL datasets and the INRIA Person dataset.
The system is implemented in Matlab, with a few helper functions written in C/C++ for efficiency reasons.
The software was tested on several versions of Linux and Mac OS X.
To download, click here: voc-release3.tgz (updated on 06/08/09)
This project is supported by the National Science Foundation under Grant Nos. 0534820 and 0746569.
[1] P. Felzenszwalb, D. McAllester, D. Ramanan.
A Discriminatively Trained, Multiscale, Deformable Part Model.
Proceedings of the IEEE CVPR 2008.
[2] P. Felzenszwalb, R. Girshick, D. McAllester, D. Ramanan.
Object Detection with Discriminatively Trained Part Based Models.
Draft (pdf).
Slides from a recent talk about the system (pdf).
The models included with the source code were trained on the train+val
dataset from each year and evaluated on the corresponding test
dataset.
This is exactly the protocol of the "comp3" competition.
Below are the average precision scores we obtain in each category.

| 2006 data | bicycle | bus | car | cat | cow | dog | horse | mbike | person | sheep |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| without context | 0.620 | 0.493 | 0.635 | 0.190 | 0.417 | 0.153 | 0.386 | 0.579 | 0.380 | 0.402 |
| with context | 0.623 | 0.502 | 0.631 | 0.236 | 0.437 | 0.185 | 0.429 | 0.625 | 0.401 | 0.431 |

| 2007 data | aero | bicycle | bird | boat | bottle | bus | car | cat | chair | cow | table | dog | horse | mbike | person | plant | sheep | sofa | train | tv |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| without context | 0.287 | 0.551 | 0.006 | 0.145 | 0.265 | 0.397 | 0.502 | 0.163 | 0.165 | 0.166 | 0.245 | 0.050 | 0.452 | 0.383 | 0.362 | 0.090 | 0.174 | 0.228 | 0.341 | 0.384 |
| with context | 0.328 | 0.568 | 0.025 | 0.168 | 0.285 | 0.397 | 0.516 | 0.213 | 0.179 | 0.185 | 0.259 | 0.088 | 0.492 | 0.412 | 0.368 | 0.146 | 0.162 | 0.244 | 0.392 | 0.391 |
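
The per-category scores above can be summarized with a little arithmetic. The following is a minimal Python sketch, not part of the Matlab distribution: it copies the table values by hand and computes the mean average precision per row, plus the categories where context rescoring lowered the score. The function name `summarize` and the variable names are illustrative assumptions.

```python
# AP values copied by hand from the tables above (not produced by the release code).
CATS_2006 = ["bicycle", "bus", "car", "cat", "cow",
             "dog", "horse", "mbike", "person", "sheep"]
WITHOUT_2006 = [0.620, 0.493, 0.635, 0.190, 0.417,
                0.153, 0.386, 0.579, 0.380, 0.402]
WITH_2006 = [0.623, 0.502, 0.631, 0.236, 0.437,
             0.185, 0.429, 0.625, 0.401, 0.431]

CATS_2007 = ["aero", "bicycle", "bird", "boat", "bottle", "bus", "car",
             "cat", "chair", "cow", "table", "dog", "horse", "mbike",
             "person", "plant", "sheep", "sofa", "train", "tv"]
WITHOUT_2007 = [0.287, 0.551, 0.006, 0.145, 0.265, 0.397, 0.502,
                0.163, 0.165, 0.166, 0.245, 0.050, 0.452, 0.383,
                0.362, 0.090, 0.174, 0.228, 0.341, 0.384]
WITH_2007 = [0.328, 0.568, 0.025, 0.168, 0.285, 0.397, 0.516,
             0.213, 0.179, 0.185, 0.259, 0.088, 0.492, 0.412,
             0.368, 0.146, 0.162, 0.244, 0.392, 0.391]


def summarize(categories, without_ctx, with_ctx):
    """Return (mean AP without context, mean AP with context,
    categories whose AP is lower with context)."""
    if not (len(categories) == len(without_ctx) == len(with_ctx)):
        raise ValueError("all three lists must have the same length")
    mean_without = sum(without_ctx) / len(without_ctx)
    mean_with = sum(with_ctx) / len(with_ctx)
    lowered = [c for c, a, b in zip(categories, without_ctx, with_ctx) if b < a]
    return mean_without, mean_with, lowered


for year, cats, wo, wi in [("2006", CATS_2006, WITHOUT_2006, WITH_2006),
                           ("2007", CATS_2007, WITHOUT_2007, WITH_2007)]:
    m_wo, m_wi, lowered = summarize(cats, wo, wi)
    print(f"{year}: mean AP {m_wo:.4f} without context, "
          f"{m_wi:.4f} with context; lower with context: {lowered}")
```

Averaging over categories gives a single number per row, which is convenient for comparing the two settings, but it hides per-category differences, so the list of categories that lose accuracy with context is reported alongside it.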
We also trained and tested a model on the INRIA Person dataset.
We scored the model using the PASCAL evaluation methodology on the complete test dataset, including images without people.
INRIA Person average precision: 0.869
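
For readers unfamiliar with the metric, here is a minimal Python sketch of how an average precision score can be computed from ranked detections. It assumes the 11-point interpolated definition used by the PASCAL VOC challenges of that period, and it assumes each detection has already been labeled as a true or false positive (the overlap matching against ground truth is not shown). The function name and arguments are hypothetical and do not correspond to the evaluation code in the distribution or the PASCAL development kit.

```python
def eleven_point_ap(scores, is_true_positive, num_positives):
    """11-point interpolated average precision (illustrative sketch).

    scores           -- detection confidence scores
    is_true_positive -- parallel list of booleans, one per detection
    num_positives    -- number of ground-truth objects in the test set
    """
    if num_positives <= 0:
        raise ValueError("num_positives must be positive")
    if len(scores) != len(is_true_positive):
        raise ValueError("scores and labels must have the same length")

    # Rank detections from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])

    # Precision and recall after each detection in the ranking.
    recalls, precisions = [], []
    true_positives = 0
    for rank, i in enumerate(order, start=1):
        if is_true_positive[i]:
            true_positives += 1
        recalls.append(true_positives / num_positives)
        precisions.append(true_positives / rank)

    # Average the best precision reachable at recall >= t
    # for t = 0.0, 0.1, ..., 1.0.
    total = 0.0
    for k in range(11):
        threshold = k / 10
        reachable = [p for r, p in zip(recalls, precisions) if r >= threshold]
        total += max(reachable) if reachable else 0.0
    return total / 11


# Toy example: three detections, two ground-truth objects.
ap = eleven_point_ap([0.9, 0.8, 0.7], [True, False, True], num_positives=2)
print(f"AP = {ap:.3f}")
```

Because false positives lower the precision at every later rank, scoring on the complete test set, including images that contain no people, penalizes a detector that fires on background.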