Synopsis
The goal of the Honda/UCSD Video Database is to provide a standard video database for evaluating face tracking and recognition algorithms. Each video sequence is recorded in an indoor environment at 15 frames per second and lasts at least 15 seconds. The resolution of each video sequence is 640x480. Every individual is recorded in at least two video sequences. Since we believe that pose variation provides the greatest challenge to recognition, all the video sequences contain significant 2-D (in-plane) and 3-D (out-of-plane) head rotations. In each video, the person rotates and turns his/her head in his/her own preferred order and speed; in about 15 seconds, the individual typically provides a wide range of different poses. In addition, some of these sequences contain difficult events that a real-world tracker/recognizer would likely encounter, such as partial occlusion, the face partly leaving the field of view, and large scale changes.
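As a rough sizing aid, the figures above (15 frames per second, at least 15 seconds, 640x480) imply a lower bound on the number of frames per sequence. The sketch below works through that arithmetic. The helper names are hypothetical, and the raw-size estimate assumes uncompressed 8-bit, 3-channel frames, which the synopsis does not specify.

```python
# Figures stated in the synopsis.
FPS = 15              # frames per second
MIN_DURATION_S = 15   # each sequence lasts at least this many seconds
WIDTH, HEIGHT = 640, 480


def min_frames(fps=FPS, duration_s=MIN_DURATION_S):
    """Lower bound on the number of frames in one sequence."""
    return fps * duration_s


def raw_bytes(n_frames, width=WIDTH, height=HEIGHT, channels=3):
    """Size of n_frames decoded frames in memory.

    Assumption: 1 byte per channel and 3 channels per pixel. The synopsis
    does not state the color format, so treat this as an estimate only.
    """
    return n_frames * width * height * channels


frames = min_frames()
print(frames)             # 225
print(raw_bytes(frames))  # 207360000 (about 207 MB per decoded sequence)
```

Since every individual appears in at least two sequences, the same arithmetic gives at least 450 frames per person under these figures.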