UCSD Computer Vision

Kai Wang, PhD

Kai grew up in Seattle and is widely regarded as the best Arnold Schwarzenegger voice impersonator on Earth.



GrOCR is an ongoing research project on word recognition in unconstrained images. The name comes from the project's original impetus: OCR for reading text on products found in grocery stores. Although the focus of the project has since moved beyond that domain, the name has remained the same.

Distributed Human Computation

The ground truth labeling of an image dataset is a task that often requires a large amount of human time and labor. We present an infrastructure for distributed human labeling that can exploit the modularity of common vision problems involving segmentation and recognition. We present the different e...

Assistive Technology for the Visually Impaired

The contemporary urban environment is brimming with rich visual cues that provide valuable directional and informational content to sighted individuals. The goal of the GroZi project is to make significant advances toward making these visual cues universally accessible in a variety of real-world do...


Nguyen P., Wang K., Belongie S., "Video Text Detection and Recognition: Dataset and Benchmark", Winter Conference on Applications of Computer Vision (WACV), Steamboat Springs, CO, March 2014.
Wang K., Kim E., Carlini N., Motyashov I., Nguyen D., Wagner D., "Operator-Assisted Tabulation of Optical Scan Ballots", Electronic Voting Technology Workshop/Workshop on Trustworthy Elections (EVT/WOTE), USENIX/ACCURATE/IAVoSS, Bellevue, WA, August 2012.
Wang K., Babenko B., Belongie S., "End-to-End Scene Text Recognition", IEEE International Conference on Computer Vision (ICCV), Barcelona, Spain, 2011.
Wang K., Belongie S., "Word Spotting in the Wild", European Conference on Computer Vision (ECCV), Heraklion, Crete, September 2010.
Wang K., Rescorla E., Shacham H., Belongie S., "OpenScan: A Fully Transparent Optical Scan Voting System", Electronic Voting Technology Workshop/Workshop on Trustworthy Elections (EVT/WOTE), USENIX/ACCURATE/IAVoSS, Washington, DC, August 2010.
Faymonville P., Wang K., Miller J., Belongie S., "CAPTCHA-based Image Labeling on the Soylent Grid", Human Computation Workshop (HCOMP), Paris, France, 2009.
Laxton B., Wang K., Savage S., "Reconsidering Physical Key Secrecy: Teleduplication via Optical Decoding", ACM Conference on Computer and Communications Security (CCS), Alexandria, VA, October 2008.