This is the SVT result gallery from our paper, "End-to-End Scene Text Recognition." Dashed pink boxes are the ground-truth labels, and green boxes are the results from our method. Note that only a subset of the words in each image is labeled in this recognition task: in the SVT dataset, the goal is to identify only the words from local businesses, as specified by a word list provided with each image. Check out our paper for details!