Web: http://arxiv.org/abs/2202.13403

May 6, 2022, 1:11 a.m. | Gerald Schwiebert, Cornelius Weber, Leyuan Qu, Henrique Siqueira, Stefan Wermter

cs.CL updates on arXiv.org

The large datasets required for deep learning of lip reading do not exist in
many languages. In this paper, we present GLips (German Lips), a dataset
of 250,000 publicly available videos of the faces of speakers in the
Hessian Parliament, processed for word-level lip reading using an
automatic pipeline. The format is similar to that of the English-language LRW
(Lip Reading in the Wild) dataset, with each video encoding one word of
interest in a context …


