Feb. 27, 2024, 5:51 a.m. | Yintao Tai, Xiyang Liao, Alessandro Suglia, Antonio Vergari

cs.CL updates on arXiv.org arxiv.org

arXiv:2401.03321v2 Announce Type: replace
Abstract: Recent work showed the possibility of building open-vocabulary large language models (LLMs) that directly operate on pixel representations. These models are implemented as autoencoders that reconstruct masked patches of rendered text. However, these pixel-based LLMs are limited to discriminative tasks (e.g., classification) and, similar to BERT, cannot be used to generate text. Therefore, they cannot be used for generative tasks such as free-form question answering. In this work, we introduce PIXAR, the first pixel-based autoregressive …

abstract arxiv auto autoencoders bert building classification cs.cl language language models large language large language models llms modeling pixar pixel possibility space tasks text type work

Data Scientist (m/f/x/d)

@ Symanto Research GmbH & Co. KG | Spain, Germany

Aumni - Site Reliability Engineer III - MLOPS

@ JPMorgan Chase & Co. | Salt Lake City, UT, United States

Senior Data Analyst

@ Teya | Budapest, Hungary

Technical Analyst (Data Analytics)

@ Contact Government Services | Chicago, IL

Engineer, AI/Machine Learning

@ Masimo | Irvine, CA, United States

Private Bank - Executive Director: Data Science and Client / Business Intelligence

@ JPMorgan Chase & Co. | Mumbai, Maharashtra, India