Feb. 8, 2024, 5:43 a.m. | Ouail Kitouni, Niklas Nolte, James Hensman, Bhaskar Mitra

cs.LG updates on arXiv.org

Structured (dictionary-like) data presents challenges for left-to-right language models, as they can struggle with structured entities for a wide variety of reasons such as formatting and sensitivity to the order in which attributes are presented. Tabular generative models suffer from a different set of limitations such as their lack of flexibility. We introduce Diffusion Models of Structured Knowledge (DiSK) - a new architecture and training approach specialized for structured data. DiSK handles text, categorical, and continuous numerical data using a …
