Feb. 13, 2024, 5:44 a.m. | Jiacheng Ye, Shansan Gong, Liheng Chen, Lin Zheng, Jiahui Gao, Han Shi, Chuan Wu, Zhenguo Li

cs.LG updates on arXiv.org

Diffusion models have gained attention in text processing, offering many potential advantages over traditional autoregressive models. This work explores the integration of diffusion models and Chain-of-Thought (CoT), a well-established technique for improving the reasoning ability of autoregressive language models. We propose Diffusion-of-Thought (DoT), which allows reasoning steps to diffuse over time through the diffusion process. In contrast to traditional autoregressive language models, which make decisions in a left-to-right, token-by-token manner, DoT offers more flexibility in the trade-off between computation and reasoning …
