P$^3$LM: Probabilistically Permuted Prophet Language Modeling for Generative Pre-Training. (arXiv:2210.12339v1 [cs.CL])
Oct. 25, 2022, 1:17 a.m. | Junwei Bao, Yifan Wang, Jiangyong Ying, Yeyun Gong, Jing Zhao, Youzheng Wu, Xiaodong He
cs.CL updates on arXiv.org
Conventional autoregressive left-to-right (L2R) sequence generation faces two
issues during decoding: it is limited to unidirectional modeling of the target
sequence, and it is constrained by strong local dependencies. To address these
problems, we propose P$^3$LM, a probabilistically permuted prophet language
model, which strengthens the modeling of bidirectional information and
long-range token dependencies for sequence generation. Specifically, P$^3$LM
learns to generate tokens in permuted order with an order-aware transformer
decoder, as well as to generate the corresponding future $N$ tokens with a
multi-stream attention mechanism. …
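To make the training objective concrete, here is a minimal sketch of how permuted prophet targets could be constructed: sample a random generation order, then at each decoding step collect the next $N$ tokens under that order as joint prediction targets. This is an illustration of the idea only, not the paper's implementation; the function name `permuted_prophet_targets` and all details are assumptions.

```python
import random

def permuted_prophet_targets(tokens, n_future=2, seed=0):
    """Illustrative sketch (not the paper's code): sample a permuted
    generation order, then for each step list the next `n_future`
    tokens under that order as prophet-style prediction targets."""
    rng = random.Random(seed)
    order = list(range(len(tokens)))
    rng.shuffle(order)  # a sampled (permuted) generation order
    targets = []
    for step in range(len(order)):
        # The future window under the permuted order; it shrinks
        # near the end of the sequence.
        future = [tokens[order[step + k]]
                  for k in range(n_future)
                  if step + k < len(order)]
        targets.append((step, future))
    return order, targets

order, targets = permuted_prophet_targets(["the", "cat", "sat", "down"])
# Each step is paired with up to `n_future` tokens to predict jointly,
# which is what the multi-stream attention would attend over.
```

In the actual model, each of the $N$ future positions would be handled by its own attention stream in the decoder; this sketch only shows how the targets line up with a permuted order.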