Jan. 17, 2022, 2:10 a.m. | Mengsay Loem, Sho Takase, Masahiro Kaneko, Naoaki Okazaki

cs.CL updates on arXiv.org arxiv.org

Neural models trained with a large amount of parallel data have achieved
impressive performance in abstractive summarization tasks. However, large-scale
parallel corpora are expensive and challenging to construct. In this work, we
introduce a low-cost and effective strategy, ExtraPhrase, to augment training
data for abstractive summarization tasks. ExtraPhrase constructs pseudo
training data in two steps: extractive summarization and paraphrasing. We
extract the major parts of an input text in the extractive summarization step
and obtain diverse expressions of them in the paraphrasing step. …
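To make the two-step idea concrete, here is a minimal sketch of how a pseudo training pair could be built. It is not the authors' implementation: the extractive step is a simple frequency-based sentence scorer, and the paraphrasing step is a hypothetical hook where a paraphrase model (e.g., round-trip translation) would be plugged in.

```python
# Minimal sketch of ExtraPhrase-style pseudo-data construction (assumptions:
# frequency-based sentence extraction, placeholder paraphraser).
from collections import Counter
import re


def extractive_summary(text: str, num_sentences: int = 2) -> str:
    """Step 1: keep the sentences whose words are most frequent in the text."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    word_freq = Counter(re.findall(r"\w+", text.lower()))
    scored = sorted(
        sentences,
        key=lambda s: sum(word_freq[w] for w in re.findall(r"\w+", s.lower())),
        reverse=True,
    )
    kept = set(scored[:num_sentences])
    # Preserve the original sentence order in the output.
    return " ".join(s for s in sentences if s in kept)


def paraphrase(summary: str) -> str:
    """Step 2: placeholder for a paraphrasing model (an assumption, not the
    paper's method). A real pipeline might use round-trip translation or a
    neural paraphraser to diversify the wording of the extracted summary."""
    return summary  # identity stand-in


def build_pseudo_pair(document: str) -> tuple[str, str]:
    """Return a (document, pseudo-summary) pair usable as training data."""
    return document, paraphrase(extractive_summary(document))


if __name__ == "__main__":
    doc = (
        "Neural models need large parallel corpora. "
        "Such corpora are expensive to build. "
        "Pseudo data can reduce that cost."
    )
    src, tgt = build_pseudo_pair(doc)
    print("source :", src)
    print("pseudo :", tgt)
```

Running the sketch pairs each document with a shortened, reworded target, which is the kind of pseudo parallel data the abstract describes augmenting the training set with.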

