June 6, 2024, 4:52 a.m. | Jeiyoon Park, Chanjun Park, Heuiseok Lim

cs.CL updates on arXiv.org arxiv.org

arXiv:2406.03202v1 Announce Type: new
Abstract: We explore and improve the capabilities of LLMs to generate data for grammatical error correction (GEC). When LLMs are merely asked to produce parallel sentences, the resulting patterns are too simplistic to be valuable as a corpus. To address this issue, we propose an automated framework that includes a Subject Selector, Grammar Selector, Prompt Manager, and Evaluator. Additionally, we introduce a new dataset for GEC tasks, named ChatLang-8, which encompasses eight types of subject nouns and 23 types of grammar. …

