April 18, 2024, 4:46 a.m. | Letian Peng, Jingbo Shang

cs.CL updates on arXiv.org arxiv.org

arXiv:2404.10877v1 Announce Type: new
Abstract: In this paper, we aim to generate text classification data given arbitrary class definitions (i.e., user instruction), so one can train a small text classifier without any human annotation or raw corpus. Compared with pioneer attempts, our proposed Incubator is the first framework that can handle complicated and even mutually dependent classes (e.g., "TED Talk given by Educator" and "Other"). Specifically, Incubator is an LLM firstly tuned on the instruction-to-data mappings that we obtained from …

abstract aim annotation arxiv class classification classifier classifiers cs.cl data definitions framework generate human incubating llm nothing paper raw small text text classification train type

Data Engineer

@ Lemon.io | Remote: Europe, LATAM, Canada, UK, Asia, Oceania

Artificial Intelligence – Bioinformatic Expert

@ University of Texas Medical Branch | Galveston, TX

Lead Developer (AI)

@ Cere Network | San Francisco, US

Research Engineer

@ Allora Labs | Remote

Ecosystem Manager

@ Allora Labs | Remote

Founding AI Engineer, Agents

@ Occam AI | New York