all AI news
Multistage Collaborative Knowledge Distillation from a Large Language Model for Semi-Supervised Sequence Generation
Feb. 28, 2024, 5:44 a.m. | Jiachen Zhao, Wenlong Zhao, Andrew Drozdov, Benjamin Rozonoyer, Md Arafat Sultan, Jay-Yoon Lee, Mohit Iyyer, Andrew McCallum
cs.LG updates on arXiv.org arxiv.org
Abstract: We study semi-supervised sequence generation tasks, where the few labeled examples are too scarce to finetune a model, and meanwhile, few-shot prompted large language models (LLMs) exhibit room for improvement. In this paper, we present the discovery that a student model distilled from a few-shot prompted LLM can commonly generalize better than its teacher to unseen examples on such tasks. We find that the student is able to learn a general pattern from the high-quality …
abstract arxiv collaborative cs.cl cs.lg discovery distillation examples few-shot improvement knowledge language language model language models large language large language model large language models llms paper room semi-supervised study tasks type
More from arxiv.org / cs.LG updates on arXiv.org
Jobs in AI, ML, Big Data
Software Engineer for AI Training Data (School Specific)
@ G2i Inc | Remote
Software Engineer for AI Training Data (Python)
@ G2i Inc | Remote
Software Engineer for AI Training Data (Tier 2)
@ G2i Inc | Remote
Data Engineer
@ Lemon.io | Remote: Europe, LATAM, Canada, UK, Asia, Oceania
Artificial Intelligence – Bioinformatic Expert
@ University of Texas Medical Branch | Galveston, TX
Intern - Robotics Industrial Engineer Summer 2024
@ Vitesco Technologies | Seguin, US