June 28, 2024, 4:42 a.m. | Trung Quoc Luong, Xinbo Zhang, Zhanming Jie, Peng Sun, Xiaoran Jin, Hang Li

cs.CL updates on arXiv.org arxiv.org

arXiv:2401.08967v2 Announce Type: replace
Abstract: One way to enhance the reasoning capability of Large Language Models (LLMs) is to conduct Supervised Fine-Tuning (SFT) using Chain-of-Thought (CoT) annotations. This approach does not show sufficiently strong generalization ability, however, because the training only relies on the given CoT data. In math problem-solving, for example, there is usually only one annotated reasoning path for each question in the training data. Intuitively, it would be better for the algorithm to learn from multiple annotated …

abstract annotations arxiv capability cs.cl data example fine-tuning however language language models large language large language models llms math problem problem-solving reasoning replace sft show supervised fine-tuning thought training tuning type

Data Scientist

@ Ford Motor Company | Chennai, Tamil Nadu, India

Systems Software Engineer, Graphics

@ Parallelz | Vancouver, British Columbia, Canada - Remote

Engineering Manager - Geo Engineering Team (F/H/X)

@ AVIV Group | Paris, France

Data Analyst

@ Microsoft | San Antonio, Texas, United States

Azure Data Engineer

@ TechVedika | Hyderabad, India

Senior Data & AI Threat Detection Researcher (Cortex)

@ Palo Alto Networks | Tel Aviv-Yafo, Israel