Improving Mathematical Reasoning with Process Supervision

May 31, 2023, 7 a.m.

OpenAI Blog openai.com

We've trained a model to achieve a new state of the art in mathematical problem solving by rewarding each correct step of reasoning (“process supervision”) instead of simply rewarding the correct final answer (“outcome supervision”). In addition to boosting performance relative to outcome supervision, process supervision also has an important alignment benefit: it directly trains the model to produce a chain-of-thought that is endorsed by humans.
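
To make the distinction concrete, here is a minimal toy sketch of the two reward signals. It is an illustration only and not the training setup described in the post: the `Step` class, the `outcome_reward` and `process_rewards` functions, and the hand-written step labels are all hypothetical. The example solution reaches the right final answer through two mistakes that cancel out, which outcome supervision cannot see.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    """One reasoning step plus a (hypothetical) human correctness label."""
    text: str
    is_correct: bool


def outcome_reward(final_answer: str, reference_answer: str) -> float:
    """Outcome supervision: a single reward based only on the final answer."""
    return 1.0 if final_answer.strip() == reference_answer.strip() else 0.0


def process_rewards(steps: List[Step]) -> List[float]:
    """Process supervision: one reward per reasoning step."""
    return [1.0 if step.is_correct else 0.0 for step in steps]


# Toy solution to "solve 2x + 3 = 11" (labels are assumed, written by hand).
steps = [
    Step("Start from 2x + 3 = 11.", is_correct=True),
    Step("Subtract 3 from both sides: 2x = 14.", is_correct=False),  # should be 8
    Step("Divide both sides by 2: x = 4.", is_correct=False),        # 14 / 2 is 7
]
final_answer = "4"
reference_answer = "4"

print(outcome_reward(final_answer, reference_answer))  # single scalar reward
print(process_rewards(steps))                          # per-step rewards
```

In this sketch the outcome signal rewards the solution in full because the final answer matches, while the per-step signal flags both faulty steps. That per-step feedback is what the post refers to as rewarding each correct step of reasoning.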


