all AI news
Mothman at SemEval-2024 Task 9: An Iterative System for Chain-of-Thought Prompt Optimization
May 7, 2024, 4:50 a.m. | Alvin Po-Chun Chen, Ray Groshan, Sean von Bayern
cs.CL updates on arXiv.org arxiv.org
Abstract: Extensive research exists on the performance of large language models on logic-based tasks, whereas relatively little has been done on their ability to generate creative solutions on lateral thinking tasks. The BrainTeaser shared task tests lateral thinking and uses adversarial datasets to prevent memorization, resulting in poor performance for out-of-the-box models. We propose a system for iterative, chain-of-thought prompt engineering which optimizes prompts using human evaluation. Using this shared task, we demonstrate our system's ability …
abstract adversarial arxiv creative cs.cl datasets generate iterative language language models large language large language models logic optimization performance prompt research solutions tasks tests thinking thought type
More from arxiv.org / cs.CL updates on arXiv.org
Jobs in AI, ML, Big Data
Software Engineer for AI Training Data (School Specific)
@ G2i Inc | Remote
Software Engineer for AI Training Data (Python)
@ G2i Inc | Remote
Software Engineer for AI Training Data (Tier 2)
@ G2i Inc | Remote
Data Engineer
@ Lemon.io | Remote: Europe, LATAM, Canada, UK, Asia, Oceania
Artificial Intelligence – Bioinformatic Expert
@ University of Texas Medical Branch | Galveston, TX
Lead Developer (AI)
@ Cere Network | San Francisco, US