May 7, 2024, 4:50 a.m. | Alvin Po-Chun Chen, Ray Groshan, Sean von Bayern

cs.CL updates on arXiv.org arxiv.org

arXiv:2405.02517v1 Announce Type: new
Abstract: Extensive research exists on the performance of large language models on logic-based tasks, whereas relatively little has been done on their ability to generate creative solutions on lateral thinking tasks. The BrainTeaser shared task tests lateral thinking and uses adversarial datasets to prevent memorization, resulting in poor performance for out-of-the-box models. We propose a system for iterative, chain-of-thought prompt engineering which optimizes prompts using human evaluation. Using this shared task, we demonstrate our system's ability …

abstract adversarial arxiv creative cs.cl datasets generate iterative language language models large language large language models logic optimization performance prompt research solutions tasks tests thinking thought type

Software Engineer for AI Training Data (School Specific)

@ G2i Inc | Remote

Software Engineer for AI Training Data (Python)

@ G2i Inc | Remote

Software Engineer for AI Training Data (Tier 2)

@ G2i Inc | Remote

Data Engineer

@ Lemon.io | Remote: Europe, LATAM, Canada, UK, Asia, Oceania

Artificial Intelligence – Bioinformatic Expert

@ University of Texas Medical Branch | Galveston, TX

Lead Developer (AI)

@ Cere Network | San Francisco, US