Feb. 9, 2024, 5:43 a.m. | Jonathan Thomm, Aleksandar Terzic, Geethan Karunaratne, Giacomo Camposampiero, Bernhard Schölkopf, Abbas Rahimi

cs.LG updates on arXiv.org

We analyze the capabilities of Transformer language models in learning discrete algorithms. To this end, we introduce two new tasks that demand the composition of several discrete sub-tasks. Both by training LLaMA models from scratch and by prompting GPT-4 and Gemini, we measure how well these models learn compositions of previously learned primitives. We observe that the compositional capabilities of state-of-the-art Transformer language models are very limited, and that they scale worse in sample efficiency than relearning all sub-tasks for a new algorithmic composition. We also present a theorem in …
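For intuition, here is a minimal sketch of what a "composition of discrete sub-tasks" evaluation might look like. The primitives (reverse, letter shift) and the example format are illustrative assumptions, not the paper's actual tasks: a model trained or prompted on each primitive in isolation is then tested on a composition it has never seen end to end.

```python
# Hypothetical sketch of a compositional-task dataset; the specific
# primitives and format are assumptions, not taken from the paper.
import random
import string

def reverse(s: str) -> str:
    # Primitive 1: reverse the character sequence.
    return s[::-1]

def shift(s: str) -> str:
    # Primitive 2: Caesar-shift each lowercase letter by one.
    return "".join(chr((ord(c) - 97 + 1) % 26 + 97) for c in s)

PRIMITIVES = {"reverse": reverse, "shift": shift}

def make_example(ops: list[str], length: int = 6) -> tuple[str, str]:
    """Sample a random input and apply the named primitives in order."""
    x = "".join(random.choices(string.ascii_lowercase, k=length))
    y = x
    for op in ops:
        y = PRIMITIVES[op](y)
    return x, y

# Each primitive is seen alone during training/prompting; the composed
# task "shift(reverse(x))" is only presented at evaluation time.
for _ in range(3):
    x, y = make_example(["reverse", "shift"])
    print(f"input: {x}  ->  target: {y}")
```

Under this framing, the paper's negative result would correspond to the model needing roughly as many composed examples as it would to learn the sub-tasks from scratch, rather than generalizing from the primitives it already knows.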
