May 1, 2024, 4:47 a.m. | James A. Michaelov, Catherine Arnett, Benjamin K. Bergen

cs.CL updates on arXiv.org

arXiv:2404.19178v1 Announce Type: new
Abstract: Transformers have supplanted Recurrent Neural Networks as the dominant architecture for both natural language processing tasks and, despite criticisms of cognitive implausibility, for modelling the effect of predictability on online human language comprehension. However, two recently developed recurrent neural network architectures, RWKV and Mamba, appear to perform natural language tasks comparably to or better than transformers of equivalent scale. In this paper, we show that contemporary recurrent models are now also able to match - …
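In work of this kind, "predictability" is typically operationalised as surprisal, the negative log-probability a language model assigns to each word given its preceding context, which is then related to human reading measures. The sketch below illustrates that computation with the Hugging Face `transformers` API; the checkpoint name (`gpt2`) is a placeholder assumption, not the RWKV or Mamba model evaluated in the paper.

```python
# Minimal sketch: per-token surprisal from a causal language model.
# Surprisal = -log2 P(token | left context) is the usual operationalisation
# of "predictability" in studies of online comprehension.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumption: swap in an RWKV or Mamba checkpoint for the paper's setting
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

sentence = "The children went outside to play."
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, seq_len, vocab_size)

# Log-probability of each token given its left context.
log_probs = torch.log_softmax(logits, dim=-1)
input_ids = inputs["input_ids"]

# Token i is predicted from position i-1, so shift by one to align.
token_log_probs = log_probs[0, :-1].gather(
    1, input_ids[0, 1:].unsqueeze(-1)
).squeeze(-1)

# Convert from nats to bits.
surprisal = -token_log_probs / torch.log(torch.tensor(2.0))
for tok, s in zip(tokenizer.convert_ids_to_tokens(input_ids[0, 1:]), surprisal):
    print(f"{tok:>12s}  {s.item():6.2f} bits")
```

Per-token surprisals like these are what get regressed against reading times or N400 amplitudes when comparing architectures as cognitive models.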

