March 30, 2024, 12:50 p.m. | /u/necrashter

Machine Learning www.reddit.com

I've created a simple script that trains a [Mamba](https://arxiv.org/abs/2312.00752) model for character-level language modeling. It can be considered the Mamba version of the popular [char-rnn](https://github.com/karpathy/char-rnn) repository.

[GitHub Repository](https://github.com/necrashter/char-mamba)

Any plain text file can be used as a dataset. By default, it will automatically download and use the [Tiny Shakespeare dataset](https://raw.githubusercontent.com/karpathy/char-rnn/master/data/tinyshakespeare/input.txt).
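Because the dataset is just a plain text file, the preprocessing a char-level script needs is small: build a vocabulary of the unique characters, then map characters to integer IDs and back. The sketch below is illustrative only (function names are my own, not taken from the repository):

```python
# Hypothetical sketch of character-level preprocessing: derive the
# vocabulary from the text itself, then encode/decode between
# characters and integer token IDs.

def build_vocab(text):
    """Return char->id and id->char mappings over the unique characters."""
    chars = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(chars)}
    itos = {i: ch for ch, i in stoi.items()}
    return stoi, itos

def encode(text, stoi):
    """Map a string to a list of integer token IDs."""
    return [stoi[ch] for ch in text]

def decode(ids, itos):
    """Map a list of token IDs back to a string."""
    return "".join(itos[i] for i in ids)

sample = "First Citizen:\nBefore we proceed"
stoi, itos = build_vocab(sample)
ids = encode(sample, stoi)
assert decode(ids, itos) == sample  # encoding round-trips losslessly
```

Deriving the vocabulary from the file itself is what makes any plain text file usable without configuration.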

Since the code is quite simple, it can also be used as a **template for training Mamba models from scratch**, applicable to a wide array of sequence-to-sequence problems.
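To make the "template" framing concrete: character-level language modeling is trained as next-character prediction, where each input chunk is paired with the same chunk shifted one position to the right. A minimal sketch of that pairing, assuming encoded token IDs and an illustrative `block_size` parameter (neither taken from the repository):

```python
# Illustrative sketch of forming (input, target) training pairs for
# next-character prediction: the target is the input shifted by one,
# so position i predicts token i+1.

def make_examples(ids, block_size):
    """Yield (input, target) pairs of token-ID lists of length block_size."""
    for start in range(len(ids) - block_size):
        x = ids[start : start + block_size]
        y = ids[start + 1 : start + block_size + 1]
        yield x, y

ids = list(range(10))  # stand-in for an encoded text
pairs = list(make_examples(ids, block_size=4))
# first pair: x = [0, 1, 2, 3], y = [1, 2, 3, 4]
```

The same shifted-pair layout applies whether the backbone is an RNN, a Transformer, or a Mamba block, which is what makes the script reusable across sequence-to-sequence problems.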

I hope …
