[D] Is Mamba as scalable as Transformers, or just another efficient model?
Feb. 1, 2024, 5:17 a.m. | /u/Dry_Cheesecake_8311
Machine Learning · www.reddit.com
The authors of Mamba claim that their 'Mamba-3B model outperforms Transformers of the same size and matches Transformers twice its size.'
What about a larger model, say a hypothetical Mamba-13B, against Mixtral 8x7B, both with large pre-training data? Has anyone experimented with this?
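For context, here is a minimal sketch of what a size-matched comparison could look like. The checkpoints, sample text, and single-sentence perplexity metric are my own assumptions, not from the post; `state-spaces/mamba-2.8b-hf` and `EleutherAI/pythia-2.8b` are roughly parameter-matched, and loading the Mamba checkpoint needs a recent `transformers` release (>= 4.39):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def avg_nll(model_name: str, text: str) -> float:
    """Average per-token negative log-likelihood of `text` under the model."""
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    model.eval()
    input_ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # Passing labels=input_ids makes the model return the causal-LM
        # loss: mean cross-entropy over next-token predictions.
        loss = model(input_ids, labels=input_ids).loss
    return loss.item()

sample = "State space models promise linear-time sequence modeling."
for name in ("state-spaces/mamba-2.8b-hf",  # Mamba, ~2.8B parameters
             "EleutherAI/pythia-2.8b"):     # Transformer of similar size
    print(f"{name}: avg NLL = {avg_nll(name, sample):.3f}")
```

Perplexity on one sentence is only a smoke test; the paper's claim is about zero-shot accuracy on standard benchmarks, so a fair comparison would run something like EleutherAI's lm-evaluation-harness over models trained on matched data.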