April 8, 2024, 8:36 p.m. | /u/gokulPRO

Deep Learning www.reddit.com

I am planning to work on large multimodal training (1B parameters) for text + audio. For now I am thinking of going with PyTorch, DeepSpeed, and wandb. What do you recommend, and what do you generally use for distributed large-model training?

Do you use Hugging Face? I found it so heavily wrapped that it gets messy to access the bare backbones, but I haven't given it a proper try. For off-the-shelf models and custom dataset training that does sound …

