Splitting GPT-J And Other Large Language Models Over Multiple GPUs
Jan. 6, 2022, 11:12 p.m. | /u/l33thaxman
Deep Learning www.reddit.com
As models get larger and larger, it becomes less feasible to load a model onto a single GPU. This video covers Parallelformers, a library that makes it easy to split GPT-J and other large models across multiple GPUs. This lets anyone with several consumer cards (3060s, 3080s, 3080 Tis, etc.) run this large model with CUDA acceleration, which was previously very difficult.
https://www.youtube.com/watch?v=CxCxXI2m2a0
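A quick back-of-the-envelope sketch of why the split is needed. The parameter count (~6B) and fp16 storage (2 bytes per parameter) are assumptions based on GPT-J-6B's published size; the function only counts weights, ignoring activations and KV cache:

```python
# Rough memory math: why GPT-J-6B won't fit on one consumer GPU in fp16,
# but fits comfortably when sharded across two.
# Assumptions: ~6.05B parameters, 2 bytes/param (fp16), weights only.

PARAMS = 6_050_000_000   # approximate GPT-J-6B parameter count
BYTES_FP16 = 2           # half-precision storage per parameter

def weights_gib(num_gpus: int) -> float:
    """Approximate fp16 weight footprint per GPU, in GiB."""
    return PARAMS * BYTES_FP16 / num_gpus / 2**30

print(f"1 GPU:  {weights_gib(1):.1f} GiB")  # over 11 GiB: too big for a 10 GiB 3080
print(f"2 GPUs: {weights_gib(2):.1f} GiB")  # under 6 GiB per card: leaves headroom
```

With Parallelformers the split itself is roughly a one-liner, `parallelize(model, num_gpus=2, fp16=True)`, applied after loading the model via Hugging Face `transformers`; that call shape is taken from the library's README, so verify it against the version you install.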
deeplearning gpt-j language language models large language models