Dec. 6, 2023, 4:13 p.m. | /u/thefreemanever

Deep Learning

Suppose we have an LLM that is 48GB in size. Can we run it on 2x 24GB or 3x 16GB GPUs (with no NVLink)? (By "run" I mean model inference.)
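
To make the arithmetic behind the question concrete, here is a rough sketch of a capacity check. The function name `fits_across_gpus` and the 10% `overhead_fraction` (memory reserved on each GPU for things like the KV cache, activations, and the CUDA context) are assumptions for illustration, not measured values. It only compares sizes; it says nothing about speed, which without NVLink would depend on how fast the GPUs can exchange data over PCIe.

```python
def fits_across_gpus(model_gb, gpu_gbs, overhead_fraction=0.1):
    """Check whether model weights fit in the combined usable GPU memory.

    model_gb: size of the model weights in GB.
    gpu_gbs: list of per-GPU memory sizes in GB.
    overhead_fraction: assumed share of each GPU kept free for runtime
        overhead (hypothetical default, tune for a real setup).
    Returns (fits, usable_gb).
    """
    usable_gb = sum(gb * (1.0 - overhead_fraction) for gb in gpu_gbs)
    return model_gb <= usable_gb, usable_gb


if __name__ == "__main__":
    setups = {
        "2x 24GB": [24, 24],
        "3x 16GB": [16, 16, 16],
    }
    for name, gpus in setups.items():
        fits, usable = fits_across_gpus(48, gpus)
        print(f"{name}: usable {usable:.1f} GB, 48 GB model fits: {fits}")
```

Under that assumption, both setups total exactly 48GB of raw memory, so a 48GB model leaves no room for anything besides the weights; the check only passes with zero overhead or with more total memory.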


