March 9, 2024, 9:44 a.m. | /u/joelthomas-

r/MachineLearning | www.reddit.com

Everyone is currently trying to make AI implementations faster and more efficient (more efficient -> less money spent on compute).

For instance, FlashAttention-2 is implemented in CUDA, and llama.cpp is written in C++.
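
For a sense of what "hand-written GPU code" means here, below is a minimal sketch of a CUDA kernel: a naive row-wise softmax (the normalization step inside attention). To be clear, this is illustrative only and not FlashAttention-2's actual kernel; the real thing fuses the whole attention computation and tiles data through on-chip SRAM so the full attention matrix never round-trips to HBM.

```cuda
#include <cstdio>
#include <math.h>
#include <cuda_runtime.h>

// Naive row-wise softmax: one thread block per row, shared-memory reductions.
// Assumes blockDim.x is a power of two (required by the tree reductions).
__global__ void softmax_rows(const float* x, float* y, int cols) {
    extern __shared__ float buf[];           // one float per thread
    const float* in  = x + blockIdx.x * cols;
    float*       out = y + blockIdx.x * cols;

    // 1) row max, for numerical stability
    float m = -INFINITY;
    for (int c = threadIdx.x; c < cols; c += blockDim.x)
        m = fmaxf(m, in[c]);
    buf[threadIdx.x] = m;
    __syncthreads();
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (threadIdx.x < s)
            buf[threadIdx.x] = fmaxf(buf[threadIdx.x], buf[threadIdx.x + s]);
        __syncthreads();
    }
    m = buf[0];
    __syncthreads();                         // buf is reused below

    // 2) sum of exponentials
    float sum = 0.f;
    for (int c = threadIdx.x; c < cols; c += blockDim.x)
        sum += expf(in[c] - m);
    buf[threadIdx.x] = sum;
    __syncthreads();
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (threadIdx.x < s)
            buf[threadIdx.x] += buf[threadIdx.x + s];
        __syncthreads();
    }
    sum = buf[0];

    // 3) normalize
    for (int c = threadIdx.x; c < cols; c += blockDim.x)
        out[c] = expf(in[c] - m) / sum;
}

int main() {
    const int rows = 2, cols = 8;
    float h[rows * cols];
    for (int i = 0; i < rows * cols; ++i) h[i] = (float)(i % cols);

    float *dx, *dy;
    cudaMalloc(&dx, sizeof(h));
    cudaMalloc(&dy, sizeof(h));
    cudaMemcpy(dx, h, sizeof(h), cudaMemcpyHostToDevice);

    const int threads = 128;                 // power of two
    softmax_rows<<<rows, threads, threads * sizeof(float)>>>(dx, dy, cols);
    cudaMemcpy(h, dy, sizeof(h), cudaMemcpyDeviceToHost);

    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) printf("%.4f ", h[r * cols + c]);
        printf("\n");
    }
    cudaFree(dx); cudaFree(dy);
    return 0;
}
```

The reason people drop to this level is control over memory movement and kernel fusion, which is exactly what prebuilt PyTorch ops can't always give you.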

Is PyTorch enough, or is there an advantage to learning CUDA/C++ in this market, especially for LLMs?

And if CUDA is useful in some cases, what are those cases?

