all AI news
[D][P] Adding FlashAttention to any HuggingFace model
April 30, 2023, 10:30 a.m. | /u/supreethrao
Machine Learning www.reddit.com
I've wanted to add flash attention to models on huggingface (particularly the LLaMA variants) is there a guide/playbook on going about adding different attention mechanisms to existing models? In the grander scheme of this I would like to build this out as a library where you pass in a model and it gives out the model with a different attention mechanism. Would this be of use since PyTorch 2.0 already supports flash attention.
​
Thanks !
attention attention mechanisms guide huggingface library llama machinelearning variants
More from www.reddit.com / Machine Learning
Jobs in AI, ML, Big Data
AI Research Scientist
@ Vara | Berlin, Germany and Remote
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne
Senior Machine Learning Engineer (MLOps)
@ Promaton | Remote, Europe
Senior Software Engineer, Generative AI (C++)
@ SoundHound Inc. | Toronto, Canada