[P] Up to 12X faster GPU inference on Bert, T5 and other transformers with OpenAI Triton kernels

Oct. 26, 2022, 6:10 a.m. | /u/pommedeterresautee

Machine Learning www.reddit.com

We are releasing [Kernl](https://github.com/ELS-RD/kernl/) under the Apache 2 license, a library that makes PyTorch model inference significantly faster. With one line of code, we applied the optimizations and made Bert up to 12X faster than the Hugging Face baseline. T5 is also covered in this first release (more than 6X speedup on generation, and we are still only halfway through the optimizations!). This was possible because we wrote custom GPU kernels in Triton, the new OpenAI programming language, and leveraged TorchDynamo.

**Project link**: …
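A figure like "12X faster" is a ratio of latencies, so here is a minimal sketch of how such a ratio could be measured. It is not Kernl's benchmark code: `median_latency`, `speedup`, `baseline_forward` and `optimized_forward` are hypothetical names, and the two workloads are CPU stand-ins for a baseline and an optimized model call.

```python
import statistics
import time
from typing import Callable


def median_latency(fn: Callable[[], object], warmup: int = 3, runs: int = 20) -> float:
    """Return the median wall-clock time of fn() in seconds."""
    # Warm-up calls are discarded so one-off costs (caches, lazy
    # initialization, compilation) do not skew the measurement.
    for _ in range(warmup):
        fn()
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def speedup(baseline: Callable[[], object], optimized: Callable[[], object]) -> float:
    """Ratio of baseline latency to optimized latency (values > 1 mean faster)."""
    return median_latency(baseline) / median_latency(optimized)


# Hypothetical stand-ins for "model before" and "model after" optimization.
def baseline_forward() -> int:
    return sum(i * i for i in range(200_000))


def optimized_forward() -> int:
    return sum(i * i for i in range(20_000))


if __name__ == "__main__":
    print(f"speedup: {speedup(baseline_forward, optimized_forward):.1f}x")
```

When timing real GPU inference, the device would need to be synchronized before reading the clock (for example with `torch.cuda.synchronize()` in PyTorch), because kernel launches are asynchronous and the call can return before the work has finished.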

