May 18, 2023, 8:41 a.m. | /u/Greedy-Cupcake-3694

Deep Learning www.reddit.com

I wrote a tutorial on improving GPT completion throughput with dynamic batching: [https://microsoft.github.io/batch-inference/examples/gpt_completion.html](https://microsoft.github.io/batch-inference/examples/gpt_completion.html). With it I can achieve 16× the throughput of the baseline on a V100. We also built a Python dynamic batching library so you can easily apply the same technique to your own models: [https://github.com/microsoft/batch-inference](https://github.com/microsoft/batch-inference).
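To make the idea concrete, here is a minimal sketch of dynamic batching in plain Python. It is not the batch-inference library's API; the class and parameter names (`DynamicBatcher`, `max_batch_size`, `max_wait_ms`) are hypothetical. Callers submit single requests, a background worker waits briefly to gather a batch, runs the model once, and returns each caller its own result:

```python
import threading
import time

class DynamicBatcher:
    """Sketch of dynamic batching (hypothetical names, not the batch-inference API)."""

    def __init__(self, model_fn, max_batch_size=8, max_wait_ms=5):
        self.model_fn = model_fn              # callable that takes a list of inputs
        self.max_batch_size = max_batch_size  # upper bound on batch size
        self.max_wait_ms = max_wait_ms        # how long to wait while collecting a batch
        self._queue = []                      # pending request slots
        self._lock = threading.Lock()
        threading.Thread(target=self._worker, daemon=True).start()

    def predict(self, x):
        # Enqueue a single request and block until the batched result is ready.
        slot = {"input": x, "done": threading.Event(), "output": None}
        with self._lock:
            self._queue.append(slot)
        slot["done"].wait()
        return slot["output"]

    def _worker(self):
        while True:
            time.sleep(self.max_wait_ms / 1000.0)
            with self._lock:
                batch = self._queue[: self.max_batch_size]
                self._queue = self._queue[self.max_batch_size:]
            if not batch:
                continue
            # One forward pass over the whole batch, then fan results back out.
            outputs = self.model_fn([s["input"] for s in batch])
            for slot, out in zip(batch, outputs):
                slot["output"] = out
                slot["done"].set()

# Usage: wrap any batch-capable model function; concurrent callers get batched together.
batcher = DynamicBatcher(lambda xs: [x * 2 for x in xs])
print(batcher.predict(21))  # -> 42
```

The throughput win comes from the worker amortizing one model invocation over many concurrent requests instead of running the model once per request.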


Although the tutorial we built for GPT shows promising throughput results, it doesn't use more complex decoding algorithms such as top-p sampling or beam search, and we are aware of more advanced batching algorithms for GPT completion. So we're …
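For readers unfamiliar with the decoding algorithms mentioned above, here is a minimal sketch of top-p (nucleus) sampling in NumPy. It is not taken from the tutorial; the function name and signature are hypothetical. The idea is to keep only the smallest set of tokens whose cumulative probability reaches p, then sample from that renormalized set:

```python
import numpy as np

def top_p_sample(logits, p=0.9, rng=None):
    """Sketch of top-p (nucleus) sampling over a vector of next-token logits."""
    rng = rng or np.random.default_rng()
    # Softmax with the usual max-subtraction for numerical stability.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    # Sort tokens by probability (descending) and find the nucleus cutoff.
    order = np.argsort(probs)[::-1]
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, p) + 1   # smallest prefix with cumulative prob >= p
    keep = order[:cutoff]
    # Renormalize within the nucleus and sample one token id.
    kept_probs = probs[keep] / probs[keep].sum()
    return rng.choice(keep, p=kept_probs)

# Example: sample the next token id from a toy logit vector.
print(top_p_sample(np.array([2.0, 1.0, 0.5, -1.0]), p=0.9))
```

Batching such decoders is harder than batching greedy completion because each sequence in the batch can finish or branch at different steps, which is part of why the tutorial sticks to the simpler case.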
