[P] CTranslate2: an efficient inference engine for Transformer models
May 23, 2022, 2:37 p.m. | /u/guillaumekln
Machine Learning www.reddit.com
I'd like to share this project I've been working on for almost 4 years:
[https://github.com/OpenNMT/CTranslate2](https://github.com/OpenNMT/CTranslate2)
CTranslate2 is a C++ and Python library for efficient inference with Transformer models. While the project initially focused on translation models (hence the name), it also supports autoregressive language models such as GPT-2 and the recent OPT models from Meta.
The library comes with a highly optimized runtime that implements various performance optimization techniques such as weight quantization, layer fusion, batch reordering, padding removal, …
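To make the first technique in that list concrete, here is a minimal pure-Python sketch of symmetric int8 weight quantization. It illustrates the general idea only and is not taken from the CTranslate2 code base: the function names `quantize_int8` and `dequantize`, the per-row scaling scheme, and the sample weights are all assumptions chosen for the example.

```python
def quantize_int8(weights):
    """Quantize each row of a float matrix to int8 with one scale per row.

    Hypothetical helper for illustration: the scale maps the largest
    absolute value in the row to 127.
    """
    quantized, scales = [], []
    for row in weights:
        max_abs = max(abs(w) for w in row)
        # Avoid dividing by zero for an all-zero row.
        scale = 127.0 / max_abs if max_abs > 0 else 1.0
        # Round to the nearest integer and clamp to the symmetric int8 range.
        quantized.append([max(-127, min(127, round(w * scale))) for w in row])
        scales.append(scale)
    return quantized, scales


def dequantize(quantized, scales):
    """Recover approximate float weights from int8 values and row scales."""
    return [[q / s for q in row] for row, s in zip(quantized, scales)]


# Made-up weights, not taken from any real model.
weights = [[0.5, -1.0, 0.25], [2.0, 0.0, -0.5]]
q, scales = quantize_int8(weights)
restored = dequantize(q, scales)
print(q)
print(restored)
```

Each weight is stored in 8 bits instead of 32, at the cost of a rounding error of at most about half a quantization step per value. Real inference engines typically pair this storage format with integer matrix multiplication kernels, which this sketch does not attempt to show.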