Jetfire: Efficient and Accurate Transformer Pretraining with INT8 Data Flow and Per-Block Quantization
March 20, 2024, 4:41 a.m. | Haocheng Xi, Yuxiang Chen, Kang Zhao, Kaijun Zheng, Jianfei Chen, Jun Zhu
cs.LG updates on arXiv.org (arxiv.org)
Abstract: Pretraining transformers is generally time-consuming. Fully quantized training (FQT) is a promising approach to speed up pretraining. However, most FQT methods adopt a quantize-compute-dequantize procedure, which often leads to suboptimal speedup and significant performance degradation when used in transformers due to the high memory access overheads and low-precision computations. In this work, we propose Jetfire, an efficient and accurate INT8 training method specific to transformers. Our method features an INT8 data flow to optimize memory …
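The abstract contrasts the usual quantize-compute-dequantize pattern with an INT8 data flow combined with per-block quantization. As a rough illustration of the per-block idea only (not the authors' implementation), the NumPy sketch below assigns one symmetric INT8 scale to each 32x32 tile of a matrix, so a local outlier only degrades precision inside its own block; the block size, symmetric scaling, and function names are assumptions made for this example.

```python
# Illustrative sketch of per-block INT8 quantization (not the Jetfire code).
# Block size 32 and symmetric per-tile scaling are assumptions for the example.
import numpy as np

def quantize_per_block(x: np.ndarray, block: int = 32):
    """Quantize a 2-D FP32 matrix to INT8 with one scale per (block x block) tile."""
    rows, cols = x.shape
    assert rows % block == 0 and cols % block == 0, "pad to a multiple of the block size"
    # View the matrix as a grid of tiles: (rows//block, cols//block, block, block)
    tiles = x.reshape(rows // block, block, cols // block, block).transpose(0, 2, 1, 3)
    # One symmetric scale per tile, mapping the tile's max magnitude to 127
    scales = np.abs(tiles).max(axis=(2, 3), keepdims=True) / 127.0
    scales = np.maximum(scales, 1e-8)  # avoid division by zero for all-zero tiles
    q = np.clip(np.round(tiles / scales), -128, 127).astype(np.int8)
    return q, scales

def dequantize_per_block(q: np.ndarray, scales: np.ndarray, shape):
    """Recover an approximate FP32 matrix from INT8 tiles and per-tile scales."""
    rows, cols = shape
    return (q.astype(np.float32) * scales).transpose(0, 2, 1, 3).reshape(rows, cols)

# Usage: a per-block scale confines the effect of an outlier to its own tile.
x = np.random.randn(64, 64).astype(np.float32)
x[0, 0] = 50.0  # outlier lands in a single block
q, s = quantize_per_block(x)
x_hat = dequantize_per_block(q, s, x.shape)
print("max abs reconstruction error:", np.abs(x - x_hat).max())
```

With a single per-tensor scale, the outlier above would stretch the quantization range for the whole matrix; giving each tile its own scale keeps the rounding error small everywhere else, which is the intuition behind per-block schemes.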