QONNX: Representing Arbitrary-Precision Quantized Neural Networks. (arXiv:2206.07527v3 [cs.LG] UPDATED)
June 27, 2022, 1:11 a.m. | Alessandro Pappalardo, Yaman Umuroglu, Michaela Blott, Jovan Mitrevski, Ben Hawks, Nhan Tran, Vladimir Loncar, Sioni Summers, Hendrik Borras, Jules Mu
cs.LG updates on arXiv.org
We present extensions to the Open Neural Network Exchange (ONNX) intermediate representation format to represent arbitrary-precision quantized neural networks. We first introduce support for low-precision quantization in existing ONNX-based quantization formats by leveraging integer clipping, resulting in two new backward-compatible variants: the quantized operator format with clipping and the quantize-clip-dequantize (QCDQ) format. We then introduce a novel higher-level ONNX format called quantized ONNX (QONNX) that introduces three new operators -- Quant, BipolarQuant, and Trunc -- in order to represent uniform …
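The abstract's quantize-clip-dequantize (QCDQ) idea can be sketched in a few lines of numpy. This is a minimal illustration, not the paper's implementation: it assumes symmetric uniform quantization with a signed integer container, and the function name and parameters are hypothetical.

```python
import numpy as np

def qcdq(x, scale, bits):
    """Hypothetical sketch of quantize-clip-dequantize (QCDQ).

    Sub-8-bit precision is expressed with standard quantize/dequantize
    steps plus an integer clip to the narrower bit-width's range.
    """
    # Quantize: round to the integer grid defined by `scale`
    q = np.round(x / scale)
    # Clip: restrict to the range of a signed `bits`-bit integer;
    # this clip is what encodes the low precision in ordinary ONNX ops
    lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    q = np.clip(q, lo, hi)
    # Dequantize: map back to floating point
    return q * scale

# Example: with scale 0.25 and 4 bits, values quantize onto a
# 16-level grid spanning [-2.0, 1.75]
print(qcdq(np.array([1.0, 10.0]), 0.25, 4))  # -> [1.   1.75]
```

The clip step is why the two variants described above remain backward-compatible: an 8-bit consumer can execute the graph unchanged, while a precision-aware backend can recover the narrower bit-width from the clip bounds.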