Half-precision Inference Doubles On-Device Inference Performance
The TensorFlow Blog blog.tensorflow.org
Posted by Marat Dukhan and Frank Barchard, Software Engineers
CPUs deliver the widest reach for ML inference and remain the default target for TensorFlow Lite. Consequently, improving CPU inference performance is a top priority, and we are excited to announce that we doubled floating-point inference performance in TensorFlow Lite’s XNNPack backend by enabling half-precision inference on ARM CPUs. This means that more AI-powered features can be deployed to older and lower-tier devices.
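To see why half precision helps, consider what FP16 changes at the numeric level. The sketch below (illustrative only; it uses NumPy, not TensorFlow Lite, and does not reflect the XNNPack implementation) shows the two effects that matter: each value occupies 2 bytes instead of 4, halving memory traffic, while the 10-bit mantissa retains only about three significant decimal digits.

```python
import numpy as np

# Half precision (IEEE 754 binary16) stores each value in 2 bytes
# instead of the 4 bytes of single precision. Halved memory traffic,
# plus native FP16 arithmetic on newer ARM cores, is what makes
# FP16 inference faster on CPU.
weights_fp32 = np.random.rand(1000).astype(np.float32)
weights_fp16 = weights_fp32.astype(np.float16)

print(weights_fp32.nbytes)  # 4000 bytes
print(weights_fp16.nbytes)  # 2000 bytes -- half the memory traffic

# FP16 has a 10-bit mantissa (~3 significant decimal digits), which is
# usually sufficient for inference but can cost accuracy on some models.
x = np.float32(1.2345678)
print(np.float16(x))
```

The accuracy trade-off is why FP16 inference is typically opt-in rather than the default: the runtime must verify that the model tolerates the reduced precision.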
Traditionally, TensorFlow Lite supported two kinds …