all AI news
TPU v4: An Optically Reconfigurable Supercomputer for Machine Learning with Hardware Support for Embeddings. (arXiv:2304.01433v3 [cs.AR] UPDATED)
cs.LG updates on arXiv.org arxiv.org
In response to innovations in machine learning (ML) models, production
workloads changed radically and rapidly. TPU v4 is the fifth Google domain
specific architecture (DSA) and its third supercomputer for such ML models.
Optical circuit switches (OCSes) dynamically reconfigure its interconnect
topology to improve scale, availability, utilization, modularity, deployment,
security, power, and performance; users can pick a twisted 3D torus topology if
desired. Much cheaper, lower power, and faster than Infiniband, OCSes and
underlying optical components are <5% of system …
architecture arxiv components cost deployment dsa embeddings google hardware infiniband innovations machine machine learning ml models performance power production scale security supercomputer support topology tpu