July 27, 2022, 1:11 a.m. | Amit Chaulwar, Lukas Malik, Maciej Krajewski, Felix Reichel, Leif-Nissen Lundbæk, Michael Huth, Bartlomiej Matejczyk

cs.CL updates on arXiv.org

Modern search systems use several large ranker models with transformer
architectures. These models require substantial computational resources and
are therefore not suitable for use on devices with limited compute. Knowledge
distillation is a popular compression technique that can reduce the resource
needs of such models: a large teacher model transfers its knowledge to a
small student model. To drastically reduce memory requirements and energy
consumption, we propose two extensions for a popular sentence-transformer
distillation procedure: generation of an optimal size …
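As a rough illustration of the teacher-student setup the abstract refers to (not the procedure proposed in the paper), the sketch below trains a small sentence-transformer student to reproduce a larger teacher's embeddings with an MSE loss. The model names, toy sentences, and training hyperparameters are illustrative assumptions; both checkpoints produce 384-dimensional embeddings so the loss can be computed directly.

```python
# Minimal distillation sketch, assuming a teacher and student with the
# same embedding dimension (384). Not the authors' exact procedure.
import torch
from torch import nn
from sentence_transformers import SentenceTransformer

teacher = SentenceTransformer("sentence-transformers/all-MiniLM-L12-v2")      # larger teacher (assumed choice)
student = SentenceTransformer("sentence-transformers/paraphrase-MiniLM-L3-v2")  # smaller student (assumed choice)

# Toy corpus standing in for a large unlabeled sentence collection.
sentences = ["how do I reset my password", "nearest coffee shop open now"]

optimizer = torch.optim.AdamW(student.parameters(), lr=2e-5)
loss_fn = nn.MSELoss()

# Teacher embeddings are computed once and treated as fixed targets.
with torch.no_grad():
    targets = teacher.encode(sentences, convert_to_tensor=True)

for step in range(3):  # a few toy steps; real distillation runs over a large corpus
    features = student.tokenize(sentences)
    embeddings = student(features)["sentence_embedding"]
    loss = loss_fn(embeddings, targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

After training, the student can replace the teacher at inference time on resource-constrained edge devices, trading some ranking quality for a much smaller memory and energy footprint.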

