Spectraformer: A Unified Random Feature Framework for Transformer
May 27, 2024, 4:42 a.m. | Duke Nguyen, Aditya Joshi, Flora Salim
cs.LG updates on arXiv.org
Abstract: Linearization of attention using various kernel approximation and kernel learning techniques has shown promise. Past methods use a subset of combinations of component functions and weight matrices within the random features paradigm. We identify the need for a systematic comparison of different combinations of weight matrix and component functions for attention learning in Transformer. In this work, we introduce Spectraformer, a unified framework for approximating and learning the kernel function in linearized attention of the …
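To make the abstract's terminology concrete, the following is a minimal sketch of the general random features idea, not Spectraformer itself. It assumes the classical random Fourier features setup: a Gaussian weight matrix `W` and a trigonometric component function, which together approximate a Gaussian kernel. The function names `random_features` and `linearized_attention` are hypothetical and chosen only for illustration.

```python
import numpy as np

def random_features(x, W):
    """Map inputs to random features (illustrative, assumed setup).

    W is the weight matrix (m x d, here drawn from a standard normal).
    The component function is trigonometric (cos and sin), so that
    random_features(x, W) @ random_features(y, W).T approximates the
    Gaussian kernel exp(-||x - y||^2 / 2).
    """
    proj = x @ W.T                                   # (n, m) projections
    m = W.shape[0]
    feats = np.concatenate([np.cos(proj), np.sin(proj)], axis=-1)
    return feats / np.sqrt(m)                        # (n, 2m)

def linearized_attention(Q, K, V, W):
    """Kernelized attention computed without the n x n attention matrix.

    Note: trigonometric features are not guaranteed to be positive, so the
    normalizer can be unstable for large-norm inputs; this sketch ignores that.
    """
    phi_q = random_features(Q, W)                    # (n, 2m)
    phi_k = random_features(K, W)                    # (n, 2m)
    kv = phi_k.T @ V                                 # (2m, d_v), keys and values summarized once
    z = phi_k.sum(axis=0)                            # (2m,), normalizer statistics
    numerator = phi_q @ kv                           # (n, d_v)
    denominator = phi_q @ z                          # (n,)
    return numerator / denominator[:, None]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d, m = 8, 4, 4096
    Q = 0.3 * rng.standard_normal((n, d))
    K = 0.3 * rng.standard_normal((n, d))
    V = rng.standard_normal((n, d))
    W = rng.standard_normal((m, d))                  # Gaussian weight matrix
    print(linearized_attention(Q, K, V, W).shape)
```

Because the keys and values are summarized once in `kv` and `z`, the cost grows linearly with sequence length rather than quadratically. Swapping the distribution of `W` or the component function gives different members of the random features family, which is the kind of combination the abstract says is being compared.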