Spectraformer: A Unified Random Feature Framework for Transformer
May 27, 2024, 4:42 a.m. | Duke Nguyen, Aditya Joshi, Flora Salim
cs.LG updates on arXiv.org arxiv.org
Abstract: Linearizing attention with various kernel approximation and kernel learning techniques has shown promise. Past methods use only a subset of the possible combinations of component functions and weight matrices within the random features paradigm. We identify the need for a systematic comparison of different combinations of weight matrices and component functions for attention learning in Transformers. In this work, we introduce Spectraformer, a unified framework for approximating and learning the kernel function in linearized attention of the …
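To make the random-features idea concrete, the sketch below shows one point in the design space the paper surveys: a Performer-style positive random feature map paired with a Gaussian weight matrix, used to approximate softmax attention in linear time. This is an illustrative sketch, not the paper's method; the function names, feature count, and scaling choices are assumptions.

```python
import numpy as np

def positive_random_features(x, W):
    """Component function: positive features exp(x W^T - ||x||^2 / 2) / sqrt(m),
    an unbiased estimator of the exponential (softmax) kernel for Gaussian W."""
    m = W.shape[0]
    return np.exp(x @ W.T - np.sum(x**2, axis=-1, keepdims=True) / 2.0) / np.sqrt(m)

def linearized_attention(Q, K, V, num_features=2048, seed=0):
    """Approximate softmax attention as phi(Q) @ (phi(K)^T V), row-normalized.
    Cost is linear in sequence length n instead of quadratic."""
    d = Q.shape[-1]
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((num_features, d))   # weight matrix: i.i.d. Gaussian
    Qs, Ks = Q / d**0.25, K / d**0.25            # fold in softmax's 1/sqrt(d) scaling
    phi_q = positive_random_features(Qs, W)      # (n, m)
    phi_k = positive_random_features(Ks, W)      # (n, m)
    kv = phi_k.T @ V                             # (m, d_v): the only n-sized pass
    z = phi_q @ phi_k.sum(axis=0)                # per-row normalizer
    return (phi_q @ kv) / z[:, None]

def softmax_attention(Q, K, V):
    """Exact quadratic-cost reference for comparison."""
    A = Q @ K.T / np.sqrt(Q.shape[-1])
    A = np.exp(A - A.max(axis=-1, keepdims=True))
    return (A / A.sum(axis=-1, keepdims=True)) @ V
```

Swapping in a different component function (e.g. trigonometric features) or a structured weight matrix (e.g. orthogonal blocks) yields other combinations of the kind the framework compares.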