Approximation and Estimation Ability of Transformers for Sequence-to-Sequence Functions with Infinite Dimensional Input
March 26, 2024, 4:49 a.m. | Shokichi Takakura, Taiji Suzuki
stat.ML updates on arXiv.org
Abstract: Despite the great success of Transformer networks in applications such as natural language processing and computer vision, their theoretical aspects are not well understood. In this paper, we study the approximation and estimation ability of Transformers as sequence-to-sequence functions with infinite-dimensional inputs. Although inputs and outputs are both infinite-dimensional, we show that when the target function has anisotropic smoothness, Transformers can avoid the curse of dimensionality due to their feature extraction ability …
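For context, "anisotropic smoothness" is usually formalized by giving each input coordinate its own smoothness exponent. The display below is an illustrative sketch in our own notation, not necessarily the exact function class used in the paper:

\[
  \mathcal{F}^{a} \;=\; \Bigl\{\, f = \sum_{k} \theta_k \phi_k \;:\; \sum_{k} \Bigl( \prod_{i} (1 + k_i)^{a_i} \Bigr)^{2} \theta_k^{2} \,<\, \infty \,\Bigr\},
\]

where \(\phi_k\) is a fixed basis indexed by multi-indices \(k = (k_1, k_2, \dots)\) and \(a = (a_1, a_2, \dots)\) is a smoothness vector with \(a_i\) growing in \(i\). Coordinates with large \(a_i\) contribute little, so an estimator effectively needs to resolve only a few directions; this is the standard mechanism by which anisotropic classes escape the curse of dimensionality even with infinite-dimensional input.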
More from arxiv.org / stat.ML updates on arXiv.org:
Mixture of partially linear experts | 4 hours ago | arxiv.org
Adaptive deep learning for nonlinear time series models | 1 day, 4 hours ago | arxiv.org
A Full Adagrad algorithm with O(Nd) operations | 1 day, 4 hours ago | arxiv.org
Minimax Regret Learning for Data with Heterogeneous Subgroups | 1 day, 4 hours ago | arxiv.org