EulerFormer: Sequential User Behavior Modeling with Complex Vector Attention

March 27, 2024, 4:42 a.m. | Zhen Tian, Wayne Xin Zhao, Changwang Zhang, Xin Zhao, Zhongrui Ma, Ji-Rong Wen

cs.LG updates on arXiv.org arxiv.org

arXiv:2403.17729v1 Announce Type: cross
Abstract: To capture user preference, transformer models have been widely applied to model sequential user behavior data. The core of the transformer architecture lies in the self-attention mechanism, which computes pairwise attention scores over a sequence. Because self-attention is permutation-equivariant, positional encoding is used to enhance the attention between token representations. In this setting, the pairwise attention scores can be derived from both semantic difference and positional difference. However, prior studies often model the two …
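The permutation-equivariance point in the abstract can be illustrated with a minimal sketch. This is not the EulerFormer method; it is plain scaled dot-product attention in pure Python. Two simplifications are assumptions made here for brevity: the query and key projections are taken to be the identity, and the positional encoding is the fixed sinusoidal scheme from the original Transformer paper. All function names are hypothetical.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over one list of logits."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_scores(x):
    """Pairwise attention scores: row-wise softmax of scaled dot products.

    Assumption: queries and keys are the inputs themselves (identity projections).
    """
    scale = math.sqrt(len(x[0]))
    logits = [[sum(a * b for a, b in zip(xi, xj)) / scale for xj in x] for xi in x]
    return [softmax(row) for row in logits]


def sinusoidal_positions(n, d):
    """Fixed sinusoidal positional encoding (sin on even dims, cos on odd dims)."""
    pe = []
    for pos in range(n):
        row = []
        for k in range(d):
            angle = pos / (10000 ** (2 * (k // 2) / d))
            row.append(math.sin(angle) if k % 2 == 0 else math.cos(angle))
        pe.append(row)
    return pe


def add_rows(x, pe):
    """Element-wise sum of two equally shaped matrices."""
    return [[a + b for a, b in zip(xr, pr)] for xr, pr in zip(x, pe)]


def permute_scores(scores, perm):
    """Apply the same permutation to the rows and columns of a score matrix."""
    return [[scores[i][j] for j in perm] for i in perm]


def max_abs_diff(a, b):
    """Largest absolute element-wise difference between two matrices."""
    return max(abs(u - v) for ra, rb in zip(a, b) for u, v in zip(ra, rb))


rng = random.Random(0)
n, d = 5, 8
x = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]  # toy token embeddings
perm = [3, 0, 4, 1, 2]                                           # reorder the sequence
x_perm = [x[p] for p in perm]
pe = sinusoidal_positions(n, d)

# Without positions: shuffling the tokens only shuffles the score matrix.
diff_without_pe = max_abs_diff(attention_scores(x_perm),
                               permute_scores(attention_scores(x), perm))

# With positions tied to slots: the scores now depend on where each token sits.
diff_with_pe = max_abs_diff(attention_scores(add_rows(x_perm, pe)),
                            permute_scores(attention_scores(add_rows(x, pe)), perm))

print("max difference without positional encoding:", diff_without_pe)
print("max difference with positional encoding:   ", diff_with_pe)
```

In this sketch the first difference should be zero up to floating-point rounding, while the second should not be, which is the sense in which positional encoding lets attention scores reflect both semantic and positional differences.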


