WAVE: Weight Template for Adaptive Initialization of Variable-sized Models
June 26, 2024, 4:45 a.m. | Fu Feng, Yucheng Xie, Jing Wang, Xin Geng
cs.LG updates on arXiv.org arxiv.org
Abstract: The expansion of model parameters underscores the significance of pre-trained models; however, the constraints encountered during model deployment necessitate models of variable sizes. Consequently, the traditional pre-training and fine-tuning paradigm fails to address the initialization problem when target models are incompatible with pre-trained models. We tackle this issue from a multitasking perspective and introduce WAVE, which incorporates a set of shared Weight templates for Adaptive initialization of Variable-sizEd models. During initialization, target models will initialize …
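The abstract is cut off before the method details, so the following is only an illustrative sketch of the general idea it describes: a bank of shared weight templates from which layers of differently sized target models can be initialized. All names, shapes, and the mixing/tiling scheme here are hypothetical assumptions, not the paper's actual algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shared template bank. In WAVE's framing, templates would be
# learned during pre-training; here they are random for illustration.
TEMPLATE_SHAPE = (64, 64)
NUM_TEMPLATES = 4
templates = rng.standard_normal((NUM_TEMPLATES, *TEMPLATE_SHAPE))

def init_weight(out_dim, in_dim, coeffs):
    """Build an (out_dim, in_dim) weight matrix from the shared templates.

    coeffs: one mixing coefficient per template (assumed learned per target
    layer). The mixed template is tiled and cropped to the target shape --
    a stand-in for whatever size-adaptation the paper actually uses.
    """
    mixed = np.tensordot(coeffs, templates, axes=1)  # shape (64, 64)
    reps = (-(-out_dim // TEMPLATE_SHAPE[0]),        # ceil-divide rows
            -(-in_dim // TEMPLATE_SHAPE[1]))         # ceil-divide cols
    tiled = np.tile(mixed, reps)
    return tiled[:out_dim, :in_dim]

# Two target models of different sizes drawing on the same template bank.
w_small = init_weight(32, 48, coeffs=np.array([0.5, 0.5, 0.0, 0.0]))
w_large = init_weight(128, 96, coeffs=np.array([0.1, 0.2, 0.3, 0.4]))
```

The point of the sketch is only that the template bank is fixed and shared, while the per-layer coefficients and target shapes vary, so models of any size can be seeded from one pre-trained artifact.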