Feb. 28, 2024, 5:41 a.m. | Sheng Wang, Boyang Xue, Jiacheng Ye, Jiyue Jiang, Liheng Chen, Lingpeng Kong, Chuan Wu

cs.LG updates on arXiv.org

arXiv:2402.16902v1 Announce Type: new
Abstract: With the rapid scaling of large language models (LLMs), serving numerous LoRAs concurrently has become increasingly impractical, leading to unaffordable costs and necessitating more parameter-efficient finetuning methods. In this work, we introduce Partially Rotation-enhanced Low-Rank Adaptation (PRoLoRA), an intra-layer sharing mechanism comprising four essential components: broadcast reduction, rotation enhancement, partially-sharing refinement, and a rectified initialization strategy. As a superset of LoRA, PRoLoRA retains its advantages and effectively circumvents the drawbacks of peer parameter-sharing methods with superior …
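The abstract describes intra-layer sharing inside a LoRA adapter. Below is a minimal, illustrative sketch of that general idea only: a small low-rank chunk is stored once and reused several times within one layer ("broadcast reduction"), with each replica circularly shifted so the copies are not identical (a stand-in for "rotation enhancement"), plus a small unshared remainder (loosely mirroring "partially-sharing refinement"). All names, the use of torch.roll, and the rank split are assumptions for illustration, not the authors' implementation; the exact PRoLoRA design and its rectified initialization are in arXiv:2402.16902.

```python
import torch
import torch.nn as nn


class SharedLoRALinear(nn.Module):
    """Hypothetical LoRA-style layer that reuses a shared low-rank chunk several times."""

    def __init__(self, in_features, out_features, rank=8, shares=2, unshared_rank=2):
        super().__init__()
        assert (rank - unshared_rank) % shares == 0
        chunk_rank = (rank - unshared_rank) // shares
        self.shares = shares
        # Frozen pretrained weight (random here only for the sketch).
        self.weight = nn.Parameter(torch.randn(out_features, in_features), requires_grad=False)
        # Shared chunks: stored once, reused `shares` times ("broadcast reduction").
        self.A_shared = nn.Parameter(torch.randn(chunk_rank, in_features) * 0.01)
        self.B_shared = nn.Parameter(torch.zeros(out_features, chunk_rank))
        # Small unshared remainder, so only part of the adapter is tied.
        self.A_free = nn.Parameter(torch.randn(unshared_rank, in_features) * 0.01)
        self.B_free = nn.Parameter(torch.zeros(out_features, unshared_rank))

    def forward(self, x):
        # Rebuild full-rank factors by tiling the shared chunk and circularly
        # shifting each replica so the copies differ ("rotation enhancement" stand-in).
        a_chunks = [torch.roll(self.A_shared, shifts=i, dims=1) for i in range(self.shares)]
        b_chunks = [torch.roll(self.B_shared, shifts=i, dims=0) for i in range(self.shares)]
        A = torch.cat(a_chunks + [self.A_free], dim=0)   # (rank, in_features)
        B = torch.cat(b_chunks + [self.B_free], dim=1)   # (out_features, rank)
        return x @ self.weight.T + x @ A.T @ B.T         # frozen path + low-rank update


x = torch.randn(2, 32)
layer = SharedLoRALinear(in_features=32, out_features=64)
print(layer(x).shape)  # torch.Size([2, 64])
```

Because B is initialized to zero, the low-rank update starts as a no-op, as in standard LoRA; the trainable parameter count is driven by the small shared chunk rather than the full rank, which is the memory argument the abstract makes.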

