April 18, 2024, 4:47 a.m. | J. Pablo Muñoz, Jinjie Yuan, Nilesh Jain

cs.CL updates on arXiv.org

arXiv:2404.10934v1 Announce Type: cross
Abstract: Recently, several approaches have successfully demonstrated that weight-sharing Neural Architecture Search (NAS) can effectively explore a search space of elastic low-rank adapters (LoRA), enabling parameter-efficient fine-tuning (PEFT) and compression of large language models. In this paper, we introduce a novel approach called Shears, demonstrating how the integration of cost-effective sparsity and a proposed Neural Low-rank adapter Search (NLS) algorithm can further improve the efficiency of PEFT approaches. Results demonstrate the benefits of Shears compared to …
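The core idea of an elastic LoRA search space can be sketched in a few lines of PyTorch: a "super-adapter" is trained once at a maximum rank, and every lower-rank sub-adapter reuses (shares) its leading components, so a search algorithm can score many rank configurations without retraining each candidate. This is an illustrative reconstruction, not the paper's implementation; the class ElasticLoRALinear, the set_rank method, and all hyperparameters below are hypothetical, and the sparsification of the frozen base weights that Shears additionally applies is omitted.

    import torch
    import torch.nn as nn

    class ElasticLoRALinear(nn.Module):
        # Hypothetical sketch of an elastic LoRA adapter: one super-adapter of
        # size max_rank is trained; every smaller rank r reuses its first r
        # components (weight sharing), so an NLS-style search can evaluate many
        # rank configurations without retraining each one.
        def __init__(self, base: nn.Linear, max_rank: int = 16, alpha: float = 32.0):
            super().__init__()
            self.base = base
            for p in self.base.parameters():
                p.requires_grad = False      # PEFT: the frozen base is never updated
            self.alpha = alpha
            self.max_rank = max_rank
            self.active_rank = max_rank      # set per candidate by the search controller
            self.lora_A = nn.Parameter(torch.randn(max_rank, base.in_features) * 0.01)
            self.lora_B = nn.Parameter(torch.zeros(base.out_features, max_rank))

        def set_rank(self, r: int) -> None:
            assert 1 <= r <= self.max_rank
            self.active_rank = r

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            r = self.active_rank
            # Slice the shared super-adapter down to the active rank.
            delta = (x @ self.lora_A[:r].T) @ self.lora_B[:, :r].T
            return self.base(x) + (self.alpha / r) * delta

    # Evaluating a few points of the elastic search space on one layer:
    layer = ElasticLoRALinear(nn.Linear(768, 768), max_rank=16)
    x = torch.randn(2, 768)
    for r in (4, 8, 16):
        layer.set_rank(r)
        y = layer(x)                         # same shared weights at every rank

Because the sub-adapters share weights, evaluating a candidate rank costs only a forward pass, which is what makes a NAS over adapter ranks cheap compared to fine-tuning each configuration separately.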

