ShortGPT: Layers in Large Language Models are More Redundant Than You Expect
March 11, 2024, 11:44 a.m. | /u/SunsetOneSix
Natural Language Processing | www.reddit.com
**Abstract**:
> As Large Language Models (LLMs) continue to advance in performance, their size has escalated significantly, with current LLMs containing billions or even trillions of parameters. However, in this study, we discovered that many layers of LLMs exhibit high similarity, and some layers play a negligible role in network functionality. Based on this observation, we define a metric called **Block Influence** (**BI**) to gauge the significance of each layer in LLMs. We then propose a straightforward pruning approach: …