[R] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding

April 22, 2024, 8:46 a.m. | /u/SeawaterFlows

Machine Learning www.reddit.com

**Paper**: [https://arxiv.org/abs/2404.11912](https://arxiv.org/abs/2404.11912)

**Code**: [https://github.com/Infini-AI-Lab/TriForce](https://github.com/Infini-AI-Lab/TriForce)

**Project page**: [https://infini-ai-lab.github.io/TriForce/](https://infini-ai-lab.github.io/TriForce/)

**Abstract**:

>With large language models (LLMs) widely deployed in long content generation recently, there has emerged an increasing demand for efficient long-sequence inference support. However, key-value (KV) cache, which is stored to avoid re-computation, has emerged as a critical bottleneck by growing linearly in size with the sequence length. Due to the auto-regressive nature of LLMs, the entire KV cache will be loaded for every generated token, resulting in low utilization of computational …

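As a rough illustration of the linear growth the abstract describes, here is a back-of-the-envelope sketch of KV cache size as a function of sequence length. The function name `kv_cache_bytes` and the default dimensions (32 layers, 32 KV heads, head dimension 128, 2-byte elements) are assumptions chosen to resemble a 7B-class model; they are not taken from the paper.

```python
def kv_cache_bytes(seq_len, n_layers=32, n_kv_heads=32, head_dim=128,
                   bytes_per_elem=2, batch_size=1):
    """Estimate KV cache size in bytes for a decoder-only transformer.

    All default dimensions are illustrative assumptions, not values
    from the TriForce paper.
    """
    # Each layer stores one key tensor and one value tensor (hence the 2),
    # each holding n_kv_heads * head_dim elements per token.
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem
    return per_token * seq_len * batch_size


if __name__ == "__main__":
    for n in (4096, 32768, 131072):
        print(f"{n:>7} tokens: {kv_cache_bytes(n) / 2**30:.1f} GiB")
```

Under these assumed dimensions each token adds 0.5 MiB, so a 131,072-token context needs about 64 GiB of cache, and, as the abstract notes, the whole cache is loaded for every generated token.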