Jan. 23, 2024, 2:14 a.m.

Simon Willison's Weblog (simonwillison.net)

Prompt Lookup Decoding


Really neat LLM optimization trick by Apoorv Saxena, who observed that it's common for sequences of tokens in LLM input to be reflected by the output - snippets included in a summarization, for example.


Apoorv's code performs a simple search for such prefixes and uses them to populate a set of suggested candidate token IDs during LLM token generation.
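
Here's a minimal sketch of the underlying idea in plain Python (not Apoorv's actual implementation; the function name and parameters are illustrative): match the trailing n-gram of the token sequence against earlier positions, and if it recurs, propose the tokens that followed it as speculative candidates.

```python
def find_candidate_tokens(input_ids, ngram_size=3, num_candidates=10):
    """Return candidate token IDs by matching the trailing n-gram of
    input_ids against earlier positions in the same sequence."""
    ngram = input_ids[-ngram_size:]
    # Scan earlier positions (most recent first) for a matching n-gram.
    for start in range(len(input_ids) - ngram_size - 1, -1, -1):
        if input_ids[start:start + ngram_size] == ngram:
            # Propose the tokens that followed the match as candidates.
            follow = input_ids[start + ngram_size:start + ngram_size + num_candidates]
            if follow:
                return follow
    return []  # No match: fall back to ordinary decoding.


if __name__ == "__main__":
    # Toy example: the trailing n-gram [4, 5, 6] also appears earlier,
    # so the tokens after that earlier occurrence are suggested.
    ids = [1, 2, 3, 4, 5, 6, 7, 8, 9, 3, 4, 5, 6]
    print(find_candidate_tokens(ids))  # -> [7, 8, 9, 3, 4, 5, 6]
```

The candidates are then verified by the model in a single forward pass, as in speculative decoding, so correct guesses cost roughly one step instead of many.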


The result appears to provide around a 2.4x speed-up in generating outputs!


Via @abacaj

