s
May 14, 2024, 8:42 p.m. |

Simon Willison's Weblog simonwillison.net

Context caching for Google Gemini


Another new Gemini feature announced today. Long context models enable answering questions against large chunks of text, but the price of those long prompts can be prohibitive—$3.50/million for Gemini Pro 1.5 up to 128,000 tokens and $7/million beyond that.

Context caching offers a price optimization, where the long prefix prompt can be reused between requests, halving the cost per prompt but at an additional cost of $4.50 / 1 million tokens per hour to keep …

ai beyond caching context feature gemini gemini pro generativeai google google gemini llms optimization price price optimization pro 1.5 prompt promptengineering prompts questions text tokens

Software Engineer for AI Training Data (School Specific)

@ G2i Inc | Remote

Software Engineer for AI Training Data (Python)

@ G2i Inc | Remote

Software Engineer for AI Training Data (Tier 2)

@ G2i Inc | Remote

Data Engineer

@ Lemon.io | Remote: Europe, LATAM, Canada, UK, Asia, Oceania

Artificial Intelligence – Bioinformatic Expert

@ University of Texas Medical Branch | Galveston, TX

Intern - Robotics Industrial Engineer Summer 2024

@ Vitesco Technologies | Seguin, US