Adaptive RAG: A retrieval technique to reduce LLM token cost for top-k Vector Index retrieval [R]
March 28, 2024, 6:55 p.m. | /u/dxtros
Machine Learning www.reddit.com
We demonstrate a technique that dynamically adapts the number of documents in a top-k retriever RAG prompt using feedback from the LLM. This yields a 4x reduction in the LLM token cost of RAG question answering while maintaining the same accuracy. We also show that the method helps explain the lineage of LLM outputs.
The reference implementation works with most models (GPT-4, many local models, older GPT-3.5 Turbo) and can be used with most vector databases exposing a …
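The core loop of the adaptive top-k idea can be sketched as follows: start with a small k, ask the model, and widen the retrieval window only when the model signals it cannot answer from the retrieved context. This is a minimal illustrative sketch, not the authors' reference implementation; the `retrieve` and `ask_llm` names and the geometric k schedule are assumptions.

```python
# Hedged sketch of adaptive top-k RAG: grow k only when the LLM reports
# insufficient context. The stubs below stand in for a real vector index
# and a real LLM call; they are purely illustrative.

def adaptive_rag(question, retrieve, ask_llm, k_start=2, k_max=16):
    """Grow top-k geometrically until the LLM answers or k_max is hit."""
    k = k_start
    while k <= k_max:
        docs = retrieve(question, k)      # top-k vector similarity search
        answer = ask_llm(question, docs)  # prompt contains the k documents
        if answer is not None:            # model found enough context
            return answer, k              # final k hints at output lineage
        k *= 2                            # insufficient context: widen
    return None, k_max

# Toy stand-ins so the sketch runs end to end:
CORPUS = ["doc-%d" % i for i in range(100)]

def retrieve(question, k):
    return CORPUS[:k]                     # pretend this is a ranked index

def ask_llm(question, docs):
    # Pretend the answer is only recoverable once "doc-5" is in context.
    return "42" if "doc-5" in docs else None

answer, k_used = adaptive_rag("What is the answer?", retrieve, ask_llm)
```

Because most questions are answerable at a small k, the average prompt stays short, which is where the claimed token-cost reduction would come from; the k at which the model finally answers also narrows down which documents informed the output.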