Feb. 6, 2024, 5:44 a.m. | Shivanshu Shekhar, Tanishq Dubey, Koyel Mukherjee, Apoorv Saxena, Atharv Tyagi, Nishanth Kotla

cs.LG updates on arXiv.org arxiv.org

Generative AI, and LLMs in particular, are heavily used nowadays for various document processing tasks such as question answering and summarization. However, different LLMs come with different capabilities for different tasks, as well as different costs, tokenization, and latency. In fact, enterprises are already incurring significant costs operating or using LLMs for their respective use cases.
In this work, we propose optimizing the usage costs of LLMs by estimating their output quality (without actually invoking the LLMs), and …
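The core idea described above, choosing among LLMs by predicting output quality before spending money on an invocation, can be sketched as a simple cost-aware router. The model names, per-token prices, and the lookup-table quality predictor below are all hypothetical placeholders (the paper's actual estimator would be a learned model); this is only a minimal illustration of the routing logic, not the authors' implementation.

```python
# Hypothetical sketch of cost-aware LLM routing: pick the cheapest model
# whose *predicted* output quality clears a threshold, without invoking
# any LLM. All model names, prices, and scores are illustrative.

MODELS = [
    {"name": "small-llm",  "cost_per_1k_tokens": 0.0005},
    {"name": "medium-llm", "cost_per_1k_tokens": 0.003},
    {"name": "large-llm",  "cost_per_1k_tokens": 0.03},
]

def predict_quality(model_name: str, task: str) -> float:
    """Stand-in for a learned quality estimator, e.g. a small model
    trained on past (task, LLM, score) data. Returns a score in [0, 1]."""
    table = {
        ("small-llm", "summarization"): 0.82,
        ("small-llm", "question_answering"): 0.61,
        ("medium-llm", "summarization"): 0.90,
        ("medium-llm", "question_answering"): 0.78,
        ("large-llm", "summarization"): 0.95,
        ("large-llm", "question_answering"): 0.92,
    }
    return table[(model_name, task)]

def route(task: str, quality_threshold: float) -> str:
    """Return the cheapest model whose predicted quality meets the
    threshold; fall back to the highest-quality model if none does."""
    for m in sorted(MODELS, key=lambda m: m["cost_per_1k_tokens"]):
        if predict_quality(m["name"], task) >= quality_threshold:
            return m["name"]
    return max(MODELS, key=lambda m: predict_quality(m["name"], task))["name"]

print(route("summarization", 0.8))       # cheap model suffices here
print(route("question_answering", 0.9))  # only the large model qualifies
```

The trade-off this exposes is the usual one: a lower quality threshold routes more traffic to cheap models and cuts cost, at the risk of degraded outputs when the quality estimator is miscalibrated.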

