Jan. 19, 2024, 3 p.m. | James Briggs


Using a fully local Semantic Router for agentic AI with a llama.cpp LLM and Hugging Face embedding models.

There are many reasons we might decide to use local LLMs rather than a third-party service like OpenAI. It could be cost, privacy, compliance, or fear of the OpenAI apocalypse. To help you out, we made Semantic Router fully local, with local LLMs available via llama.cpp, such as Mistral 7B.
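For context, here is a minimal sketch of the local setup, assuming the semantic-router Python API as it was around this release; the route names and utterances below are invented for illustration:

```python
from semantic_router import Route
from semantic_router.encoders import HuggingFaceEncoder

# Example routes -- names and utterances are illustrative placeholders.
politics = Route(
    name="politics",
    utterances=[
        "who should I vote for?",
        "what do you think about the election?",
    ],
)
chitchat = Route(
    name="chitchat",
    utterances=[
        "how's the weather today?",
        "lovely day, isn't it?",
    ],
)
routes = [politics, chitchat]

# A local embedding model pulled from Hugging Face -- no API calls needed.
# HuggingFaceEncoder defaults to a small sentence-transformers model.
encoder = HuggingFaceEncoder()
```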

Using llama.cpp also enables the use of quantized GGUF models, reducing the memory footprint of deployed …
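Continuing the sketch, a quantized GGUF build of Mistral 7B can be loaded through llama-cpp-python and wired into the route layer. The model path below is a placeholder; a Q4_0 quantization brings a 7B model down to roughly 4 GB, versus around 14 GB for fp16 weights:

```python
from llama_cpp import Llama
from semantic_router.layer import RouteLayer
from semantic_router.llms.llamacpp import LlamaCppLLM

# Load a quantized GGUF build of Mistral 7B (path is a placeholder --
# download a .gguf file from Hugging Face first).
_llm = Llama(
    model_path="./mistral-7b-instruct-v0.2.Q4_0.gguf",
    n_gpu_layers=-1,  # offload all layers to GPU; set to 0 for CPU-only
    n_ctx=2048,
)
llm = LlamaCppLLM(name="Mistral-7B-v0.2-Instruct", llm=_llm, max_tokens=None)

# Fully local route layer: local embeddings + local LLM.
rl = RouteLayer(encoder=encoder, routes=routes, llm=llm)

print(rl("don't you love politics?").name)  # -> "politics"
```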

