How to Serve LLM Completions in Production
Jan. 18, 2024, 9:19 p.m. | Mateusz Charytoniuk
DEV Community dev.to
Preparations
To start, you need to compile llama.cpp. You can follow their README for instructions.
The server is compiled alongside other targets by default.
Once the server is running, we can continue. We will use the PHP Resonance framework.
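The build steps above can be sketched as follows. This is a minimal sketch assuming the upstream llama.cpp repository URL and the server binary name used in releases from this period; the model path is a placeholder for whatever GGUF file you end up with.

```shell
# Clone and build llama.cpp; the HTTP server is compiled
# alongside the other targets by default
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make

# Start the built-in HTTP server with a GGUF model
# (the model path is a placeholder; adjust host/port to taste)
./server -m ./models/mistral-7b.Q4_K_M.gguf --host 127.0.0.1 --port 8080
```

With the defaults above, the server exposes an HTTP completion endpoint at http://127.0.0.1:8080.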
Troubleshooting
Obtaining Open-Source LLM
I recommend starting with either Llama 2 or Mistral. You need to download the pretrained weights and convert them to the GGUF format before llama.cpp can use them.
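A hedged sketch of the conversion step, assuming a llama.cpp checkout from this period (which shipped a `convert.py` script and a `quantize` binary); the weights directory is a placeholder for wherever you downloaded the model:

```shell
# Convert downloaded weights to GGUF (run from the llama.cpp checkout);
# /path/to/mistral-7b is a placeholder for your weights directory
python3 convert.py /path/to/mistral-7b --outfile mistral-7b-f16.gguf

# Optionally quantize to reduce memory usage;
# Q4_K_M is a common quality/size trade-off
./quantize mistral-7b-f16.gguf mistral-7b.Q4_K_M.gguf Q4_K_M
```

Quantizing is optional, but a 4-bit model is far easier to serve on commodity hardware than the full 16-bit weights.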
Starting Server Without a GPU
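For CPU-only inference, the main knobs are the thread count and the context size. A minimal sketch, assuming the server binary and flags from llama.cpp builds of this period; the values are illustrative, not tuned:

```shell
# CPU-only run: pin threads to the number of available cores
THREADS=$(nproc)

# -t sets the thread count, -c the context window in tokens;
# tune both for your machine (the model path is a placeholder)
./server -m mistral-7b.Q4_K_M.gguf -t "$THREADS" -c 2048 \
  --host 127.0.0.1 --port 8080

# Smoke-test the completion endpoint once the server is up
curl -s http://127.0.0.1:8080/completion \
  -H 'Content-Type: application/json' \
  -d '{"prompt": "Hello,", "n_predict": 16}'
```

Without a GPU, expect noticeably lower throughput; a quantized model and a modest context size keep latency tolerable.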