Serving 5 models with FastChat on a single GPU!
April 29, 2024, 7:58 a.m. | /u/SuperSecureHuman
Deep Learning | www.reddit.com
I was able to squeeze in 5 language models using vLLM, served via FastChat.
Here is a public instance for 72 hours. Please note that the battle mode is broken; the other tabs work.
[https://c8168701070daa5bf3.gradio.live/](https://c8168701070daa5bf3.gradio.live/)
Llama 3 8B (8K context)
Gemma 8B (8K context)
Phi 3 128K (18K context)
DeciLM 7B (8K context)
Stable LM 1.6B (4K context)
Is it stupid? …
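For anyone curious how a setup like this is wired together, here is a minimal sketch based on FastChat's documented multi-worker architecture: one controller, one vLLM worker per model, and the multi-tab Gradio server on top. The model paths, ports, and memory fractions below are illustrative assumptions, not the OP's exact configuration.

```shell
# Sketch of a multi-model FastChat + vLLM deployment on a single GPU.
# Assumes fastchat and vllm are installed; paths/ports are hypothetical.

# 1. Start the controller that routes requests to registered workers.
python3 -m fastchat.serve.controller --host 0.0.0.0 --port 21001 &

# 2. Launch one vLLM worker per model, each on its own port.
#    --max-model-len caps the context window, and --gpu-memory-utilization
#    limits each worker's share of VRAM so several models fit on one GPU.
python3 -m fastchat.serve.vllm_worker \
    --model-path meta-llama/Meta-Llama-3-8B-Instruct \
    --controller-address http://localhost:21001 \
    --port 31000 --worker-address http://localhost:31000 \
    --max-model-len 8192 --gpu-memory-utilization 0.25 &

python3 -m fastchat.serve.vllm_worker \
    --model-path stabilityai/stablelm-2-1_6b-chat \
    --controller-address http://localhost:21001 \
    --port 31001 --worker-address http://localhost:31001 \
    --max-model-len 4096 --gpu-memory-utilization 0.10 &

# (repeat the worker command for each remaining model)

# 3. Serve the multi-tab Gradio UI, which includes the arena/battle tabs.
python3 -m fastchat.serve.gradio_web_server_multi --share &
```

The per-worker memory fractions have to sum to well under 1.0, since vLLM pre-allocates its KV cache up front; getting five models resident at once is mostly a matter of tuning `--max-model-len` and `--gpu-memory-utilization` per model.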