Feb. 12, 2024, 6:20 a.m. | /u/mrcet007

Machine Learning www.reddit.com

Which are good resources or books on efficiently deploying classical ML in production at very high throughput? Say, 100k requests per second for inference, with a need for low latency.

I am not talking about scaling or deploying transformers or neural networks in production, but about classical ML models for classification/regression using, say, LightGBM, XGBoost, RF, SVM, etc. at this scale.

Looking for sources that talk about improving model efficiency, data ETL efficiency for inference, etc.

I couldn't find resources for classical ML …

