April 24, 2022, 8:22 a.m. | /u/adenml

Data Science www.reddit.com

I remember when I automated models with crontab and 15 lines of shell

Now you need a huge pile of Airflow, Kafka, Snowflake, Spark, Stitch, Grafana, Presto, Amazon Athena, Redshift etc. behind your XGBoost model.

>90% of the ML models I've seen weren't even good enough to justify any kind of complex automation. The stupidest models are the clustering ones: the same old k-means, fitted 30 times to match the business knowledge about the client. After 2 years, the …

