Exploring Hacker News by mapping and analyzing 40 million posts and comments for fun
May 10, 2024, 4:42 p.m.
Simon Willison's Weblog simonwillison.net
A real tour de force of data engineering. Wilson Lin fetched 40 million posts and comments from the Hacker News API (using Node.js with a custom multi-process worker pool) and then ran them all through the BGE-M3 embedding model using RunPod, which let him fire up ~150 GPU instances to get the whole run done in a few hours, using a custom RocksDB and Rust queue …
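To make the fetching step concrete, here is a minimal sketch of pulling a range of item IDs from the public Hacker News API with a pool of workers. This is not Wilson Lin's code: his version used Node.js with a custom multi-process worker pool, while this swaps in a Python thread pool to show the same idea. The names `fetch_item` and `fetch_range` and the `workers` default are my own assumptions for illustration.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

# Public Hacker News API endpoint for a single item (story, comment, etc.).
HN_ITEM_URL = "https://hacker-news.firebaseio.com/v0/item/{id}.json"


def fetch_item(item_id):
    """Fetch one item by ID. The API returns JSON null for IDs with no item."""
    with urlopen(HN_ITEM_URL.format(id=item_id), timeout=10) as resp:
        return json.load(resp)


def fetch_range(start_id, end_id, fetch=fetch_item, workers=8):
    """Fetch items start_id..end_id (inclusive) concurrently.

    `fetch` is injectable so the pool logic can be exercised without
    touching the network. Returns {id: item} for the items that exist.
    """
    ids = range(start_id, end_id + 1)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so ids and results line up.
        results = pool.map(fetch, ids)
        return {i: item for i, item in zip(ids, results) if item is not None}
```

Calling `fetch_range(1, 100)` would request the first hundred items. A run over all 40 million would also need retries, rate limiting and persistent storage, none of which this sketch attempts.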