March 3, 2024, 7:08 a.m.

Simon Willison's Weblog simonwillison.net

The One Billion Row Challenge in Go: from 1m45s to 4s in nine solutions


How fast can you read a billion semicolon-delimited (name;float) lines, 13GB in total, and output a min/max/mean summary for each distinct name?


Ben Hoyt describes in detail his nine incrementally improved versions, all written in Go. The key optimizations were custom hash maps, optimized line parsing, and splitting the work across multiple CPU cores.


Via Hacker News

