June 26, 2024, 10:16 a.m. | /u/realAIsation

machinelearningnews www.reddit.com

**Etched is launching its custom chip, Sohu,** designed specifically for transformer models. Sohu is *fast*: Etched claims **500,000+ tokens per second** on Llama 70B. By the company's figures, that's an order of magnitude faster than NVIDIA's upcoming monster GPU, the GB200.

