July 5, 2023, 1:56 p.m. | /u/korec1234

Machine Learning www.reddit.com

Inspired by Andrej Karpathy's nanoGPT, we have improved the repo for pre-training T5 models in PyTorch. **In \~16 hours on a single GPU, we achieve 40.7 RougeL on the SNI (Super-Natural Instructions) benchmark, compared to 40.9 RougeL for the original model pre-trained on 150x more data!**
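For readers unfamiliar with the metric, here is a minimal sketch of how a RougeL-style score can be computed from the longest common subsequence (LCS) of tokens. It assumes lowercased whitespace tokenization, no stemming, and the balanced F1 variant. The function names `lcs_length` and `rouge_l_f1` are made up for illustration, and the benchmark's own evaluation script may tokenize and aggregate differently.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """Simplified RougeL F1 in [0, 1] (assumption: whitespace tokens, no stemming)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    score = rouge_l_f1("the cat sat on the mat", "the cat is on the mat")
    print(round(100 * score, 1))  # 83.3 (LCS of 5 tokens out of 6 on each side)
```

Scores such as 40.7 are this kind of value scaled to 0-100 and averaged over the evaluation examples.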

Key upgrade in nanoT5 v2: we now use BF16 precision and a simplified T5 model implementation based on Hugging Face's design. The new implementation is easy to read and compatible with HF checkpoints. **Pre-training is now 2x faster than our …**
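As background on what BF16 trades away, the sketch below emulates bfloat16 rounding in plain Python: BF16 keeps float32's sign and 8 exponent bits but only the top 7 mantissa bits, so the dynamic range is unchanged while precision drops to roughly 2-3 decimal digits. The helper `to_bf16` is a hypothetical name, handles only finite values within float32 range, and is not taken from the nanoT5 code. In PyTorch, BF16 is commonly enabled with `torch.autocast(device_type="cuda", dtype=torch.bfloat16)`, though the post does not say which mechanism the repo uses.

```python
import struct


def to_bf16(x: float) -> float:
    """Round a finite float to the nearest bfloat16 value (ties to even).

    Illustration only: NaN, infinities, and values outside float32 range
    are not handled.
    """
    # Reinterpret the float32 encoding of x as a 32-bit unsigned integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Round-to-nearest-even on the 16 low bits that bfloat16 discards.
    bias = 0x7FFF + ((bits >> 16) & 1)
    bits = (bits + bias) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


if __name__ == "__main__":
    print(to_bf16(1.001))      # 1.0: the step size near 1.0 is 2**-7
    print(to_bf16(1.0078125))  # 1.0078125: exactly 1 + 2**-7, representable
    print(to_bf16(1e30))       # still about 1e30: exponent range matches float32
```

The wide exponent range is why BF16 training usually does not need the loss scaling that FP16 requires.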

