April 9, 2024, 5:30 p.m. | /u/gokulPRO

Machine Learning www.reddit.com

Which model currently offers the best performance for a reasonable dataset size and parameter count? I am asking about the efficiency of the architecture, since I plan to train from scratch on a different task and may not be able to afford the volume of data used by SOTA models.

Edit: If only decoder-only models are considered, which would you prefer? And if only encoder-decoder models are considered, which would you prefer?

