May 6, 2024, 2:04 p.m. | /u/Objective-Camel-3726

Machine Learning www.reddit.com

I just noticed some guy created a 120B Instruct variant of Llama 3 by merging it with itself (end result duplication of 60 / 80 layers). He seems to specialize in these Frankenstein models. For the life of me, I really don't understand this trend. These are easy breezy to create with mergekit, and I wonder about their commercial utility in the wild. Bud even concedes its not better than say, GPT-4. So what's the point? Oh wait, he gets …

create easy life llama llama 3 machinelearning merging trend

Software Engineer for AI Training Data (School Specific)

@ G2i Inc | Remote

Software Engineer for AI Training Data (Python)

@ G2i Inc | Remote

Software Engineer for AI Training Data (Tier 2)

@ G2i Inc | Remote

Data Engineer

@ Lemon.io | Remote: Europe, LATAM, Canada, UK, Asia, Oceania

Artificial Intelligence – Bioinformatic Expert

@ University of Texas Medical Branch | Galveston, TX

Lead Developer (AI)

@ Cere Network | San Francisco, US