May 7, 2024, 9:04 a.m. | /u/kiockete

Machine Learning www.reddit.com

The EOS token used during pretraining marks "end of sequence", but it does not prevent information from flowing across potentially unrelated documents. If so, why even include it during pretraining when we can add it later in the SFT phase?
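To make the setup in the question concrete, here is a minimal sketch of how pretraining data is typically packed: documents are concatenated with an EOS token between them and chunked into fixed-length sequences, so a plain causal mask lets tokens attend across the EOS boundary. The token id `EOS = 0` and the function names are illustrative assumptions, not any specific library's API. The sketch also shows the alternative the question implicitly contrasts with: a document-aware causal mask that blocks cross-document attention.

```python
# Sketch of packed pretraining data (hypothetical token ids; EOS = 0 is
# an assumption). Documents are concatenated with EOS separators and
# chunked into fixed-length training sequences.

EOS = 0  # assumed EOS token id

def pack_documents(docs, seq_len):
    """Concatenate token-id lists with EOS separators, then chunk."""
    stream = []
    for doc in docs:
        stream.extend(doc)
        stream.append(EOS)
    return [stream[i:i + seq_len] for i in range(0, len(stream), seq_len)]

def document_causal_mask(seq):
    """mask[i][j] is True where position i may attend to position j:
    causal (j <= i) AND within the same document (no crossing an EOS
    boundary). A plain causal mask would drop the second condition."""
    n = len(seq)
    doc_id, count = [], 0
    for tok in seq:
        doc_id.append(count)      # document index = EOS tokens seen so far
        if tok == EOS:
            count += 1
    return [[j <= i and doc_id[i] == doc_id[j] for j in range(n)]
            for i in range(n)]

docs = [[5, 6, 7], [8, 9]]            # two unrelated "documents"
chunks = pack_documents(docs, seq_len=6)
mask = document_causal_mask(chunks[0])
# With only a causal mask, token 8 (position 4) could attend to token 7
# (position 2) despite the EOS between them; the document mask forbids it.
```

Under a plain causal mask the model does see across EOS boundaries, which is exactly the leakage the question describes; the document-aware mask above is one way some training setups avoid it.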

