[D] Initializing a Small LLM to Reflect Natural Token Distribution
Jan. 30, 2024, 12:55 p.m. | /u/ez613
Machine Learning www.reddit.com
Is it feasible to set up the model's weights in such a way that the output of the final softmax layer, prior to any training, mirrors the distribution of tokens in the training data?
My initial thought is to initialize all weights and biases to zero, and then modify the softmax layer (whose input logits would initially all be zero, so on its own it would output a uniform distribution) by incorporating a pre-calculated vector of observed token probabilities. I haven't come across this approach in my research thus far …
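One way to picture the idea is the following minimal sketch, which is not taken from the post. It assumes the final layer computes logits = W h + b with a bias term, sets W to zero, and sets b to the log of the observed token frequencies, so that the softmax output equals those frequencies for any hidden state. The helper names (log_prior_bias, output_logits), the toy corpus, and the add-one smoothing are all illustrative assumptions.

```python
import math
from collections import Counter


def log_prior_bias(token_ids, vocab_size, smoothing=1.0):
    """Bias vector b such that softmax(b) is the smoothed unigram distribution.

    Add-`smoothing` counts are an assumption made here so that tokens
    never seen in the corpus do not produce log(0).
    """
    counts = Counter(token_ids)
    total = len(token_ids) + smoothing * vocab_size
    return [math.log((counts.get(i, 0) + smoothing) / total)
            for i in range(vocab_size)]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]


def output_logits(hidden, weight, bias):
    """Final projection: logits = W h + b, with W as a list of rows."""
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(weight, bias)]


# Hypothetical toy corpus of token ids over a 4-token vocabulary.
corpus = [0, 0, 0, 1, 1, 2]
vocab_size = 4

bias = log_prior_bias(corpus, vocab_size)
weight = [[0.0] * 3 for _ in range(vocab_size)]  # zero output projection
hidden = [0.7, -1.2, 0.3]                        # arbitrary hidden state

probs = softmax(output_logits(hidden, weight, bias))
print(probs)
```

With the output weights at zero, the hidden state has no effect on the logits, so only the final projection needs to be zeroed for the initial prediction to match the smoothed token frequencies. In this sketch the rest of the network could keep its usual initialization.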