Sept. 28, 2022, 9:38 a.m. | /u/eternalmathstudent

Computer Vision www.reddit.com

I understand CBOW and skip-gram, their respective architectures, and the intuition behind the models to a good extent. However, I have the following two burning questions:

1. Consider **CBOW** with **4 context words**. Why does the input layer have **4 full-vocabulary-length one-hot vectors** to represent these 4 words, whose average is then taken? Why can't it be just **1 vocabulary-length vector with 4 ones** (in other words, a **4-hot vector**)?
2. **CBOW** takes context words as inputs and predicts a …
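For what it's worth, the two formulations in question 1 produce the same hidden-layer activation, since averaging the projections of four one-hot vectors equals projecting a single 4-hot vector once and dividing by 4 (assuming the four context words are distinct). A minimal NumPy sketch, with hypothetical vocabulary size, embedding dimension, and word indices:

```python
import numpy as np

rng = np.random.default_rng(0)
V, d = 10, 5                      # hypothetical vocabulary size and embedding dim
W = rng.standard_normal((V, d))   # input embedding matrix, one row per word

context = [1, 3, 5, 7]            # hypothetical indices of 4 distinct context words

# Standard CBOW picture: one full-vocabulary one-hot vector per context word,
# each projected through W, then averaged.
one_hots = np.eye(V)[context]                     # shape (4, V)
avg_of_projections = (one_hots @ W).mean(axis=0)  # shape (d,)

# "4-hot" alternative: a single length-V vector with four ones,
# projected once and scaled by 1/4.
four_hot = one_hots.sum(axis=0)                   # shape (V,)
four_hot_projection = (four_hot @ W) / len(context)

assert np.allclose(avg_of_projections, four_hot_projection)
```

The one-hot-per-word picture is mostly pedagogical; real implementations skip the matrix product entirely and just index the embedding rows, so the per-word view also generalizes cleanly to repeated context words and to weighted averages.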

computervision word2vec
