all AI news
[N] Live and open training of BigScience's 176B multilingual language model has just started
March 16, 2022, 4:38 p.m. | /u/Thomjazz
Machine Learning www.reddit.com
Here are more information on the model, dataset, engineering, training and hardware:
1. **The model**:
* 176B parameters decoder-only architecture (GPT-like)
* 70 layers - 112 attention heads per layers - hidden dimensionality of 14336 - 2048 tokens sequence length
* ALiBi positional embeddings - GeLU activation function
* **Read more**:
* Blog post summarizing how …
More from www.reddit.com / Machine Learning
Jobs in AI, ML, Big Data
Senior Machine Learning Engineer
@ GPTZero | Toronto, Canada
ML/AI Engineer / NLP Expert - Custom LLM Development (x/f/m)
@ HelloBetter | Remote
Doctoral Researcher (m/f/div) in Automated Processing of Bioimages
@ Leibniz Institute for Natural Product Research and Infection Biology (Leibniz-HKI) | Jena
Seeking Developers and Engineers for AI T-Shirt Generator Project
@ Chevon Hicks | Remote
Senior Applied Data Scientist
@ dunnhumby | London
Principal Data Architect - Azure & Big Data
@ MGM Resorts International | Home Office - US, NV