[R] Seeking input: transformer modification with 25-30% improvement in validation loss across 3 datasets
Nov. 17, 2023, 11:44 a.m. | /u/exmatrixmachina
Machine Learning www.reddit.com
I've been experimenting with some modifications to the transformer architecture (the addition of a new standalone component).
Recently I got to something that seems to improve validation loss by ~25-30% over vanilla decoder-only transformers; the task is next-token prediction.
My question is whether this is significant enough to dedicate more serious effort to (e.g. getting more compute credits to train a bigger model, running a …
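The excerpt doesn't say what the new standalone component is, so the following is only a rough sketch of the kind of A/B comparison being described: a small vanilla decoder versus the same decoder with an extra module slotted between blocks, both evaluated on next-token validation loss. `ExtraModule` here is a placeholder (a simple gated MLP), not the OP's actual modification.

```python
# Hypothetical sketch: vanilla decoder vs. decoder + extra standalone module,
# compared on next-token-prediction validation loss. The OP's component is
# not described; ExtraModule below is a stand-in only.
import torch
import torch.nn as nn


class DecoderBlock(nn.Module):
    def __init__(self, d_model=256, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                nn.Linear(4 * d_model, d_model))
        self.ln1 = nn.LayerNorm(d_model)
        self.ln2 = nn.LayerNorm(d_model)

    def forward(self, x):
        # Causal self-attention with a standard upper-triangular mask.
        T = x.size(1)
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool, device=x.device), 1)
        h = self.ln1(x)
        a, _ = self.attn(h, h, h, attn_mask=mask, need_weights=False)
        x = x + a
        x = x + self.ff(self.ln2(x))
        return x


class ExtraModule(nn.Module):
    """Placeholder standalone component (NOT the actual modification)."""
    def __init__(self, d_model=256):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(d_model, d_model), nn.Sigmoid())

    def forward(self, x):
        return x * self.gate(x)


class TinyDecoder(nn.Module):
    def __init__(self, vocab=1000, d_model=256, n_layers=2, extra=False):
        super().__init__()
        self.emb = nn.Embedding(vocab, d_model)
        layers = []
        for _ in range(n_layers):
            layers.append(DecoderBlock(d_model))
            if extra:
                layers.append(ExtraModule(d_model))  # the standalone add-on
        self.blocks = nn.Sequential(*layers)
        self.head = nn.Linear(d_model, vocab)

    def forward(self, tokens):
        return self.head(self.blocks(self.emb(tokens)))


@torch.no_grad()
def val_loss(model, tokens):
    # Next-token prediction: predict position t+1 from positions <= t.
    logits = model(tokens[:, :-1])
    return nn.functional.cross_entropy(
        logits.reshape(-1, logits.size(-1)), tokens[:, 1:].reshape(-1)).item()


# Usage: train baseline and variant identically, then compare held-out loss.
baseline = TinyDecoder(extra=False)
variant = TinyDecoder(extra=True)
val_tokens = torch.randint(0, 1000, (8, 64))  # stand-in validation batch
print(val_loss(baseline, val_tokens), val_loss(variant, val_tokens))
```

For a claim like "25-30% lower validation loss," the usual sanity checks are to match parameter counts between the two models, use identical data, tokenization, and training budget, and repeat the comparison across seeds and datasets, which is roughly what the three-dataset setup in the post title suggests.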