Nov. 17, 2023, 11:44 a.m. | /u/exmatrixmachina

Machine Learning

Long time lurker here. Made an account just to post this.

I've been experimenting with some modifications to the transformer architecture (the addition of a new standalone component).

Recently I got to something that seems to improve validation loss by ~25-30% over a vanilla decoder-only transformer; the task is next-token prediction.

My question is whether this is significant enough to dedicate more serious effort to (e.g., getting more compute credits to train a bigger model, running a …
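For context on the metric being compared: by "validation loss" I mean the standard mean per-token cross-entropy for next-token prediction. A minimal sketch of that computation in plain Python (no framework; the example logits and vocab size are made up for illustration):

```python
import math

def next_token_loss(logits, targets):
    """Mean cross-entropy over a batch of next-token predictions.

    logits:  one list of unnormalized scores (over the vocab) per position
    targets: the true next-token id for each position
    """
    total = 0.0
    for scores, target in zip(logits, targets):
        # Stable log-softmax: log p(target) = scores[target] - logsumexp(scores)
        m = max(scores)
        logsumexp = m + math.log(sum(math.exp(s - m) for s in scores))
        total += logsumexp - scores[target]  # negative log-likelihood of the target
    return total / len(targets)

# Two positions over a toy 3-token vocab; a ~25-30% relative drop in this
# number versus a baseline is the kind of gap described above.
loss = next_token_loss([[2.0, 0.5, 0.1], [0.3, 1.8, 0.2]], [0, 1])
```

A useful sanity check when comparing runs: with uniform logits this reduces to log(vocab_size), the loss of a model that has learned nothing.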

