Feb. 6, 2024, 12:53 p.m. | /u/hypergraphs

r/MachineLearning | www.reddit.com

Let's say I have validated an idea for dealing with long contexts in transformers: it enables 32-64x longer context lengths and cuts inference time for long contexts by 32-64x, without losing long-range information compared to a vanilla transformer at the corresponding context length. Training is a bit slower, since the models are slightly bigger than their vanilla counterparts.
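For context, here's a rough, minimal PyTorch sketch of the quadratic baseline that speedup is measured against: it times a vanilla attention pass at increasing context lengths. The head counts, dims, and sequence lengths are just illustrative assumptions, not the actual setup.

    # Minimal sketch: time vanilla scaled-dot-product attention at growing
    # context lengths to show the O(n^2) baseline cost. All sizes here are
    # illustrative assumptions, not the actual experimental setup.
    import time
    import torch
    import torch.nn.functional as F

    DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

    def time_attention(seq_len, n_heads=8, head_dim=64, iters=5):
        """Average wall-clock time of one vanilla attention pass at seq_len."""
        shape = (1, n_heads, seq_len, head_dim)
        q, k, v = (torch.randn(shape, device=DEVICE) for _ in range(3))
        F.scaled_dot_product_attention(q, k, v)  # warm-up pass
        if DEVICE == "cuda":
            torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(iters):
            F.scaled_dot_product_attention(q, k, v)
        if DEVICE == "cuda":
            torch.cuda.synchronize()
        return (time.perf_counter() - start) / iters

    if __name__ == "__main__":
        base = time_attention(512)
        for seq_len in (512, 2048, 8192):
            t = time_attention(seq_len)
            # Attention compute is quadratic in sequence length, so each 4x
            # jump in context should cost roughly 16x more at long lengths.
            print(f"ctx={seq_len:5d}  {t * 1e3:9.2f} ms  ({t / base:5.1f}x vs 512)")

Anything subquadratic in context length would close most of that gap at the long end, which is where the 32-64x figures come from.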

The problem is that I have limited compute, so I am only able to train models below the 1B-parameter regime on a …

