May 13, 2022, 1:11 a.m. | Daniel Hesslow, Niccoló Zanichelli, Pascal Notin, Iacopo Poli, Debora Marks

cs.LG updates on arXiv.org

In this work we introduce RITA: a suite of autoregressive generative models
for protein sequences, with up to 1.2 billion parameters, trained on over 280
million protein sequences belonging to the UniRef-100 database. Such generative
models hold the promise of greatly accelerating protein design. We conduct the
first systematic study of how capabilities evolve with model size for
autoregressive transformers in the protein domain: we evaluate RITA models in
next amino acid prediction, zero-shot fitness, and enzyme function prediction,
showing …
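
The abstract mentions next amino acid prediction and zero-shot fitness evaluation. Below is a minimal sketch of how such zero-shot fitness scoring with an autoregressive protein language model typically works: the summed next-amino-acid log-probability of a sequence serves as a fitness proxy. The checkpoint id "lightonai/RITA_s", the use of trust_remote_code, and the example sequences are assumptions for illustration, not details confirmed by this abstract.

```python
# Sketch: zero-shot fitness scoring with a causal protein language model.
# Assumes a RITA-style checkpoint is available on the Hugging Face Hub;
# the model id below is an assumption, not taken from the abstract.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "lightonai/RITA_s"  # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, trust_remote_code=True)
model.eval()


def sequence_log_likelihood(sequence: str) -> float:
    """Sum of next-amino-acid log-probabilities; usable as a zero-shot fitness proxy."""
    inputs = tokenizer(sequence, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # Shift so each position is scored by the model's prediction of the next token
    # (standard causal-LM scoring).
    logits = outputs.logits[:, :-1, :]
    targets = inputs["input_ids"][:, 1:]
    log_probs = torch.log_softmax(logits, dim=-1)
    token_log_probs = log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    return token_log_probs.sum().item()


if __name__ == "__main__":
    # Hypothetical wild-type and single-mutant sequences for illustration only.
    wild_type = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
    mutant = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVA"
    # A higher log-likelihood for the mutant is read as higher predicted fitness
    # relative to the wild type.
    print("wild type:", sequence_log_likelihood(wild_type))
    print("mutant:   ", sequence_log_likelihood(mutant))
```

In this zero-shot setting no task-specific fine-tuning is performed; the pretrained model's likelihood alone ranks sequence variants.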

Tags: arxiv, bio, protein, scaling, scaling up, study
