Dec. 6, 2023, 11:19 a.m. | /u/APaperADay | Machine Learning (www.reddit.com)

**arXiv**: [https://arxiv.org/abs/2310.17653](https://arxiv.org/abs/2310.17653)

**OpenReview**: [https://openreview.net/forum?id=m50eKHCttz](https://openreview.net/forum?id=m50eKHCttz)

**Abstract**:

>Training deep networks requires various design decisions regarding for instance their architecture, data augmentation, or optimization. In this work, we find these training variations to result in networks learning unique feature sets from the data. Using public model libraries comprising thousands of models trained on canonical datasets like ImageNet, we observe that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other -- independent of overall performance. Given any …
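The abstract's core observation — that any two trained models each classify some examples correctly that the other misses — can be illustrated with a small sketch. The correctness arrays below are hypothetical stand-ins, not outputs of the paper's actual models or method:

```python
import numpy as np

# Toy per-example correctness masks for two hypothetical pretrained
# models (True = correct top-1 prediction). In the paper's setting these
# would come from evaluating real ImageNet models on a shared test set.
rng = np.random.default_rng(0)
n = 1000
correct_a = rng.random(n) < 0.76  # "model A": ~76% accuracy
correct_b = rng.random(n) < 0.74  # "model B": ~74% accuracy

# Examples one model gets right and the other misses: the "unique data
# context" each model captures relative to the other.
only_a = int(np.sum(correct_a & ~correct_b))
only_b = int(np.sum(~correct_a & correct_b))
either = int(np.sum(correct_a | correct_b))

print(f"A-only correct: {only_a}, B-only correct: {only_b}")
print(f"Either correct: {either / n:.1%}")
```

Even between models of similar overall accuracy, both `only_a` and `only_b` are typically non-empty, which is what makes knowledge transfer between arbitrary pretrained pairs worthwhile.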

