April 14, 2024, 5:47 p.m. | /u/SeawaterFlows

r/MachineLearning | www.reddit.com

**Paper**: [https://arxiv.org/abs/2404.05405](https://arxiv.org/abs/2404.05405)

**Abstract**:

>Scaling laws describe the relationship between the size of language models and their capabilities. Unlike prior studies that evaluate a model's capability via loss or benchmarks, we estimate the number of knowledge bits a model stores. We focus on factual knowledge represented as tuples, such as (USA, capital, Washington D.C.) from a Wikipedia page. Through multiple controlled datasets, we establish that language models can store 2 bits of knowledge per parameter, and no more, even when quantized …
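The headline figure of roughly 2 bits of factual knowledge per parameter lends itself to a quick back-of-the-envelope calculation. Below is a minimal Python sketch, assuming the linear law holds across model sizes; the 50-bits-per-tuple average used to count storable (entity, relation, value) tuples is my own illustrative assumption, not a number from the paper.

```python
# Back-of-the-envelope capacity estimates under the paper's ~2 bits/parameter law.
# Illustrative only: the paper measures capacity on controlled synthetic datasets,
# and the bits-per-tuple figure below is an assumption, not taken from the paper.

def knowledge_capacity_bits(num_params: int, bits_per_param: float = 2.0) -> float:
    """Estimated knowledge capacity in bits, assuming a linear 2 bits/param law."""
    return num_params * bits_per_param

def tuples_storable(num_params: int, bits_per_tuple: float = 50.0) -> float:
    """Rough count of (entity, relation, value) tuples storable, assuming a
    hypothetical average information content of 50 bits per tuple."""
    return knowledge_capacity_bits(num_params) / bits_per_tuple

if __name__ == "__main__":
    for n in (125e6, 1.3e9, 7e9, 70e9):
        bits = knowledge_capacity_bits(int(n))
        print(f"{n / 1e9:>6.2f}B params -> {bits / 8 / 1e9:.2f} GB of knowledge, "
              f"~{tuples_storable(int(n)) / 1e9:.2f}B tuples")
```

Under these assumptions a 7B-parameter model tops out around 1.75 GB of stored factual content, which gives a concrete sense of the scale the abstract is describing.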

