Nov. 3, 2022, 1:11 a.m. | Haoze He, Parijat Dube

cs.LG updates on arXiv.org

The convergence of SGD-based distributed training algorithms is tied to the
data distribution across workers. Standard partitioning techniques try to
achieve equal-sized partitions with a per-class population distribution
proportional to that of the total dataset. Partitions with the same overall
population size, or even the same number of samples per class, may still be
non-IID in the feature space. In heterogeneous computing environments, where
devices have different computing capabilities, even-sized partitions across
devices can lead to the straggler problem in …
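To make the "standard partitioning" baseline concrete, here is a minimal Python/NumPy sketch (an illustration, not the paper's code): each class's samples are split evenly across workers, so every partition ends up with roughly the same size and the same per-class proportions as the full dataset.

```python
import numpy as np

def stratified_partition(labels, num_workers, seed=0):
    """Split sample indices into equal-sized partitions whose per-class
    counts are proportional to the overall class distribution."""
    rng = np.random.default_rng(seed)
    partitions = [[] for _ in range(num_workers)]
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        rng.shuffle(idx)
        # Split this class's samples evenly, so each worker receives
        # about len(idx) / num_workers of them.
        for worker, chunk in enumerate(np.array_split(idx, num_workers)):
            partitions[worker].extend(chunk.tolist())
    return partitions

# Example: 10 samples, 2 classes, 2 workers -> each worker gets 5 samples
# with the same 60/40 class ratio as the full dataset.
labels = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1])
print(stratified_partition(labels, num_workers=2))
```

Note that even under this scheme, the samples a worker receives for a given class are an arbitrary subset, so partitions can still differ in their feature-space distribution; that residual non-IID-ness, together with stragglers on heterogeneous devices, is exactly the issue the abstract highlights.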

arxiv distributed environment partitioning
