GraphPipe: Improving Performance and Scalability of DNN Training with Graph Pipeline Parallelism
June 26, 2024, 4:45 a.m. | Byungsoo Jeon, Mengdi Wu, Shiyi Cao, Sunghyun Kim, Sunghyun Park, Neeraj Aggarwal, Colin Unger, Daiyaan Arfeen, Peiyuan Liao, Xupeng Miao, Mohammad Al
cs.LG updates on arXiv.org arxiv.org
Abstract: Deep neural networks (DNNs) continue to grow rapidly in size, making them infeasible to train on a single device. Pipeline parallelism is commonly used in existing DNN systems to support large-scale DNN training by partitioning a DNN into multiple stages, which concurrently perform DNN training for different micro-batches in a pipeline fashion. However, existing pipeline-parallel approaches only consider sequential pipeline stages and thus ignore the topology of a DNN, resulting in missed model-parallel opportunities. This …
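The pipeline-parallel scheme the abstract describes can be sketched with a toy schedule. This is a hypothetical illustration, not code from the paper: in a sequential (GPipe-style) pipeline, stage `s` processes micro-batch `m` at time step `s + m`, so `S` stages and `M` micro-batches complete in `S + M - 1` steps rather than `S * M` for strictly serial execution — the concurrency across micro-batches that pipeline parallelism exploits.

```python
# Toy schedule for sequential pipeline parallelism (forward pass only).
# Hypothetical sketch, not the paper's system: stage s starts micro-batch m
# one step after stage s-1 finishes it, so stages overlap across micro-batches.

def pipeline_schedule(num_stages: int, num_microbatches: int):
    """Return {time_step: [(stage, microbatch), ...]} for a sequential pipeline."""
    schedule = {}
    for m in range(num_microbatches):
        for s in range(num_stages):
            t = s + m  # stage s handles micro-batch m at step s + m
            schedule.setdefault(t, []).append((s, m))
    return schedule

sched = pipeline_schedule(num_stages=4, num_microbatches=8)
makespan = max(sched) + 1
print(makespan)          # 11 steps (4 + 8 - 1), vs. 32 if run serially
print(len(sched[3]))     # 4 stages active at once in the steady state
```

GraphPipe's observation, per the abstract, is that treating stages as a strict sequence ignores the DNN's graph topology; branches of the model graph could form parallel pipelines rather than one chain, shrinking the schedule further.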