March 21, 2022, 1:12 a.m. | Carl Pearson, Aurya Javeed, Karen Devine

cs.LG updates on arXiv.org arxiv.org

We present a new strategy for automatically exploring the design space of key
CUDA+MPI programs and providing design rules that discriminate slow from fast
implementations. In such programs, the order of operations (e.g., GPU kernels,
MPI communication) and assignment of operations to resources (e.g., GPU
streams) make the space of possible designs enormous. Systems experts have the
task of redesigning and reoptimizing these programs to effectively utilize each
new platform. This work provides a prototype tool to reduce that burden. …

