Jan. 27, 2022, 2:11 a.m. | Dominik Scheinert, Houkun Zhu, Lauritz Thamsen, Morgan K. Geldenhuys, Jonathan Will, Alexander Acker, Odej Kao

cs.LG updates on arXiv.org

Distributed dataflow systems like Spark and Flink enable the use of clusters
for scalable data analytics. While runtime prediction models can be used to
initially select appropriate cluster resources given target runtimes, the
actual runtime performance of dataflow jobs depends on several factors and
varies over time. Yet, in many situations, dynamic scaling can be used to meet
formulated runtime targets despite significant performance variance.


This paper presents Enel, a novel dynamic scaling approach that uses message
propagation on an …

