March 14, 2024, 4:42 a.m. | Pooja Srinivas, Fiza Husain, Anjaly Parayil, Ayush Choure, Chetan Bansal, Saravan Rajmohan

cs.LG updates on arXiv.org arxiv.org

arXiv:2403.07927v1 Announce Type: cross
Abstract: Cloud service owners need to continuously monitor their services to ensure high availability and reliability. Gaps in monitoring can lead to delay in incident detection and significant negative customer impact. Current process of monitor creation is ad-hoc and reactive in nature. Developers create monitors using their tribal knowledge and, primarily, a trial and error based process. As a result, monitors often have incomplete coverage which leads to production issues, or, redundancy which results in noise …

abstract arxiv availability cloud cloud service cloud services cs.lg cs.ni current customer data data-driven delay detection developers framework impact incident intelligent monitoring monitors nature negative process reliability service services type

Data Architect

@ University of Texas at Austin | Austin, TX

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

Lead GNSS Data Scientist

@ Lurra Systems | Melbourne

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

Global Data Architect, AVP - State Street Global Advisors

@ State Street | Boston, Massachusetts

Data Engineer

@ NTT DATA | Pune, MH, IN