Web: http://arxiv.org/abs/2006.02293

Sept. 23, 2022, 1:12 a.m. | Alicja Gosiewska, Katarzyna Woźnica, Przemysław Biecek

cs.LG updates on arXiv.org

Benchmarks for the evaluation of model performance play an important role in
machine learning. However, there is no established way to describe and create
new benchmarks. What is more, the most common benchmarks use performance
measures that share several limitations. For example, the difference in
performance for two models has no probabilistic interpretation, there is no
reference point to indicate whether they represent a significant improvement,
and it makes no sense to compare such differences between data sets. We
introduce …

