Topic: benchmark
Items published with this topic over the last 90 days.
Latest
Suvach -- Generated Hindi QA benchmark
1 day, 6 hours ago | arxiv.org
GAIA: Redefining AI Assistant Evaluation
1 day, 13 hours ago | pub.towardsai.net
Calibrating the Mosaic Evaluation Gauntlet
1 day, 14 hours ago | www.databricks.com
Holmes: Benchmark the Linguistic Competence of Language Models
2 days, 6 hours ago | arxiv.org
Tracking Transforming Objects: A Benchmark
2 days, 6 hours ago | arxiv.org
BEST LLMs for Coding, Long Context, Overall Perform
1 week, 1 day ago | www.youtube.com
Top (last 7 days)
GAIA: Redefining AI Assistant Evaluation
1 day, 13 hours ago | pub.towardsai.net
Holmes: Benchmark the Linguistic Competence of Language Models
2 days, 6 hours ago | arxiv.org
Suvach -- Generated Hindi QA benchmark
1 day, 6 hours ago | arxiv.org
Calibrating the Mosaic Evaluation Gauntlet
1 day, 14 hours ago | www.databricks.com