all AI news
Construction of a Japanese Financial Benchmark for Large Language Models
March 25, 2024, 4:47 a.m. | Masanori Hirano
cs.CL updates on arXiv.org arxiv.org
Abstract: With the recent development of large language models (LLMs), models that focus on certain domains and languages have been discussed for their necessity. There is also a growing need for benchmarks to evaluate the performance of current LLMs in each domain. Therefore, in this study, we constructed a benchmark comprising multiple tasks specific to the Japanese and financial domains and performed benchmark measurements on some models. Consequently, we confirmed that GPT-4 is currently outstanding, and …
abstract arxiv benchmark benchmarks construction cs.cl current development domain domains financial focus japanese language language models languages large language large language models llms performance q-fin.cp study type
More from arxiv.org / cs.CL updates on arXiv.org
Benchmarking LLMs via Uncertainty Quantification
2 days, 6 hours ago |
arxiv.org
CARE: Extracting Experimental Findings From Clinical Literature
2 days, 6 hours ago |
arxiv.org
Jobs in AI, ML, Big Data
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne
Senior Machine Learning Engineer (MLOps)
@ Promaton | Remote, Europe
Software Engineering Manager, Generative AI - Characters
@ Meta | Bellevue, WA | Menlo Park, CA | Seattle, WA | New York City | San Francisco, CA
Senior Operations Research Analyst / Predictive Modeler
@ LinQuest | Colorado Springs, Colorado, United States