Improving Machine Translation with Human Feedback: An Exploration of Quality Estimation as a Reward Model
Feb. 23, 2024, 5:49 a.m. | Zhiwei He, Xing Wang, Wenxiang Jiao, Zhuosheng Zhang, Rui Wang, Shuming Shi, Zhaopeng Tu
cs.CL updates on arXiv.org arxiv.org
Abstract: Insufficient modeling of human preferences within the reward model is a major obstacle to leveraging human feedback to improve translation quality. Fortunately, quality estimation (QE), which predicts the quality of a given translation without a reference translation, has achieved impressive alignment with human evaluations in the last two years. In this work, we investigate the potential of employing the QE model as the reward model (the QE-based reward model) to predict human preferences for feedback training. We …
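The core idea can be sketched as follows. This is a minimal, hypothetical illustration, not the paper's implementation: `qe_score` below is a toy stand-in for a learned reference-free QE model, and `rerank_with_qe` shows the simplest way a QE score can act as a reward signal, namely selecting the highest-reward candidate among sampled translations.

```python
def qe_score(source: str, translation: str) -> float:
    """Toy stand-in for a reference-free QE model.

    A real QE model is a learned regressor trained on human quality
    judgments; this heuristic merely rewards a source/translation length
    ratio close to 1 and penalises empty output, so the sketch runs
    without any model weights.
    """
    if not translation:
        return 0.0
    ratio = len(translation.split()) / max(len(source.split()), 1)
    return max(0.0, 1.0 - abs(1.0 - ratio))


def rerank_with_qe(source: str, candidates: list[str]) -> str:
    """Use QE scores as rewards: pick the highest-scoring candidate.

    In feedback training, such per-candidate rewards would instead be
    fed into an RL-style or reward-weighted fine-tuning objective.
    """
    return max(candidates, key=lambda t: qe_score(source, t))


source = "the cat sat on the mat"
candidates = [
    "le chat",                              # too short
    "le chat s'est assis sur le tapis",     # plausible length
    "",                                     # empty output
]
best = rerank_with_qe(source, candidates)
```

In a full system the reward would drive parameter updates rather than just reranking, but the contract is the same: the QE model maps a (source, translation) pair to a scalar reward with no reference translation required.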