DataInf: Efficiently Estimating Data Influence in LoRA-tuned LLMs and Diffusion Models
Feb. 26, 2024, 5:44 a.m. | Yongchan Kwon, Eric Wu, Kevin Wu, James Zou
cs.LG updates on arXiv.org
Abstract: Quantifying the impact of training data points is crucial for understanding the outputs of machine learning models and for improving the transparency of the AI pipeline. The influence function is a principled and popular data attribution method, but its computational cost often makes it challenging to use. This issue becomes more pronounced in the setting of large language models and text-to-image models. In this work, we propose DataInf, an efficient influence approximation method that is …
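For context on the attribution method the abstract builds on: the classical influence function scores a training point z_i by I(z_i) = -∇L(z_test)ᵀ H⁻¹ ∇L(z_i), where H is the Hessian of the training loss at the fitted parameters, and the Hessian inverse is what makes it expensive at LLM scale. The sketch below illustrates this baseline on a toy logistic-regression problem with NumPy; it is a minimal illustration of the general influence-function idea, not DataInf's specific approximation, and all names in it are illustrative.

```python
import numpy as np

# Toy data: 50 points, 3 features, labels from a noisy linear rule.
rng = np.random.default_rng(0)
n, d = 50, 3
X = rng.normal(size=(n, d))
w_true = np.array([1.0, -2.0, 0.5])
y = (X @ w_true + rng.normal(scale=0.1, size=n) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Fit ridge-regularised logistic regression by gradient descent
# (regularisation keeps the Hessian well-conditioned and invertible).
lam = 1e-2
w = np.zeros(d)
for _ in range(2000):
    p = sigmoid(X @ w)
    w -= 0.5 * (X.T @ (p - y) / n + lam * w)

def grad_loss(x, y_, w):
    # Gradient of a single example's logistic loss at w.
    return (sigmoid(x @ w) - y_) * x

# Hessian of the mean training loss (plus ridge term) at the fitted w.
p = sigmoid(X @ w)
H = (X.T * (p * (1 - p))) @ X / n + lam * np.eye(d)

# Influence of each training point on the loss of one held-out-style test point:
# I(z_i) = -grad L(z_test)^T H^{-1} grad L(z_i).
x_test, y_test = X[0], y[0]
g_test = grad_loss(x_test, y_test, w)
influences = np.array([
    -g_test @ np.linalg.solve(H, grad_loss(X[i], y[i], w))
    for i in range(n)
])
print(influences.shape)  # one influence score per training point
```

The n-by-d solve above is cheap only because d = 3; for billion-parameter models even forming H is infeasible, which is the bottleneck efficient approximations like DataInf target (DataInf itself is described in the paper, not reproduced here).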