Feb. 14, 2024, 5:46 a.m. | Shentao Yang, Tianqi Chen, Mingyuan Zhou

cs.CV updates on arXiv.org

Aligning text-to-image (T2I) diffusion models with preference has been gaining increasing research attention. While prior works directly optimize T2I models on preference data, these methods are developed under the bandit assumption of a latent reward on the entire diffusion reverse chain and ignore the sequential nature of the generation process. According to the literature, this may harm the efficacy and efficiency of alignment. In this paper, we take a finer dense-reward perspective and derive a tractable alignment objective that …


