I2V-Adapter: A General Image-to-Video Adapter for Diffusion Models. (arXiv:2312.16693v2 [cs.CV] UPDATED)
cs.CV updates on arXiv.org
Text-guided image-to-video (I2V) generation aims to generate a coherent video
that preserves the identity of the input image and semantically aligns with the
input prompt. Existing methods typically augment pretrained text-to-video (T2V)
models either by concatenating the image channel-wise with the noised video
frames before they are fed into the model, or by injecting the image embedding
produced by pretrained image encoders into cross-attention modules. However, the former
approach often necessitates altering the fundamental weights of pretrained T2V
models, thus restricting the model's …
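
To make the first approach concrete, here is a minimal toy sketch of channel-wise concatenation. It uses nested Python lists as stand-ins for tensors, not the paper's implementation. The sizes and the helper names `make_tensor` and `concat_image_channelwise` are hypothetical and chosen only for illustration.

```python
import random

def make_tensor(channels, height, width, fill=None):
    """Build a channels x height x width nested list (a toy stand-in for a tensor).

    With fill=None the entries are Gaussian noise; otherwise every entry is `fill`.
    """
    return [[[random.gauss(0.0, 1.0) if fill is None else fill
              for _ in range(width)]
             for _ in range(height)]
            for _ in range(channels)]

def concat_image_channelwise(noised_frames, image):
    """Append the image's channels to every frame (concatenation along the channel axis)."""
    return [frame + image for frame in noised_frames]

# Hypothetical toy sizes, chosen only to keep the example small.
NUM_FRAMES, CHANNELS, HEIGHT, WIDTH = 4, 3, 2, 2

noised_frames = [make_tensor(CHANNELS, HEIGHT, WIDTH) for _ in range(NUM_FRAMES)]
image = make_tensor(CHANNELS, HEIGHT, WIDTH, fill=1.0)

model_input = concat_image_channelwise(noised_frames, image)
channels_per_frame = len(model_input[0])
print(channels_per_frame)  # 6: the original 3 channels plus 3 image channels
```

Each frame now carries twice as many channels as before. A first layer pretrained on the original channel count would therefore need new or resized weights to accept this input, which is consistent with the abstract's remark that this approach often requires altering the pretrained T2V weights.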