Feb. 12, 2024, 5:46 a.m. | Angello Hoyos, Mariano Rivera

cs.CV updates on arXiv.org

We present a novel approach to enhance the capabilities of VQ-VAE models through the integration of a Residual Encoder and a Residual Pixel Attention layer, named Attentive Residual Encoder (AREN). The objective of our research is to improve the performance of VQ-VAE while maintaining practical parameter levels. The AREN encoder is designed to operate effectively at multiple levels, accommodating diverse architectural complexities. The key innovation is the integration of an inter-pixel auto-attention mechanism into the AREN encoder. This approach allows …
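The abstract is cut off before it describes the attention layer, so the following is only a minimal sketch of what inter-pixel self-attention with a residual connection could look like, not the AREN layer itself. It assumes identity query/key/value projections and a flattened feature map; the names `softmax` and `pixel_self_attention` are hypothetical and do not come from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pixel_self_attention(pixels):
    """Toy inter-pixel self-attention with a residual connection.

    pixels: list of N feature vectors (one per pixel), each of length C.
    Assumption: queries, keys, and values are the raw features (no learned
    projections), which a real encoder layer would almost certainly add.
    """
    dim = len(pixels[0])
    scale = 1.0 / math.sqrt(dim)
    out = []
    for q in pixels:
        # Scaled dot-product similarity between this pixel and every pixel.
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in pixels]
        weights = softmax(scores)
        # Attention output: weighted average of all pixel features.
        attended = [
            sum(w * v[c] for w, v in zip(weights, pixels)) for c in range(dim)
        ]
        # Residual connection: add the attended features to the input.
        out.append([x + a for x, a in zip(q, attended)])
    return out
```

In this sketch every pixel attends to every other pixel, so the cost grows quadratically with the number of pixels; how the paper keeps the parameter count practical is not stated in the excerpt.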


