Creating novel views from a single image has made tremendous strides with
advanced autoregressive models. Although recent methods generate high-quality
novel views, synthesizing with only one explicit or implicit 3D geometry incurs a
trade-off between two objectives, which we call the ``seesaw'' problem: 1)
preserving reprojected contents and 2) realistically completing out-of-view
regions. Moreover, autoregressive models incur considerable computational cost.
In this paper, we propose a single-image view synthesis framework that
mitigates the seesaw problem. The proposed model is …

