GazeCLIP: Towards Enhancing Gaze Estimation via Text Guidance | allainews.com

April 29, 2024, 4:45 a.m. | Jun Wang, Hao Ruan, Mingjie Wang, Chuanghui Zhang, Huachun Li, Jun Zhou

cs.CV updates on arXiv.org arxiv.org

arXiv:2401.00260v3 Announce Type: replace
Abstract: Over the past decade, visual gaze estimation has garnered increasing attention within the research community, owing to its wide-ranging application scenarios. While existing estimation approaches have achieved remarkable success in enhancing prediction accuracy, they primarily infer gaze from single-image signals, neglecting the potential benefits of the currently dominant text guidance. Notably, visual-language collaboration has been extensively explored across various visual tasks, such as image synthesis and manipulation, leveraging the remarkable transferability of large-scale Contrastive Language-Image …

abstract accuracy application arxiv attention benefits community cs.cv guidance image prediction research research community success text type via visual while

More from arxiv.org / cs.CV updates on arXiv.org

CheXmask: a large-scale dataset of anatomical segmentation masks for multi-center chest x-ray images 13 hours ago | arxiv.org

arxiv center cs.cv dataset +10

Towards Top-Down Reasoning: An Explainable Multi-Agent Approach for Visual Question Answering 13 hours ago | arxiv.org

abstract agent arxiv augment +16

SONIC: Sonar Image Correspondence using Pose Supervised Learning for Imaging Sonars 13 hours ago | arxiv.org

abstract arxiv association cs.cv +18

On Partial Shape Correspondence and Functional Maps 13 hours ago | arxiv.org

abstract apply arxiv cs.cv +10

Hierarchical Side-Tuning for Vision Transformers 13 hours ago | arxiv.org

abstract arxiv challenge computational +18

DiffPoseTalk: Speech-Driven Stylistic 3D Facial Animation and Head Pose Generation via Diffusion Models 13 hours ago | arxiv.org

animation arxiv cs.cv cs.gr +7

Local Padding in Patch-Based GANs for Seamless Infinite-Sized Texture Synthesis 13 hours ago | arxiv.org

arxiv cs.cv eess.iv gans +5

Two-stream Multi-level Dynamic Point Transformer for Two-person Interaction Recognition 13 hours ago | arxiv.org

abstract action recognition applications arxiv +21

Intriguing Property and Counterfactual Explanation of GAN for Remote Sensing Image Generation 13 hours ago | arxiv.org

arxiv counterfactual cs.cv eess.iv +7

Data Engineer

@ Lemon.io | Remote: Europe, LATAM, Canada, UK, Asia, Oceania

View on ai-jobs.net

Artificial Intelligence – Bioinformatic Expert

@ University of Texas Medical Branch | Galveston, TX

View on ai-jobs.net

Lead Developer (AI)

@ Cere Network | San Francisco, US

View on ai-jobs.net

Research Engineer

@ Allora Labs | Remote

View on ai-jobs.net

Ecosystem Manager

@ Allora Labs | Remote

View on ai-jobs.net

Founding AI Engineer, Agents

@ Occam AI | New York

View on ai-jobs.net