May 15, 2024, 4:45 a.m. | Carmela Calabrese, Stefano Berti, Giulia Pasquale, Lorenzo Natale

arXiv:2405.08695v1
Abstract: Multi-label action recognition in videos is a significant challenge for robotic applications in dynamic environments, especially when the robot must cooperate with humans on tasks that involve objects. Existing methods still struggle to recognize unseen actions or require extensive training data. To overcome these problems, we propose Dual-VCLIP, a unified approach for zero-shot multi-label action recognition. Dual-VCLIP enhances VCLIP, a zero-shot action recognition method, with the DualCoOp method for multi-label image classification. …

