April 11, 2024, 2:37 a.m. | /u/pidoyu

Machine Learning www.reddit.com

Hello everyone!

I would like to share our recent CVPR work and the simple idea behind it.

**\[TL;DR\]** An auto-regressive model can predict labels from an input image alone, without a predefined query gallery (as in CLIP-like models) or a predefined set of class concepts (as in VGG/ResNet-like models). The model predicts the top-K labels, e.g., **top-100**, from the entire textual space, so any label is a possible output.
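
As a rough illustration of the idea only, and not the implementation in the linked repository, the toy sketch below shows one way top-K labels could be read off an auto-regressive decoder: take the K most likely first tokens, then greedily complete each one until an end token. Everything here is an assumption made for the example: the hard-coded probability table stands in for a decoder conditioned on image features, and the names `TOY_NEXT_TOKEN_PROBS`, `next_token_probs`, and `predict_top_k_labels` are hypothetical.

```python
# Hypothetical stand-in for a decoder conditioned on an image: it maps a
# token prefix to next-token probabilities. The numbers are made up.
TOY_NEXT_TOKEN_PROBS = {
    (): {"dog": 0.5, "grass": 0.3, "fris": 0.2},
    ("dog",): {"<end>": 1.0},
    ("grass",): {"<end>": 1.0},
    ("fris",): {"bee": 0.9, "k": 0.1},
    ("fris", "bee"): {"<end>": 1.0},
    ("fris", "k"): {"<end>": 1.0},
}


def next_token_probs(prefix):
    """Return the (toy) next-token distribution for a token prefix."""
    return TOY_NEXT_TOKEN_PROBS[tuple(prefix)]


def predict_top_k_labels(k, max_len=4):
    """Pick the k most likely first tokens, then greedily finish each label.

    Returns a list of (label, score) pairs sorted by score, where the score
    is the product of the token probabilities along the label.
    """
    first = next_token_probs([])
    starts = sorted(first.items(), key=lambda kv: kv[1], reverse=True)[:k]

    labels = []
    for token, prob in starts:
        tokens, score = [token], prob
        for _ in range(max_len):
            probs = next_token_probs(tokens)
            best, p = max(probs.items(), key=lambda kv: kv[1])
            score *= p
            if best == "<end>":
                break
            tokens.append(best)
        labels.append(("".join(tokens), score))

    return sorted(labels, key=lambda pair: pair[1], reverse=True)


# No label list is given up front: the labels come out of the token vocabulary.
top_labels = predict_top_k_labels(k=2)
for label, score in top_labels:
    print(f"{label}: {score:.2f}")
```

Note that no query gallery or class list appears anywhere in the sketch; the candidate labels are whatever the token vocabulary can spell. A real decoder would work over subword tokens and a far larger vocabulary, and the decoding strategy used in the paper may differ from this greedy completion.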

For more details, please visit our paper and project: [https://github.com/kaiyuyue/nxtp](https://github.com/kaiyuyue/nxtp).

Your thoughts and feedback are appreciated. Thank you very much!

