April 12, 2024, 4:45 a.m. | Jihao Liu, Jinliang Zheng, Yu Liu, Hongsheng Li

cs.CV updates on arXiv.org

arXiv:2404.07603v1 Announce Type: new
Abstract: This paper proposes a GeneraLIst encoder-Decoder (GLID) pre-training method for better handling various downstream computer vision tasks. While self-supervised pre-training approaches, e.g., Masked Autoencoder, have shown success in transfer learning, task-specific sub-architectures are still required to be appended for different downstream tasks, which cannot enjoy the benefits of large-scale pre-training. GLID overcomes this challenge by allowing the pre-trained generalist encoder-decoder to be fine-tuned on various vision tasks with minimal task-specific architecture modifications. In the GLID …

