Aug. 8, 2023, 10:57 p.m. | /u/Apprehensive-War8915

Machine Learning www.reddit.com

I have a CNN-based regression model that works reasonably well with fewer than 1M parameters. I am trying to see how a Vision Transformer (ViT) performs on this task, but because ViT has no pooling, the model is considerably larger (~10M parameters). Does ViT have anything equivalent to pooling that reduces the number of parameters?

If not, that limits the applicability of ViT to large models on large datasets only. For smaller tasks with small datasets, CNN …
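To make the parameter-count question concrete, here is a rough sketch that tallies the weights of a plain ViT encoder by hand. It assumes the common layout (linear patch embedding, class token, learned position embeddings, pre-norm blocks with biases, a linear regression head); the function name `vit_param_count` and its arguments are hypothetical, not taken from any library. Under these assumptions the total is driven mostly by the embedding width and the depth, while the number of tokens only enters through the position embedding.

```python
def vit_param_count(image_size=224, patch_size=16, channels=3,
                    dim=192, depth=12, mlp_ratio=4, out_dim=1):
    """Count trainable parameters of a plain ViT encoder with a linear head.

    Assumed layout (not from the post): class token, learned position
    embeddings, pre-norm transformer blocks with biases everywhere.
    """
    num_patches = (image_size // patch_size) ** 2

    # Linear projection of each flattened patch to `dim` (weights + bias).
    patch_embed = (patch_size * patch_size * channels) * dim + dim
    cls_token = dim
    # One learned position vector per patch plus one for the class token.
    pos_embed = (num_patches + 1) * dim

    hidden = dim * mlp_ratio
    block = (
        2 * (2 * dim)                # two LayerNorms (scale + shift each)
        + dim * (3 * dim) + 3 * dim  # fused query/key/value projection
        + dim * dim + dim            # attention output projection
        + dim * hidden + hidden      # MLP first layer
        + hidden * dim + dim         # MLP second layer
    )

    final_norm = 2 * dim
    head = dim * out_dim + out_dim   # linear regression head

    return (patch_embed + cls_token + pos_embed
            + depth * block + final_norm + head)


if __name__ == "__main__":
    # Compare a wider/deeper setting against a narrower/shallower one.
    print(vit_param_count(dim=192, depth=12))
    print(vit_param_count(dim=96, depth=6))
```

With these assumed settings, a narrower and shallower configuration such as `dim=96, depth=6` comes out under 1M parameters, so the size gap may have more to do with the chosen width and depth than with the absence of pooling.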

