[D] Essentials of Multi-modal/Visual-Language models (A video)

May 27, 2023, 3:48 p.m. | /u/AvvYaa

Machine Learning www.reddit.com

I just uploaded a video to my YouTube channel covering the major techniques and challenges in training multi-modal models, which combine multiple input sources such as images, text, and audio to perform amazing cross-modal tasks like text-image retrieval, multimodal vector arithmetic, visual question answering, and language modelling.

I thought this was a good time to make a video on the topic, since more and more recent LLMs are moving beyond text-only input into visual-language domains (GPT-4, PaLM-2, etc.). So in …
