Feb. 4, 2024, 2:42 a.m. | /u/tsujuifu

Machine Learning www.reddit.com

**\[ICLR'24 Spotlight\] Guiding Instruction-based Image Editing via Multimodal Large Language Models**

MGie's full name, MLLM-Guided Image Editing (MGIE), describes the idea: it follows natural-language user instructions to edit images, using a multimodal large language model to derive more expressive editing guidance.
Paper: [https://openreview.net/forum?id=S1RKWSyZ2Y](https://openreview.net/forum?id=S1RKWSyZ2Y)
Project: [https://mllm-ie.github.io](https://mllm-ie.github.io/)


The code and checkpoints are also open-sourced 🔥
Apple's official repo: [https://github.com/apple/ml-mgie](https://github.com/apple/ml-mgie)
Repo w/ Gradio demo: [https://github.com/tsujuifu/pytorch\_mgie](https://github.com/tsujuifu/pytorch_mgie)

