all AI news
PhotoBot: Reference-Guided Interactive Photography via Natural Language
March 22, 2024, 4:46 a.m. | Oliver Limoyo, Jimmy Li, Dmitriy Rivkin, Jonathan Kelly, Gregory Dudek
cs.CV updates on arXiv.org arxiv.org
Abstract: We introduce PhotoBot, a framework for fully automated photo acquisition based on an interplay between high-level human language guidance and a robot photographer. We propose to communicate photography suggestions to the user via reference images that are selected from a curated gallery. We leverage a visual language model (VLM) and an object detector to characterize the reference images via textual descriptions and then use a large language model (LLM) to retrieve relevant reference images based …
abstract acquisition arxiv automated cs.ai cs.cv cs.ro framework guidance human images interactive language language model natural natural language photo photographer photography reference robot suggestions type via visual visual language model
More from arxiv.org / cs.CV updates on arXiv.org
Jobs in AI, ML, Big Data
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne
Senior Machine Learning Engineer (MLOps)
@ Promaton | Remote, Europe
Principal Machine Learning Engineer (AI, NLP, LLM, Generative AI)
@ Palo Alto Networks | Santa Clara, CA, United States
Consultant Senior Data Engineer F/H
@ Devoteam | Nantes, France