[CVPR'24] LLM4SGG: Large Language Models for Weakly Supervised Scene Graph Generation

April 4, 2024, 3:33 a.m. | /u/kb_kim

machinelearningnews www.reddit.com

LLM4SGG is the first work to leverage a **Large Language Model** for the Scene Graph Generation task.
Incredibly, we achieve performance comparable to a fully supervised approach in terms of F@K, even though we use only **image captions** as supervision for Scene Graph Generation.
For more details, see:

paper: [https://arxiv.org/pdf/2310.10404.pdf](https://arxiv.org/pdf/2310.10404.pdf)

code: [https://github.com/rlqja1107/torch-LLM4SGG](https://github.com/rlqja1107/torch-LLM4SGG)

[Overall Framework](https://preview.redd.it/5fmqbz9dsdsc1.png?width=1065&format=png&auto=webp&s=6a72e722b589fccfad01e8152fd9c604a1587931)

[Performance Comparison](https://preview.redd.it/0vv7ll85tdsc1.png?width=1241&format=png&auto=webp&s=9b15139b629f5181f0c0e4623ee1fa3f0b8e1113)
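
For readers unfamiliar with F@K: the post does not define it, so the sketch below assumes the definition commonly used in the scene graph generation literature, namely the harmonic mean of Recall@K (R@K) and mean Recall@K (mR@K). The function name `f_at_k` and the input values are hypothetical and chosen only for illustration; they are not results from the paper.

```python
def f_at_k(recall_at_k: float, mean_recall_at_k: float) -> float:
    """Harmonic mean of R@K and mR@K (assumed definition of F@K)."""
    total = recall_at_k + mean_recall_at_k
    if total == 0:
        # Both metrics are zero, so the harmonic mean is defined as zero here.
        return 0.0
    return 2 * recall_at_k * mean_recall_at_k / total


# Illustrative inputs only, not numbers reported in the paper.
score = f_at_k(0.6, 0.3)
print(f"F@K = {score:.3f}")
```

Under this assumed definition, a model scores well on F@K only if it does well on both frequent predicates (R@K) and rare ones (mR@K), since the harmonic mean is pulled toward the lower of the two values.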
