May 7, 2024, 4:43 a.m. | Tobias Groot, Matias Valdenegro-Toro

cs.LG updates on arXiv.org arxiv.org

arXiv:2405.02917v1 Announce Type: cross
Abstract: Large Language Models and Vision-Language Models (LLMs/VLMs) have revolutionized the field of AI through their ability to generate human-like text and understand images, but ensuring their reliability is crucial. This paper aims to evaluate the ability of LLMs (GPT4, GPT-3.5, LLaMA2, and PaLM 2) and VLMs (GPT4V and Gemini Pro Vision) to estimate their verbalized uncertainty via prompting. We propose the new Japanese Uncertain Scenes (JUS) dataset, aimed at testing VLM capabilities via difficult queries and object …
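The abstract describes eliciting "verbalized uncertainty" by prompting the model to state its own confidence. A minimal sketch of how such an evaluation harness might look is below; the prompt wording, the `Confidence: NN%` output format, and the parsing logic are illustrative assumptions, not the paper's actual protocol.

```python
import re
from typing import Optional


def build_uncertainty_prompt(question: str) -> str:
    """Wrap a query so the model is asked to verbalize its confidence.

    The exact instruction wording here is a hypothetical example.
    """
    return (
        f"{question}\n"
        "After your answer, state how confident you are in it on a "
        "separate line formatted exactly as 'Confidence: NN%'."
    )


def parse_verbalized_confidence(response: str) -> Optional[float]:
    """Extract a stated confidence in [0, 1] from a model response.

    Returns None when the model did not follow the requested format,
    which itself is a useful signal in an uncertainty evaluation.
    """
    match = re.search(r"Confidence:\s*(\d{1,3})\s*%", response)
    if match is None:
        return None
    return min(int(match.group(1)), 100) / 100.0
```

The parsed confidence can then be compared against empirical accuracy (e.g. via calibration curves) to judge how well each model's stated uncertainty tracks its actual error rate.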

