Nov. 5, 2023, 6:47 a.m. | Mayank Kothyari, Dhruva Dhingra, Sunita Sarawagi, Soumen Chakrabarti

cs.CL updates on arXiv.org

Existing Text-to-SQL generators require the entire schema to be encoded with
the user text. This is expensive or impractical for large databases with tens
of thousands of columns. Standard dense retrieval techniques are inadequate for
schema subsetting of a large structured database, where the correct semantics
of retrieval demands that we rank sets of schema elements rather than
individual elements. In response, we propose a two-stage process for effective
coverage during retrieval. First, we instruct an LLM to hallucinate a …
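To make the point about ranking sets rather than individual elements concrete, here is a minimal sketch. It is not the authors' method (the abstract is cut off before the method is described). It only contrasts per-element top-k ranking with a simple greedy coverage heuristic swapped in for illustration. The question, the column names, the two "needs", and every similarity score are made-up assumptions.

```python
from typing import Dict, List

# Hypothetical question: "average salary per department name".
# SIM[column][need] is an invented similarity score, not real model output.
SIM: Dict[str, Dict[str, float]] = {
    "employee.salary":      {"salary": 0.9, "department name": 0.1},
    "employee.base_salary": {"salary": 0.8, "department name": 0.1},
    "employee.bonus":       {"salary": 0.7, "department name": 0.1},
    "department.name":      {"salary": 0.1, "department name": 0.6},
}


def rank_individually(sim: Dict[str, Dict[str, float]], k: int) -> List[str]:
    """Score each column on its own (mean similarity) and keep the top k."""
    def score(col: str) -> float:
        return sum(sim[col].values()) / len(sim[col])
    return sorted(sim, key=score, reverse=True)[:k]


def coverage(sim: Dict[str, Dict[str, float]], cols: List[str]) -> float:
    """Set-level score: each need is credited with its best match in the set."""
    needs = next(iter(sim.values())).keys()
    return sum(max(sim[c][need] for c in cols) for need in needs)


def select_set_greedy(sim: Dict[str, Dict[str, float]], k: int) -> List[str]:
    """Greedily add the column that most improves coverage of the whole set."""
    selected: List[str] = []
    for _ in range(k):
        remaining = [c for c in sim if c not in selected]
        best = max(remaining, key=lambda c: coverage(sim, selected + [c]))
        selected.append(best)
    return selected


if __name__ == "__main__":
    print(rank_individually(SIM, 2))   # two near-duplicate salary columns
    print(select_set_greedy(SIM, 2))   # one salary column plus the department name
```

With these toy scores, per-element ranking spends both slots on near-duplicate salary columns and never retrieves `department.name`, while the set-level objective covers both needs with the same budget of two columns.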

