Jan. 18, 2024, 12:23 p.m. | Chloe Caron

DEV Community dev.to

Poor data quality can arise in any type of data, whether numerical, textual, or other. As we saw in the last article of this series, LLMs like OpenAI's are quite effective at detecting anomalies in textual data. However, the OpenAI anomaly detector really struggled with numerical data, reaching an accuracy of only 68% even after multiple forms of prompt engineering were applied (compared to ~100% for textual data).


This is to be expected, since OpenAI's model is a large language model and …

