June 22, 2023, 9:57 a.m. | Andrew Betts

DEV Community dev.to

AI-powered content generation has exploded in popularity recently, with bots like ChatGPT and Bard, but the huge amounts of data these bots require come from harvesting the web. What if you don’t want your content feeding the bots? Some bots respect robots.txt; others honor a new ‘noai’ header tag.
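
As a rough illustration of the two opt-out signals mentioned above, here is a minimal Python sketch of how a well-behaved crawler might check them before using a page. It assumes the ‘noai’ signal arrives as a directive in an `X-Robots-Tag` response header, which is how the convention is commonly described but is not a formal standard. The bot name `ExampleAIBot`, the sample robots.txt, and both function names are hypothetical.

```python
from typing import Mapping
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def opts_out_of_ai(headers: Mapping[str, str]) -> bool:
    """Return True if the response headers carry a 'noai' directive.

    Assumption: the directive is sent in an X-Robots-Tag header,
    possibly alongside other comma-separated directives.
    """
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":
            directives = {d.strip().lower() for d in value.split(",")}
            if "noai" in directives:
                return True
    return False


# Hypothetical robots.txt that blocks one AI crawler from the whole site.
SAMPLE_ROBOTS = """User-agent: ExampleAIBot
Disallow: /
"""

if __name__ == "__main__":
    page = "https://example.com/post"
    print(allowed_by_robots(SAMPLE_ROBOTS, "ExampleAIBot", page))
    print(opts_out_of_ai({"X-Robots-Tag": "noai, noimageai"}))
```

Both checks are advisory: they only work when the crawler chooses to run them, which is why the distinction between bots that respect these signals and bots that do not matters.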


An article in Vice recently drew attention to the way AI bots are harvesting the web, in some cases quite aggressively. Site owners quite reasonably want to protect the originality …

