Aug. 8, 2023, 7:49 p.m. | Ervin Szilagyi

DEV Community dev.to

We can now block OpenAI's GPTBot crawler from scraping the pages of a site we control by adding the following lines to its robots.txt file:



User-agent: GPTBot
Disallow: /
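
To check that the rule does what we expect before deploying it, here is a minimal sketch using Python's standard-library urllib.robotparser. The is_allowed helper, the example.com URL, and the "SomeOtherBot" user agent are hypothetical names used only for illustration. Note that robots.txt is advisory: this only shows how a compliant parser reads the rule, not that any given crawler honors it.

```python
from urllib.robotparser import RobotFileParser

# The same two lines as in the robots.txt snippet above.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if user_agent may fetch url under the given robots.txt text.

    Hypothetical helper for illustration; it parses the text in memory
    instead of downloading it from a live site.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com is a placeholder for a site we control.
gptbot_allowed = is_allowed(ROBOTS_TXT, "GPTBot", "https://example.com/posts/1")
# A crawler not named in the file is unaffected by the GPTBot rule.
other_allowed = is_allowed(ROBOTS_TXT, "SomeOtherBot", "https://example.com/posts/1")

print("GPTBot allowed:", gptbot_allowed)
print("SomeOtherBot allowed:", other_allowed)
```

Because the rule names only GPTBot, other crawlers such as search engines keep their access; blocking them would need their own User-agent entries.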


I found out about this from a tweet by Gergely Orosz:



My stance on this is similar to Gergely's. GPT offers no citations for the information it provides. While I did update the robots.txt file on my personal website, I am also cross-posting to DEV. If we …

