May 6, 2024, 1:13 a.m. | Alberto Madin Rivera

DEV Community dev.to

This code extracts information from the Stack Overflow questions page and stores it in a text file.


We start by importing the necessary libraries: requests to make the HTTP requests, BeautifulSoup from bs4 to parse the HTML, and pandas to handle the data as a DataFrame.



import requests
from bs4 import BeautifulSoup
import pandas as pd

# User agent to reduce the risk of being blocked
headers = {
"user-agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/71.0.3578.80 Chrome/71.0.3578.80 Safari/537.36" …
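
The rest of the original script is cut off above, so the following is only a minimal sketch of the parse-and-save step, not the author's code. It assumes the page HTML has already been downloaded (for example with requests and the headers shown above) and works on a small made-up HTML sample instead. The markup, the CSS selectors (`div.question`, `h3 a`, `p.excerpt`), the column names, and the helper names `parse_questions` and `save_as_text` are all hypothetical; the real Stack Overflow page uses its own structure, which would need to be inspected first.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the downloaded page.
# The real Stack Overflow HTML has different structure and class names.
SAMPLE_HTML = """
<div class="question">
  <h3><a href="/questions/1/example-one">Example question one</a></h3>
  <p class="excerpt">First excerpt.</p>
</div>
<div class="question">
  <h3><a href="/questions/2/example-two">Example question two</a></h3>
  <p class="excerpt">Second excerpt.</p>
</div>
"""


def parse_questions(html: str) -> pd.DataFrame:
    """Collect title, URL and excerpt for each question block into a DataFrame."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for block in soup.select("div.question"):  # assumed selector
        link = block.select_one("h3 a")
        excerpt = block.select_one("p.excerpt")
        if link is None:
            continue  # skip blocks that do not look like a question
        rows.append(
            {
                "title": link.get_text(strip=True),
                "url": link.get("href", ""),
                "excerpt": excerpt.get_text(strip=True) if excerpt else "",
            }
        )
    return pd.DataFrame(rows, columns=["title", "url", "excerpt"])


def save_as_text(df: pd.DataFrame, path: str) -> None:
    """Write the DataFrame to a tab-separated text file, one question per line."""
    df.to_csv(path, sep="\t", index=False)


if __name__ == "__main__":
    questions = parse_questions(SAMPLE_HTML)
    save_as_text(questions, "questions.txt")
```

Keeping the parsing separate from the download makes it possible to try the selectors on a saved copy of the page without sending repeated requests to the site.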

