Jan. 6, 2022, 1:18 p.m. | Heiko Hotz

Towards Data Science - Medium towardsdatascience.com

How to leverage the Kaggle Python API to download any dataset from their website

Photo by Alexander Sinn on Unsplash

What is this about?

I recently wanted to use the arXiv dataset (licensed under the Creative Commons CC0 1.0 Universal Public Domain Dedication) for one of my NLP projects and tried to download it from the Hugging Face (HF) dataset hub. When I did, I received this message:

Image by author

It looks like I had to download …
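
As a rough illustration of the approach named in the subtitle, here is a minimal sketch of downloading a dataset with the official `kaggle` Python package. It assumes the package is installed and that API credentials are available (for example in `~/.kaggle/kaggle.json`). The slug `Cornell-University/arxiv`, the `data` target folder, and both helper function names are assumptions made for this example and are not taken from the article.

```python
from pathlib import Path


def split_dataset_slug(slug: str) -> tuple[str, str]:
    """Split a Kaggle dataset slug of the form 'owner/dataset-name'."""
    parts = slug.strip().split("/")
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"expected 'owner/dataset-name', got {slug!r}")
    return parts[0], parts[1]


def download_kaggle_dataset(slug: str, dest: str = "data") -> Path:
    """Download and unzip a Kaggle dataset into `dest` (hypothetical helper)."""
    split_dataset_slug(slug)  # fail early on a malformed slug

    # Imported lazily because the kaggle package looks for credentials
    # as soon as it is imported.
    from kaggle.api.kaggle_api_extended import KaggleApi

    api = KaggleApi()
    api.authenticate()  # reads kaggle.json or KAGGLE_USERNAME / KAGGLE_KEY

    target = Path(dest)
    target.mkdir(parents=True, exist_ok=True)
    api.dataset_download_files(slug, path=str(target), unzip=True)
    return target


if __name__ == "__main__":
    # Assumed slug for the arXiv dataset on Kaggle; check the dataset page.
    folder = download_kaggle_dataset("Cornell-University/arxiv")
    print(f"Files downloaded to {folder.resolve()}")
```

The import sits inside the function so that the slug check can be used, and tested, on a machine without Kaggle credentials.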

