Loading a large, messy csv using data.table fread with cli tools
April 21, 2022, midnight | R on Redwall Analytics
R-bloggers www.r-bloggers.com
Setup
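# Packages: data.table for fread(), here for project-relative paths,
# glue for string interpolation, tictoc for timing
library(data.table)
library(here)
## here() starts at /Users/davidlucey/Desktop/David/Projects/redwall-analytics
library(glue)
## Warning: package 'glue' was built under R version 4.1.2
library(tictoc)
# Let data.table use up to 90% of the available logical CPUs
setDTthreads(percent = 90)
# Folder holding the data, and the full path to the original csv
path_to_data <- "~/Desktop/David/Projects/uscompanies/data"
path_to_original <- here::here(path_to_data, "uscompanieslist.csv")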
Introduction
On a recent side project, we encountered a large (7 GB) csv of more than 30 million US business names and addresses that couldn’t be loaded into R because of corrupted records. While not widely discussed, we have known for some time that it was possible ...
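The title refers to combining fread with command-line tools. As a rough illustration of what that can look like, and not necessarily the approach taken in the full post, the sketch below uses the `cmd` argument of `fread()` to filter a file with `awk` before parsing it. The toy file, its column names, and the "exactly three fields" rule are all hypothetical, and `awk` is assumed to be available on the PATH.

```r
library(data.table)

# Hypothetical toy file standing in for the large csv. The third data row
# has an extra field, which is one simple kind of corrupted record.
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "name,city,state",
  "Acme Inc,Austin,TX",
  "Bolt LLC,Portland,ME",
  "Broken Co,Denver,CO,EXTRA",
  "Delta Corp,Boston,MA"
), tmp)

# Assumption: awk is on the PATH. Keep only lines with exactly 3
# comma-separated fields, then let fread parse the filtered output.
filter_cmd <- paste("awk -F',' 'NF == 3'", shQuote(tmp))
clean <- fread(cmd = filter_cmd, header = TRUE)

print(clean)
```

Counting commas this way is naive: it would also drop valid rows whose quoted fields contain commas, so a real file would likely need a more careful filter.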