Jan. 25, 2024, 7:21 a.m. | Jean-Claude Cote

Towards Data Science - Medium towardsdatascience.com

A practical guide to optimizing non-equi joins in Spark


Enriching network events with IP geolocation information is a crucial task, especially for organizations like the Canadian Centre for Cyber Security, the national CSIRT of Canada. In this article, we demonstrate how to optimize Spark SQL joins, focusing on scenarios that involve non-equality conditions, a common challenge when working with IP geolocation data.
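To make "non-equality condition" concrete: an event's IP address rarely equals a key in a geolocation table; it has to fall between a range's start and end address. The sketch below is plain Python, not Spark, and is not the article's method. The `GEO_RANGES` data, the country codes, and all function names are hypothetical, and the ranges are assumed not to overlap. It contrasts a naive scan that tests the range condition against every row with a sorted lookup that uses binary search.

```python
import ipaddress
from bisect import bisect_right

# Hypothetical geolocation table: (start_ip, end_ip, country).
# Assumption: ranges do not overlap.
GEO_RANGES = [
    ("10.0.0.0", "10.0.0.255", "CA"),
    ("10.0.1.0", "10.0.1.255", "US"),
    ("192.168.0.0", "192.168.255.255", "FR"),
]


def ip_to_int(ip: str) -> int:
    """Convert a dotted IPv4 string to its 32-bit integer value."""
    return int(ipaddress.IPv4Address(ip))


def naive_lookup(ip: str, ranges):
    """Test the non-equality condition start <= ip <= end against every row."""
    n = ip_to_int(ip)
    for start, end, country in ranges:
        if ip_to_int(start) <= n <= ip_to_int(end):
            return country
    return None


def build_index(ranges):
    """Sort the ranges by start address so they can be binary searched."""
    rows = sorted((ip_to_int(s), ip_to_int(e), c) for s, e, c in ranges)
    starts = [row[0] for row in rows]
    return starts, rows


def indexed_lookup(ip: str, index):
    """Find the last range starting at or before ip, then check its end."""
    starts, rows = index
    n = ip_to_int(ip)
    pos = bisect_right(starts, n) - 1
    if pos >= 0 and n <= rows[pos][1]:
        return rows[pos][2]
    return None


def enrich(event_ips, ranges):
    """Attach a country (or None) to each event IP address."""
    index = build_index(ranges)
    return [(ip, indexed_lookup(ip, index)) for ip in event_ips]
```

Calling `enrich(["10.0.1.7", "8.8.8.8"], GEO_RANGES)` pairs each address with its matching country, or `None` when no range contains it. The naive version does work proportional to events times ranges, which is the cost profile that makes range-condition joins worth optimizing.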

As cybersecurity practitioners, our reliance on enriching network events …
