Dec. 10, 2023, 11:57 a.m. | /u/LieTechnical1662

Data Science www.reddit.com

Hi all, I have been given the task of doing customer segmentation using clustering. My data is huge (68M) and we use PySpark, so I can't convert it to a pandas DataFrame. However, I can't find anything solid on DBSCAN in PySpark. Can someone please help me out if they have done it? Any resources would be great.

PS: the data is financial.

