April 2, 2024, 12:32 a.m. | Akmal Chaudhri

DEV Community dev.to




Abstract


Continuing our series on using Apache Spark with SingleStore, we'll look at a simple example of how to read data from a set of local text files, create vector embeddings, and save both the file data and the embeddings in SingleStore using Spark's Structured Streaming.


The notebook file used in this article is available on GitHub.
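
To make the flow described in the abstract concrete, here is a minimal sketch of how such a pipeline might be wired up in PySpark. It is not the notebook's code. The helper names (`embed_text`, `embed_as_json`, `start_stream`), the column names, and the `conn` dictionary are hypothetical, and the hashing-based embedding is only a stand-in for a real embedding model. The sketch assumes the SingleStore Spark connector is on the classpath and writes each micro-batch through `foreachBatch`, storing the embedding as a JSON string.

```python
import hashlib
import json


def embed_text(text, dim=8):
    """Placeholder embedding (assumption): hash each word into one of `dim`
    buckets and L2-normalise. Swap in a real embedding model in practice."""
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0
    norm = sum(v * v for v in vec) ** 0.5
    return [v / norm for v in vec] if norm else vec


def embed_as_json(text):
    """Serialise the embedding to a JSON string so it fits a text/JSON column."""
    return json.dumps(embed_text(text or ""))


def start_stream(spark, input_dir, table, checkpoint_dir, conn):
    """Stream whole text files from `input_dir` into a SingleStore table.

    `conn` is a hypothetical dict with "ddlEndpoint", "user" and "password".
    """
    # Imported lazily so the helpers above can be used without Spark installed.
    from pyspark.sql.functions import col, input_file_name, udf
    from pyspark.sql.types import StringType

    embed_udf = udf(embed_as_json, StringType())

    files = (
        spark.readStream
        .text(input_dir, wholetext=True)          # one row per file
        .withColumn("filename", input_file_name())
        .withColumnRenamed("value", "content")
        .withColumn("embedding", embed_udf(col("content")))
    )

    def write_batch(batch_df, batch_id):
        # Each micro-batch is written with the connector's batch writer.
        (
            batch_df.write
            .format("singlestore")
            .option("ddlEndpoint", conn["ddlEndpoint"])
            .option("user", conn["user"])
            .option("password", conn["password"])
            .mode("append")
            .save(table)
        )

    return (
        files.writeStream
        .foreachBatch(write_batch)
        .option("checkpointLocation", checkpoint_dir)
        .start()
    )
```

Calling `start_stream(spark, "data/", "spark_demo.docs", "checkpoints/", conn)` would return a streaming query that picks up new files as they land in the directory; the directory, table name, and checkpoint path here are illustrative only.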





Create a SingleStore Cloud account


A previous article showed the steps to create a free SingleStore Cloud account. We'll use the following settings: …
