April 2, 2024, 12:32 a.m. | Akmal Chaudhri

DEV Community dev.to




Abstract


Continuing our series on using Apache Spark with SingleStore, we'll work through a simple example that reads data from a set of local text files, creates vector embeddings, and saves both the file data and the embeddings in SingleStore using Spark's Structured Streaming.


The notebook file used in this article is available on GitHub.
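
The pipeline described in the abstract could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the notebook. `toy_embedding` is a hypothetical stand-in for a real embedding model. The function `start_stream` and its parameters `input_dir`, `table`, `checkpoint_dir`, and `connector_options` are also hypothetical. The sketch assumes the SingleStore Spark connector is on the classpath and registered under the format name `singlestore`, and that the target table can store the embedding as a JSON string.

```python
import hashlib
import json
import math


def toy_embedding(text, dims=8):
    """Hypothetical stand-in for a real embedding model.

    Hashes each whitespace-separated token into one of `dims` buckets
    and L2-normalises the bucket counts.
    """
    vec = [0.0] * dims
    for token in text.lower().split():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        vec[digest[0] % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def start_stream(input_dir, table, checkpoint_dir, connector_options):
    """Start a streaming query: text files in, rows with embeddings out.

    All four parameters are illustrative names, not taken from the article.
    """
    # PySpark is imported lazily so toy_embedding can be tried without Spark.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import input_file_name, udf
    from pyspark.sql.types import StringType

    spark = SparkSession.builder.appName("text-to-embeddings").getOrCreate()

    # Serialise the embedding as a JSON string (an assumption about the schema).
    embed_udf = udf(lambda text: json.dumps(toy_embedding(text)), StringType())

    files = (
        spark.readStream
        .format("text")
        .option("wholetext", "true")  # one row per file rather than per line
        .load(input_dir)
        .withColumnRenamed("value", "content")
        .withColumn("filename", input_file_name())
        .withColumn("embedding", embed_udf("content"))
    )

    def write_batch(batch_df, batch_id):
        # Each micro-batch is written with the ordinary batch writer.
        (batch_df.write
            .format("singlestore")  # assumed connector format name
            .options(**connector_options)
            .mode("append")
            .save(table))

    return (
        files.writeStream
        .foreachBatch(write_batch)
        .option("checkpointLocation", checkpoint_dir)
        .start()
    )
```

Writing through `foreachBatch` keeps the sketch independent of whether the connector offers a native streaming sink, since each micro-batch goes through the regular batch write path. In practice `toy_embedding` would be replaced by a call to whichever embedding model the notebook uses.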





Create a SingleStore Cloud account


A previous article showed the steps to create a free SingleStore Cloud account. We'll use the following settings: …

