Dec. 1, 2023, 9:43 p.m. | Eric Zhù

Towards Data Science - Medium towardsdatascience.com

Why a static workload is insufficient, and what I learned by comparing HNSWLIB and DiskANN using a streaming workload

Vector databases are built for high-dimensional vector retrieval. Today, many vectors are embeddings generated by deep neural nets like GPTs and CLIP to represent data points such as pieces of text, images, or audio tracks. Embeddings are used in many applications like search engines, recommendation systems, and chatbots. You can index embeddings in a vector database, which uses an Approximate …
