March 26, 2024, 4:56 a.m.

Simon Willison's Weblog simonwillison.net

My binary vector search is better than your FP32 vectors


I'm still trying to get my head around this, but here's what I understand so far.


Embedding vectors as calculated by models such as OpenAI text-embedding-3-small are arrays of floating point values, which look something like this:


[0.0051681744, 0.017187592, -0.018685209, -0.01855924, -0.04725188...] - 1536 elements long


Different embedding models produce vectors of different lengths, but they tend to be anywhere from a few hundred to a few thousand numbers. If each float is 32 bits …
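
To make the storage arithmetic concrete, here is a rough sketch in Python. It assumes a 1536-dimension vector (the default output size of text-embedding-3-small) and assumes that "binary" means keeping one bit per dimension, set when the float is greater than zero. That threshold is a common convention for binary quantization, not something this excerpt spells out, and the helper names below are made up for illustration.

```python
def float32_bytes(dims: int) -> int:
    """Bytes needed to store `dims` 32-bit floats."""
    return dims * 4


def binary_bytes(dims: int) -> int:
    """Bytes needed to store `dims` single bits, rounded up to whole bytes."""
    return (dims + 7) // 8


def binarize(vector: list[float]) -> int:
    """Pack a float vector into an int: bit i is 1 when vector[i] > 0.

    Assumption: thresholding at zero. Other schemes exist.
    """
    bits = 0
    for i, value in enumerate(vector):
        if value > 0:
            bits |= 1 << i
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two packed vectors differ."""
    return bin(a ^ b).count("1")


DIMS = 1536  # assumed vector length

print(float32_bytes(DIMS))  # 6144 bytes as FP32
print(binary_bytes(DIMS))   # 192 bytes as bits, 32x smaller

# The first five values from the example vector above: two positive, three negative.
sample = [0.0051681744, 0.017187592, -0.018685209, -0.01855924, -0.04725188]
print(bin(binarize(sample)))  # 0b11
```

Comparing two binarized vectors then comes down to `hamming_distance`, which is an XOR followed by a bit count. That is much cheaper than a floating point dot product, at the cost of throwing away most of the information in each dimension.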

