March 26, 2024, 4:56 a.m.

Simon Willison's Weblog simonwillison.net

My binary vector search is better than your FP32 vectors


I'm still trying to get my head around this, but here's what I understand so far.


Embedding vectors as calculated by models such as OpenAI text-embedding-3-small are arrays of floating point values, which look something like this:


[0.0051681744, 0.017187592, -0.018685209, -0.01855924, -0.04725188...] - 1536 elements long


Different embedding models produce vectors of different lengths, but they tend to be hundreds up to low thousands of numbers long. If each float is 32 bits …
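To make the storage arithmetic concrete, here's a rough sketch in plain Python. It's my own illustration, not code from the post: the function names are made up, the 1,024-element length is just a round example, the second vector is invented for comparison, and I'm assuming the simplest binary scheme (one bit per dimension, 1 if the value is above zero, 0 otherwise), compared using Hamming distance.

```python
def fp32_bytes(n_dims: int) -> int:
    """Bytes needed to store n_dims values at 32 bits (4 bytes) each."""
    return n_dims * 4


def binary_bytes(n_dims: int) -> int:
    """Bytes needed at 1 bit per dimension, rounded up to whole bytes."""
    return (n_dims + 7) // 8


def binarize(vector: list[float]) -> int:
    """Pack a vector into an int, one bit per dimension.

    Assumption: a bit is 1 when the value is greater than zero, else 0.
    """
    bits = 0
    for value in vector:
        bits = (bits << 1) | (1 if value > 0 else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two packed vectors differ."""
    return bin(a ^ b).count("1")


# Illustrative length only; real models vary.
n = 1024
print(fp32_bytes(n), "bytes as FP32 vs", binary_bytes(n), "bytes as bits")

# First five values from the example above, plus an invented second vector.
first = binarize([0.0051681744, 0.017187592, -0.018685209, -0.01855924, -0.04725188])
second = binarize([0.01, -0.02, -0.03, 0.04, -0.05])
print(hamming_distance(first, second))
```

Under those assumptions the binary form takes 1/32 of the space, and comparing two vectors comes down to an XOR and a bit count. Whether search quality holds up after throwing away that much precision is a separate question that this sketch doesn't answer.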

