June 15, 2024, 2:39 p.m.

Simon Willison's Weblog simonwillison.net

Using DuckDB for Embeddings and Vector Search


Sören Brunk's comprehensive tutorial combining DuckDB 1.0, a subset of German Wikipedia from Hugging Face (loaded using Parquet), the BGE M3 embedding model and DuckDB's new vss extension for implementing an HNSW vector index.
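As a point of reference for what a vector index speeds up, here is a minimal sketch of exact (brute-force) nearest-neighbour search by cosine similarity in plain Python. An HNSW index returns approximately the same ranking without comparing the query against every row. The three-dimensional vectors and titles below are made-up stand-ins, not real BGE M3 embeddings, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, documents, k=2):
    """Return the k titles whose vectors are most similar to the query.

    This scans every document, which is the linear cost a vector
    index such as HNSW is designed to avoid.
    """
    scored = [(cosine_similarity(query, vec), title)
              for title, vec in documents.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [title for _, title in scored[:k]]


# Toy "embeddings" (assumed values for illustration only).
documents = {
    "Berlin": [0.9, 0.1, 0.0],
    "Hamburg": [0.8, 0.2, 0.1],
    "Photosynthese": [0.0, 0.1, 0.9],
}
query = [1.0, 0.0, 0.0]

print(top_k(query, documents))
```

With real embeddings the vectors would have far more dimensions and the table far more rows, which is where an approximate index starts to pay off over a full scan.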

Via @soebrunk

