Feb. 3, 2024, 11:13 p.m.

Simon Willison's Weblog simonwillison.net

Introducing Nomic Embed: A Truly Open Embedding Model


A new text embedding model from Nomic AI that supports sequences of up to 8192 tokens, claims better scores than many other models (including OpenAI's new text-embedding-3-small), and is available as both a hosted API and a run-yourself model. The model is Apache 2 licensed, and Nomic have released the full set of training data and code.


From the accompanying paper: "Full training of nomic-embed-text-v1 can be conducted in a single week on one 8xH100 …

