July 10, 2023, 4:56 p.m. | /u/alkibijad

Machine Learning www.reddit.com

I’m using a vector database to store image embeddings and run similarity search on them. If I take the ten most similar vectors, I sometimes end up in an echo chamber of near-duplicates or overly similar images. I would like to diversify the results so that they are all close to the input vector but different from one another.

Are there common patterns/algorithms for this type of diversification?

The idea that I have: I want to pick …
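For reference, one commonly used pattern for this kind of goal is maximal marginal relevance (MMR): fetch more candidates than needed, then greedily pick the one that best balances similarity to the query against similarity to what has already been picked. The snippet below is only a minimal sketch under assumptions, not necessarily the idea outlined above. The names `cosine`, `mmr_select`, and `lam` are hypothetical, the vectors are toy data, and in practice the candidates would come from an over-fetched top-N query against the vector database.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def mmr_select(query, candidates, k=10, lam=0.5):
    """Greedy MMR re-ranking (hypothetical helper): return indices of up to k candidates.

    lam=1.0 ranks purely by similarity to the query; lower values
    penalise candidates that resemble ones already selected.
    """
    relevance = [cosine(query, c) for c in candidates]
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        def score(i):
            # Redundancy = highest similarity to anything picked so far.
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance[i] - (1 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy data (assumed): candidates 0 and 1 are near-duplicates of each other.
query = [1.0, 0.0]
candidates = [
    [1.0, 0.5],
    [1.0, 0.52],
    [1.0, -0.6],
    [0.0, 1.0],
]

print(mmr_select(query, candidates, k=2, lam=1.0))  # -> [0, 1]
print(mmr_select(query, candidates, k=2, lam=0.5))  # -> [0, 2]
```

With `lam=1.0` the two near-duplicates are returned together. With `lam=0.5` the second slot goes to a candidate that is still close to the query but points in a different direction from the first pick. The value of `lam` and the size of the over-fetched candidate pool are tuning choices that depend on the data.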
