(Okapi) BM25 using hierarchically clustered keywords
Hey, all! Hope you are doing well!
Do you know of any work that tries to do Okapi BM25 matching using hierarchically clustered words?
Relabeling all tokens of a subtree to the same value would merge similar words into a single token_id. Lower subtrees contain words that are closer to each other, so the level at which the tree is cut controls how aggressively words are merged. This would act as query and document enrichment. And now, with robust word embeddings and clustering algorithms, this approach seems feasible.
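To make the idea concrete, here is a minimal sketch, not a reference implementation. It assumes a hypothetical hand-written hierarchy (`WORD_PATHS`) standing in for a tree that would really come from clustering word embeddings, and the names `relabel`, `bm25_scores`, and `clustered_bm25` are made up for illustration. The scoring uses one common BM25 variant with a smoothed, non-negative IDF.

```python
import math
from collections import Counter

# Hypothetical hierarchy: each word maps to its path from the root cluster
# down to its leaf cluster. In practice this would be derived from
# hierarchical clustering of word embeddings.
WORD_PATHS = {
    "car": ("vehicle", "road", "car"),
    "automobile": ("vehicle", "road", "car"),
    "truck": ("vehicle", "road", "truck"),
    "boat": ("vehicle", "water", "boat"),
    "ship": ("vehicle", "water", "boat"),
}


def relabel(tokens, depth):
    """Replace each known token by its ancestor cluster id at `depth`.

    Smaller depth = higher cut in the tree = coarser merging.
    Tokens missing from the hierarchy are left unchanged.
    """
    out = []
    for tok in tokens:
        path = WORD_PATHS.get(tok)
        out.append("/".join(path[:depth]) if path else tok)
    return out


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every tokenized document against a tokenized query with BM25."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for q in set(query):
            if q not in tf:
                continue
            idf = math.log((n - df[q] + 0.5) / (df[q] + 0.5) + 1.0)
            norm = tf[q] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[q] * (k1 + 1) / norm
        scores.append(score)
    return scores


def clustered_bm25(query, docs, depth):
    """Relabel query and documents at the same tree depth, then run BM25."""
    return bm25_scores(relabel(query, depth), [relabel(d, depth) for d in docs])
```

With the documents `["fast", "automobile"]`, `["old", "ship"]`, `["red", "truck"]` and the query `["car"]`, plain `bm25_scores` matches nothing, `clustered_bm25(..., depth=3)` matches only the "automobile" document, and `depth=2` also matches the "truck" document. The cut depth is therefore the knob that trades recall against precision.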
Also, this is a fairly obvious idea, so someone must have already done it. Do you know of any work on this?