May 14, 2024, 4:50 a.m. | Dominik J. Schindler, Sneha Jha, Xixuan Zhang, Kilian Buehling, Annett Heft, Mauricio Barahona

cs.CL updates on

arXiv:2405.07764v1 Announce Type: new
Abstract: Expanding a dictionary of pre-selected keywords is crucial for tasks in information retrieval, such as database query and online data collection. Here we propose Local Graph-based Dictionary Expansion (LGDE), a method that uses tools from manifold learning and network science for the data-driven discovery of keywords starting from a seed dictionary. At the heart of LGDE lies the creation of a word similarity graph derived from word embeddings and the application of local community detection …
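The pipeline the abstract outlines (a word similarity graph built from embeddings, then a local community grown around the seed keywords) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy embeddings, the k-nearest-neighbour graph construction, and the greedy conductance-based expansion are all stand-ins chosen for brevity, and the function names `knn_graph`, `conductance` and `expand` are hypothetical. The truncated abstract does not specify which graph construction or local community detection method LGDE actually uses.

```python
import math

# Hypothetical toy embeddings, made up for illustration (not from the paper).
EMBEDDINGS = {
    "vaccine": (0.9, 0.1, 0.0),
    "vaccination": (0.85, 0.15, 0.05),
    "jab": (0.8, 0.2, 0.1),
    "football": (0.0, 0.9, 0.1),
    "soccer": (0.05, 0.85, 0.15),
    "goal": (0.1, 0.8, 0.2),
}


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def knn_graph(embeddings, k=2):
    """Undirected weighted graph linking each word to its k most similar words.

    Assumption: a plain kNN graph stands in for whatever similarity graph
    the method really builds.
    """
    graph = {word: {} for word in embeddings}
    for word, vec in embeddings.items():
        sims = sorted(
            ((cosine(vec, embeddings[other]), other)
             for other in embeddings if other != word),
            reverse=True,
        )
        for sim, other in sims[:k]:
            graph[word][other] = sim
            graph[other][word] = sim
    return graph


def conductance(graph, nodes):
    """Weight leaving `nodes` divided by the smaller side's total edge weight."""
    cut = sum(w for n in nodes for o, w in graph[n].items() if o not in nodes)
    vol = sum(w for n in nodes for w in graph[n].values())
    total = sum(w for n in graph for w in graph[n].values())
    denom = min(vol, total - vol)
    return cut / denom if denom > 0 else 1.0


def expand(graph, seeds):
    """Greedily add the neighbouring word that lowers conductance the most.

    Assumption: greedy conductance minimisation is a simple stand-in for
    a local community detection method.
    """
    community = set(seeds)
    while True:
        frontier = {o for n in community for o in graph[n]} - community
        best = min(
            frontier,
            key=lambda c: conductance(graph, community | {c}),
            default=None,
        )
        if best is None:
            return community
        if conductance(graph, community | {best}) >= conductance(graph, community):
            return community
        community.add(best)


if __name__ == "__main__":
    graph = knn_graph(EMBEDDINGS, k=2)
    print(sorted(expand(graph, {"vaccine"})))
```

On this toy data, the seed "vaccine" grows into the three vaccine-related words and never reaches the football cluster, because the two groups end up in separate components of the kNN graph. Real embedding spaces are far less clean, which is where the choice of graph construction and local community method starts to matter.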


