May 3, 2024, 4:14 a.m. | Sindhu Tipirneni, Ravinarayana Adkathimar, Nurendra Choudhary, Gaurush Hiranandani, Rana Ali Amjad, Vassilis N. Ioannidis, Changhe Yuan, Chandan K. Re

cs.CL updates on arXiv.org arxiv.org

arXiv:2405.00988v1 Announce Type: new
Abstract: Despite the remarkable success of Large Language Models (LLMs) in text understanding and generation, their potential for text clustering tasks remains underexplored. We observed that powerful closed-source LLMs provide good quality clusterings of entity sets but are not scalable due to the massive compute power required and the associated costs. Thus, we propose CACTUS (Context-Aware ClusTering with aUgmented triplet losS), a systematic approach that leverages open-source LLMs for efficient and effective supervised clustering of entity …

abstract arxiv clustering compute context cs.cl cs.lg good language language models large language large language models llms massive power quality scalable success tasks text text understanding type understanding

Senior Machine Learning Engineer

@ GPTZero | Toronto, Canada

ML/AI Engineer / NLP Expert - Custom LLM Development (x/f/m)

@ HelloBetter | Remote

Doctoral Researcher (m/f/div) in Automated Processing of Bioimages

@ Leibniz Institute for Natural Product Research and Infection Biology (Leibniz-HKI) | Jena

Seeking Developers and Engineers for AI T-Shirt Generator Project

@ Chevon Hicks | Remote

Principal Data Architect - Azure & Big Data

@ MGM Resorts International | Home Office - US, NV

GN SONG MT Market Research Data Analyst 11

@ Accenture | Bengaluru, BDC7A