G2T: A Simple but Effective Framework for Topic Modeling based on Pretrained Language Model and Community Detection. (arXiv:2304.06653v2 [cs.CL] UPDATED) | allainews.com

April 17, 2023, 8:22 p.m. | Leihang Zhang, Jiapeng Liu, Qiang Yan

cs.CL updates on arXiv.org arxiv.org

It has been reported that clustering-based topic models, which cluster
high-quality sentence embeddings with an appropriate word selection method, can
generate better topics than generative probabilistic topic models. However,
these approaches suffer from the inability to select appropriate parameters and
incomplete models that overlook the quantitative relation between words with
topics and topics with text. To solve these issues, we propose graph to topic
(G2T), a simple but effective framework for topic modelling. The framework is
composed of four modules. …

arxiv cluster clustering community detection embeddings framework generative graph language language model modeling modelling modules pretrained language model quality quantitative text topic modeling topics word words

More from arxiv.org / cs.CL updates on arXiv.org

Sparse is Enough in Fine-tuning Pre-trained Large Language Models 19 hours ago | arxiv.org

arxiv cs.ai cs.cl cs.lg +6

On the Learnability of Watermarks for Language Models 19 hours ago | arxiv.org

abstract arxiv cs.cl cs.cr +17

StableSSM: Alleviating the Curse of Memory in State-space Models through Stable Reparameterization 19 hours ago | arxiv.org

abstract arxiv capabilities cs.ai +14

Evaluating Generative Ad Hoc Information Retrieval 19 hours ago | arxiv.org

abstract advances arxiv cs.cl +19

Language Models As Semantic Indexers 19 hours ago | arxiv.org

arxiv cs.cl cs.ir cs.lg +4

Large language models can accurately predict searcher preferences 19 hours ago | arxiv.org

abstract arxiv cs.ai cs.cl +16

On the Reliability of Watermarks for Large Language Models 19 hours ago | arxiv.org

abstract arxiv become bots +28

A Watermark for Large Language Models 19 hours ago | arxiv.org

abstract arxiv cs.cl cs.cr +16

CreoleVal: Multilingual Multitask Benchmarks for Creoles 19 hours ago | arxiv.org

abstract annotated data arxiv benchmarks +14

AI Research Scientist

@ Vara | Berlin, Germany and Remote

View on ai-jobs.net

Data Architect

@ University of Texas at Austin | Austin, TX

View on ai-jobs.net

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

View on ai-jobs.net

Lead GNSS Data Scientist

@ Lurra Systems | Melbourne

View on ai-jobs.net

Data Science Analyst

@ Mayo Clinic | AZ, United States

View on ai-jobs.net

Sr. Data Scientist (Network Engineering)

@ SpaceX | Redmond, WA

View on ai-jobs.net