From Text Segmentation to Smart Chaptering: A Novel Benchmark for Structuring Video Transcriptions

Feb. 28, 2024, 5:49 a.m. | Fabian Retkowski, Alexander Waibel

cs.CL updates on arXiv.org

arXiv:2402.17633v1 Announce Type: new
Abstract: Text segmentation is a fundamental task in natural language processing, where documents are split into contiguous sections. However, prior research in this area has been constrained by limited datasets, which are either small in scale, synthesized, or only contain well-structured documents. In this paper, we address these limitations by introducing a novel benchmark, YTSeg, focusing on spoken content that is inherently more unstructured and both topically and structurally diverse. As part of this work, we …
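
To make the task definition concrete, here is a minimal sketch of text segmentation framed as per-sentence boundary labeling. This is one common way to represent the task and is an assumption here; the abstract does not say how YTSeg encodes its segments. The function name `split_into_segments`, the label convention (1 marks the last sentence of a section), and the toy transcript are all hypothetical.

```python
from typing import List


def split_into_segments(sentences: List[str], boundaries: List[int]) -> List[List[str]]:
    """Group sentences into contiguous sections.

    Assumed convention (not taken from the paper): boundaries[i] == 1
    means sentence i is the last sentence of its section.
    """
    if len(sentences) != len(boundaries):
        raise ValueError("need exactly one boundary label per sentence")

    segments: List[List[str]] = []
    current: List[str] = []
    for sentence, is_end in zip(sentences, boundaries):
        current.append(sentence)
        if is_end:
            segments.append(current)
            current = []
    if current:
        # Trailing sentences with no closing boundary form a final section.
        segments.append(current)
    return segments


# Toy transcript, invented for illustration only.
transcript = [
    "Welcome to the channel.",
    "Today we look at sourdough.",
    "First, mix flour and water.",
    "Let it rest overnight.",
    "Thanks for watching.",
]
labels = [0, 1, 0, 1, 1]
segments = split_into_segments(transcript, labels)
```

Because every sentence lands in exactly one section and the original order is preserved, concatenating the sections reproduces the original transcript, which is what "contiguous sections" requires.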
