March 29, 2024, 4:46 a.m. | Jiawei Wang, Kai Hu, Zhuoyao Zhong, Lei Sun, Qiang Huo

cs.CV updates on arXiv.org

arXiv:2401.11874v2 Announce Type: replace
Abstract: Document structure analysis (also known as document layout analysis) is crucial for understanding the physical layout and logical structure of documents, with applications in information retrieval, document summarization, and knowledge extraction, among others. In this paper, we concentrate on Hierarchical Document Structure Analysis (HDSA) to explore hierarchical relationships within structured documents created using authoring software that employs hierarchical schemas, such as LaTeX, Microsoft Word, and HTML. To comprehensively analyze hierarchical document structures, we propose a tree-construction-based approach that …

