March 14, 2024, 4:48 a.m. | Changbing Yang, Garrett Nicolai, Miikka Silfverberg


arXiv:2403.08189v1
Abstract: We investigate automatic interlinear glossing in low-resource settings. We augment a hard-attentional neural model with embedded translation information extracted from interlinear glossed text. After encoding these translations using large language models, specifically BERT and T5, we introduce a character-level decoder for generating glossed output. Aided by these enhancements, our model demonstrates an average improvement of 3.97 percentage points over the previous state of the art on datasets from the SIGMORPHON 2023 Shared Task on Interlinear Glossing. In …


