Feb. 15, 2024, 5:46 a.m. | Sammy Khalife, Yann Ponty, Laurent Bulteau

cs.CL updates on arXiv.org

arXiv:2402.08830v1 Announce Type: cross
Abstract: Several popular language models represent local contexts in an input text x as bags of words. Such representations are naturally encoded by a sequence graph whose vertices are the distinct words occurring in x, with edges representing the (ordered) co-occurrence of two words within a sliding window of size w. However, this compressed representation is not generally bijective, and may introduce some degree of ambiguity. Some sequence graphs may admit several realizations as a sequence, while …
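
To make the construction in the abstract concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function name `sequence_graph` is hypothetical, the graph is taken to be directed and unweighted, and "within a sliding window of size w" is assumed to mean that two words are at most w - 1 positions apart.

```python
def sequence_graph(words, w):
    """Build a directed, unweighted sequence graph from a list of words.

    Assumption: an edge (u, v) is added when v occurs after u at a
    distance of at most w - 1 positions.
    """
    vertices = set(words)  # distinct words occurring in the text
    edges = set()
    for i, u in enumerate(words):
        # Every later position that still shares a window of size w with i.
        for j in range(i + 1, min(i + w, len(words))):
            edges.add((u, words[j]))
    return vertices, edges


# Two different texts that yield the same graph under these assumptions,
# which illustrates why the representation is not bijective in general.
g1 = sequence_graph("a b a".split(), w=2)
g2 = sequence_graph("b a b".split(), w=2)
print(g1 == g2)
```

Under these assumptions, both texts produce the vertices {a, b} and the edges (a, b) and (b, a), so the graph alone cannot tell which sequence it came from.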
