Aug. 8, 2023, 12:20 p.m. | /u/___Daybreak___

Natural Language Processing www.reddit.com

I'm wondering what the academic community's thoughts are on the Linguistic Data Consortium. They seem to provide a wide range of language resources for different languages, but they're all behind a paywall. From an outsider's perspective (I'm an industry researcher btw), it runs contrary to the usual open science approach I see right now (e.g., uploading datasets on Hugging Face or on GitHub).

I also haven't seen a lot of work that cites LDC datasets (though I'm unsure), perhaps aside from ACE2004/2005, …

