Inference for Regression with Variables Generated from Unstructured Data
Feb. 27, 2024, 5:45 a.m. | Laura Battaglia, Timothy Christensen, Stephen Hansen, Szymon Sacher
stat.ML updates on arXiv.org arxiv.org
Abstract: The leading strategy for analyzing unstructured data uses two steps. First, latent variables of economic interest are estimated with an upstream information retrieval model. Second, the estimates are treated as "data" in a downstream econometric model. We establish theoretical arguments for why this two-step strategy leads to biased inference in empirically plausible settings. More constructively, we propose a one-step strategy for valid inference that uses the upstream and downstream models jointly. The one-step strategy (i) …
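One familiar channel through which treating estimates as "data" can distort a downstream regression is classical measurement error. The sketch below is a textbook illustration under that assumption only; it is not the authors' model, their bias argument, or their one-step procedure. The function names `ols_slope` and `simulate`, and all parameter values, are hypothetical choices made for this example.

```python
import random


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x (with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate(n=20000, beta=1.0, noise_sd=1.0, seed=0):
    """Compare a regression on the true latent variable with a two-step one.

    Assumption (ours, for illustration): the upstream model recovers the
    latent variable up to independent additive Gaussian noise.
    """
    rng = random.Random(seed)
    latent = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # "Step 1": a noisy upstream estimate of the latent variable.
    estimated = [z + rng.gauss(0.0, noise_sd) for z in latent]
    # The outcome depends on the true latent variable, not on the estimate.
    outcome = [beta * z + rng.gauss(0.0, 1.0) for z in latent]
    # "Step 2": downstream regression that treats the estimate as data.
    oracle = ols_slope(latent, outcome)
    two_step = ols_slope(estimated, outcome)
    return oracle, two_step


if __name__ == "__main__":
    oracle, two_step = simulate()
    print(f"slope using true latent variable: {oracle:.3f}")
    print(f"slope using estimated variable:   {two_step:.3f}")
```

With a unit-variance latent variable, the two-step slope in this toy setup converges to beta / (1 + noise_sd**2) rather than beta, so the downstream coefficient is attenuated even in large samples. Real upstream models can produce errors that are not classical, so the direction and size of the bias in practice may differ from this sketch.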