all AI news
The Harvard USPTO Patent Dataset: A Large-Scale, Well-Structured, and Multi-Purpose Corpus of Patent Applications. (arXiv:2207.04043v1 [cs.CL])
July 11, 2022, 1:10 a.m. | Mirac Suzgun, Luke Melas-Kyriazi, Suproteem K. Sarkar, Scott Duke Kominers, Stuart M. Shieber
cs.LG updates on arXiv.org arxiv.org
Innovation is a major driver of economic and social development, and
information about many kinds of innovation is embedded in semi-structured data
from patents and patent applications. Although the impact and novelty of
innovations expressed in patent data are difficult to measure through
traditional means, ML offers a promising set of techniques for evaluating
novelty, summarizing contributions, and embedding semantics. In this paper, we
introduce the Harvard USPTO Patent Dataset (HUPD), a large-scale,
well-structured, and multi-purpose corpus of English-language patent …
More from arxiv.org / cs.LG updates on arXiv.org
Jobs in AI, ML, Big Data
Artificial Intelligence – Bioinformatic Expert
@ University of Texas Medical Branch | Galveston, TX
Lead Developer (AI)
@ Cere Network | San Francisco, US
Research Engineer
@ Allora Labs | Remote
Ecosystem Manager
@ Allora Labs | Remote
Founding AI Engineer, Agents
@ Occam AI | New York
AI Engineer Intern, Agents
@ Occam AI | US