[D] NER for large text data | allainews.com

May 6, 2024, 10:06 a.m. | /u/Boring_Astronaut_421

Machine Learning www.reddit.com

Hello people,
I am currently working as a data scientist at a startup. We need to extract entities from a text corpus of about 10 billion tokens, and I don't know how to do this at such a scale. What should the pipeline look like? It would be helpful if you could share your knowledge or point me to a good research paper or blog post. We are currently working with 18 entity types, and my boss wants me to reach 93% accuracy.
Thank you
