May 20, 2022, 3:42 p.m. | /u/iblis3

Machine Learning www.reddit.com

I'm trying to extract relevant text from error logs (example below) and then classify it by error reason. I've been attempting this (mostly unsuccessfully) with the following process:

1. Use regex to filter out text between brackets and remove dates, times, IPs, file paths, etc.
2. Remove stop words and keep only real words
3. Put the resulting string of words through a sentence transformer (all-mpnet-base-v2 in this case) to get embeddings
4. Reduce dimensionality and cluster
5. Within clusters, use the most …
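
For illustration, steps 1 and 2 could be sketched roughly as below. This is only a minimal sketch and not necessarily how the original process was implemented: the regex patterns, the tiny `STOP_WORDS` set, the `clean_log_line` helper, and the sample log line are all hypothetical, and "real words" is approximated as alphabetic tokens of two or more letters rather than a dictionary lookup.

```python
import re

# Hypothetical, deliberately tiny stop-word list; a fuller list would likely be used in practice.
STOP_WORDS = {"a", "an", "the", "to", "of", "in", "on", "at", "for", "is", "was", "and", "or", "from"}

# Assumed patterns for the noise described in step 1, applied in order.
PATTERNS = [
    re.compile(r"\[[^\]]*\]"),                            # text between square brackets
    re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),                 # ISO-style dates
    re.compile(r"\b\d{2}:\d{2}:\d{2}(?:[.,]\d+)?\b"),     # times, optional fraction
    re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}(?::\d+)?\b"),  # IPv4 addresses, optional port
    re.compile(r"(?:[A-Za-z]:)?(?:[\\/][\w.\-]+)+"),      # Unix or Windows file paths
]


def clean_log_line(line: str) -> str:
    """Strip noisy fields, then keep lowercase alphabetic non-stop-word tokens."""
    for pattern in PATTERNS:
        line = pattern.sub(" ", line)
    # Approximation of "real words": alphabetic tokens with at least two letters.
    tokens = re.findall(r"[A-Za-z]{2,}", line.lower())
    return " ".join(t for t in tokens if t not in STOP_WORDS)


if __name__ == "__main__":
    # Made-up log line, for illustration only.
    sample = "[2022-05-20 15:42:01] ERROR 10.0.0.5:8080 failed to open /var/log/app.log: Permission denied"
    print(clean_log_line(sample))
```

The cleaned string is what would then be passed to the sentence transformer in step 3. One thing to note about this design is that removing stop words and non-dictionary tokens strips context the transformer might otherwise use, so it may be worth comparing embeddings with and without step 2.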

