Feb. 20, 2024, 8:34 a.m. | /u/Devinco001

Deep Learning www.reddit.com

I am having a ML based production application, using flask, deployed on GCP server using gunicorn workers. In each incoming request, a text sentence is received.

It is using sentence transformers (**All-MiniLM-L6-v2** model), which is loaded globally one time, to create embeddings of the incoming text and then use pre trained kmeans (also loaded globally) to predict/map it to a intent cluster. Basically, goal is to find intent of the sentence.

I have ample resources and the requests are also …

application cpu deeplearning embeddings flask gcp kmeans production server text transformers workers

AI Research Scientist

@ Vara | Berlin, Germany and Remote

Data Architect

@ University of Texas at Austin | Austin, TX

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

Lead GNSS Data Scientist

@ Lurra Systems | Melbourne

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

Senior Machine Learning Engineer

@ Samsara | Canada - Remote