Feb. 20, 2024, 8:34 a.m. | /u/Devinco001

Deep Learning www.reddit.com

I am having a ML based production application, using flask, deployed on GCP server using gunicorn workers. In each incoming request, a text sentence is received.

It is using sentence transformers (**All-MiniLM-L6-v2** model), which is loaded globally one time, to create embeddings of the incoming text and then use pre trained kmeans (also loaded globally) to predict/map it to a intent cluster. Basically, goal is to find intent of the sentence.

I have ample resources and the requests are also …

application cpu deeplearning embeddings flask gcp kmeans production server text transformers workers

Artificial Intelligence – Bioinformatic Expert

@ University of Texas Medical Branch | Galveston, TX

Lead Developer (AI)

@ Cere Network | San Francisco, US

Research Engineer

@ Allora Labs | Remote

Ecosystem Manager

@ Allora Labs | Remote

Founding AI Engineer, Agents

@ Occam AI | New York

AI Engineer Intern, Agents

@ Occam AI | US