Feb. 4, 2024, 12:22 p.m. | /u/louisbrulenaudet

Machine Learning www.reddit.com

# A new PyPI package for training sentence embedding models in just 2 lines.

The acquisition of sentence embeddings often necessitates a substantial volume of labeled data. However, in many cases and fields, labeled data is rarely accessible, and the procurement of such data is costly. In this project, we employ an unsupervised process grounded in pre-trained Transformers-based Sequential Denoising Auto-Encoder (TSDAE), introduced by the Ubiquitous Knowledge Processing Lab of Darmstadt, which can realize a performance level reaching 93.1% of …

acquisition autoencoder cases data denoising embedding embedding models embeddings fields machinelearning package pre-training procurement project pypi training transformers unsupervised

Founding AI Engineer, Agents

@ Occam AI | New York

AI Engineer Intern, Agents

@ Occam AI | US

AI Research Scientist

@ Vara | Berlin, Germany and Remote

Data Architect

@ University of Texas at Austin | Austin, TX

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

Machine Learning Research Scientist

@ d-Matrix | San Diego, Ca