Oct. 16, 2023, 2:11 p.m. | Silvia Onofrei

Towards Data Science - Medium towardsdatascience.com

Designed by Freepik

Combine TSDAE pre-training on a target domain with supervised fine-tuning on a general-purpose corpus to enhance the quality of the embeddings for a specialized domain.

Introduction

Embeddings encode text into high dimensional vector spaces, using dense vectors to represent words and to capture their semantic relationships. Recent developments in generative AI and LLM, such as context search and RAG rely heavily on the quality of their underlying embeddings. While the similarity searches use basic mathematical concepts such …

domain adaptation fine-tuning-transformer sentence-embedding transformers

Data Architect

@ University of Texas at Austin | Austin, TX

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

Lead GNSS Data Scientist

@ Lurra Systems | Melbourne

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

Research Scientist

@ Meta | Menlo Park, CA

Principal Data Scientist

@ Mastercard | O'Fallon, Missouri (Main Campus)