April 14, 2024, 12:14 p.m. | /u/henrythepaw

Machine Learning www.reddit.com

Does anyone know what the best models/benchmarks are for semantic code similarity search? In particular, I'd like to find a model trained primarily on C/C++ code, so that its embeddings can be used to compare how similar one function is to another, and also to compare functions against natural-language descriptions. I came across CodeT5, which looks quite good, but it isn't clear how much C/C++ code was used during training (it seems like …
