March 1, 2024, 4:19 p.m. | /u/nihalnayak

Machine Learning www.reddit.com

Excited to share our work on synthetic task generation.

Introducing Bonito 🐟, an open-source model that converts your raw, unannotated data into synthetic instruction tuning datasets. With it, you can easily create a specialized LLM for your proprietary and private data!

Check out our work below:
Paper: [https://arxiv.org/abs/2402.18334](https://arxiv.org/abs/2402.18334)
Code: [https://github.com/BatsResearch/bonito](https://github.com/BatsResearch/bonito)
Model: [https://huggingface.co/BatsResearch/bonito-v1](https://huggingface.co/BatsResearch/bonito-v1)
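
For readers who want a feel for what "converting unannotated data into instruction tuning datasets" can look like in code, here is a minimal sketch of the data flow only. It is not the Bonito library's API. The marker strings, the function names (`build_prompt`, `parse_generation`, `make_synthetic_dataset`), and the `fake_generate` stand-in for the model are all hypothetical, chosen for illustration; please check the repository above for the real prompt format and usage.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical marker strings (assumptions for this sketch; the real
# model's prompt format may differ, see the linked repository).
TASK_TYPE_TAG = "<|tasktype|>"
CONTEXT_TAG = "<|context|>"
TASK_TAG = "<|task|>"
PAIR_SEPARATOR = "<|pipe|>"


def build_prompt(context: str, task_type: str) -> str:
    """Combine a raw passage and a desired task type into one prompt."""
    return (
        f"{TASK_TYPE_TAG}\n{task_type.strip()}\n"
        f"{CONTEXT_TAG}\n{context.strip()}\n{TASK_TAG}\n"
    )


def parse_generation(generation: str, context: str) -> Optional[Dict[str, str]]:
    """Split a generation into an (instruction, response) pair.

    Returns None when the generation is malformed, so bad samples
    can be dropped instead of polluting the dataset.
    """
    parts = generation.split(PAIR_SEPARATOR)
    if len(parts) != 2:
        return None
    instruction, response = (part.strip() for part in parts)
    if not instruction or not response:
        return None
    # Assumed convention: the generator refers to the passage through a
    # "{{context}}" placeholder, which we fill in with the real text.
    instruction = instruction.replace("{{context}}", context.strip())
    return {"input": instruction, "output": response}


def make_synthetic_dataset(
    passages: List[str],
    task_type: str,
    generate: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Turn unannotated passages into instruction tuning examples."""
    examples = []
    for passage in passages:
        generation = generate(build_prompt(passage, task_type))
        example = parse_generation(generation, passage)
        if example is not None:
            examples.append(example)
    return examples


def fake_generate(prompt: str) -> str:
    """Stand-in for a real task-generation model (hypothetical output)."""
    return (
        "{{context}}\nQuestion: What is this passage about?"
        + PAIR_SEPARATOR
        + "A placeholder answer."
    )
```

In practice `generate` would wrap a call to the actual model, and the resulting list of `input`/`output` pairs would be used to fine-tune a downstream LLM on the target domain.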

