Building an LLM fine-tuning Dataset
March 6, 2024, 7:01 p.m. | sentdex
sentdex www.youtube.com
NVIDIA GTC signup: https://nvda.ws/3XTqlB6
Fine-tuning code: https://github.com/Sentdex/LLM-Finetuning
5000-step Walls1337bot adapter: https://huggingface.co/Sentdex/Walls1337bot-Llama2-7B-003.005.5000
WSB Dataset: https://huggingface.co/datasets/Sentdex/WSB-003.005
"I have every reddit comment" original reddit post and torrent info: https://www.reddit.com/r/datasets/comments/3bxlg7/i_have_every_publicly_available_reddit_comment/
2007-2015 Reddit Archive.org: https://archive.org/download/2015_reddit_comments_corpus/reddit_data/
Reddit BigQuery 2007-2019 (this has other data besides reddit comments too!): https://reddit.com/r/bigquery/comments/3cej2b/17_billion_reddit_comments_loaded_on_bigquery/
Contents:
0:00 - Introduction to Dataset building for fine-tuning.
02:53 - The Reddit dataset options (Torrent, Archive.org, BigQuery)
06:07 - Exporting BigQuery Reddit (and some …