Topic: rlhf
[D] Impact of solar storm on QLORA + RLHF of Llama3 8B?
1 week, 1 day ago | www.reddit.com
D2PO: Discriminator-Guided DPO with Response Evaluation Models
2 weeks, 3 days ago | arxiv.org
MetaRM: Shifted Distributions Alignment via Meta-Learning
2 weeks, 4 days ago | arxiv.org
A Survey of Reinforcement Learning from Human Feedback
2 weeks, 5 days ago | arxiv.org
Contrastive Preference Learning: Learning from Human Feedback without RL
2 weeks, 5 days ago | arxiv.org
High-Dimension Human Value Representation in Large Language Models
1 month, 1 week ago | arxiv.org
Latent Distance Guided Alignment Training for Large Language Models
1 month, 1 week ago | arxiv.org
Removing RLHF Protections in GPT-4 via Fine-Tuning
1 month, 1 week ago | arxiv.org
[D] Does RLHF really work? why do you use it?
1 month, 2 weeks ago | www.reddit.com
The 3 Best Alternatives to RLHF
1 month, 3 weeks ago | www.youtube.com
NVIDIA NIM RAG Optimization: QuietSTAR (Stanford)
1 month, 4 weeks ago | www.youtube.com
Items published with this topic over the last 90 days.