Sept. 18, 2023, 3:11 a.m. | Francesco Gadaleta

Data Science at Home datascienceathome.podbean.com

As a continuation of Episode 238, I explain some effective and fun attacks to conduct against LLMs. Such attacks are even more effective against locally served models, which are barely constrained by human feedback.
Have fun, and learn them responsibly.
 
References
https://www.jailbreakchat.com/
https://www.reddit.com/r/ChatGPT/comments/10tevu1/new_jailbreak_proudly_unveiling_the_tried_and/
https://arxiv.org/abs/2305.13860

