all AI news
Extracting Latent Steering Vectors from Pretrained Language Models. (arXiv:2205.05124v1 [cs.CL])
Web: http://arxiv.org/abs/2205.05124
May 12, 2022, 1:11 a.m. | Nishant Subramani, Nivedita Suresh, Matthew E. Peters
cs.LG updates on arXiv.org arxiv.org
Prior work on controllable text generation has focused on learning how to
control language models through trainable decoding, smart-prompt design, or
fine-tuning based on a desired objective. We hypothesize that the information
needed to steer the model to generate a target sentence is already encoded
within the model. Accordingly, we explore a different approach altogether:
extracting latent vectors directly from pretrained language model decoders
without fine-tuning. Experiments show that there exist steering vectors, which,
when added to the hidden states …
More from arxiv.org / cs.LG updates on arXiv.org
Latest AI/ML/Big Data Jobs
Director, Applied Mathematics & Computational Research Division
@ Lawrence Berkeley National Lab | Berkeley, Ca
Business Data Analyst
@ MainStreet Family Care | Birmingham, AL
Assistant/Associate Professor of the Practice in Business Analytics
@ Georgetown University McDonough School of Business | Washington DC
Senior Data Science Writer
@ NannyML | Remote
Director of AI/ML Engineering
@ Armis Industries | Remote (US only), St. Louis, California
Digital Analytics Manager
@ Patagonia | Ventura, California