Web: http://arxiv.org/abs/2205.05124

May 12, 2022, 1:11 a.m. | Nishant Subramani, Nivedita Suresh, Matthew E. Peters

cs.LG updates on arXiv.org

Prior work on controllable text generation has focused on learning how to
control language models through trainable decoding, smart-prompt design, or
fine-tuning based on a desired objective. We hypothesize that the information
needed to steer the model to generate a target sentence is already encoded
within the model. Accordingly, we explore a different approach altogether:
extracting latent vectors directly from pretrained language model decoders
without fine-tuning. Experiments show that there exist steering vectors, which,
when added to the hidden states …
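As a rough illustration only, and not the authors' implementation, the sketch below shows the general idea of adding a steering vector to a decoder's hidden states during generation, using a PyTorch forward hook on GPT-2. The model name, layer index, scaling factor, and the random placeholder vector are all assumptions; per the abstract, the actual vectors are extracted from the pretrained model itself rather than drawn at random.

```python
# Minimal sketch (assumptions: GPT-2, layer 6, a random placeholder vector)
# of steering generation by adding a vector to one layer's hidden states.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

hidden_size = model.config.n_embd
# Placeholder: in the paper the vector is extracted from the model, not random.
steering_vector = 0.1 * torch.randn(hidden_size)

def add_steering(module, inputs, output):
    # GPT-2 blocks return a tuple whose first element is the hidden states.
    hidden_states = output[0] + steering_vector.to(output[0].dtype)
    return (hidden_states,) + output[1:]

layer_idx = 6  # arbitrary middle layer, chosen only for illustration
hook = model.transformer.h[layer_idx].register_forward_hook(add_steering)

prompt = tokenizer("The movie was", return_tensors="pt")
with torch.no_grad():
    generated = model.generate(**prompt, max_new_tokens=20, do_sample=False)
hook.remove()

print(tokenizer.decode(generated[0], skip_special_tokens=True))
```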

Tags: arxiv, language, language models, models
