Web: http://arxiv.org/abs/2205.05124

May 12, 2022, 1:11 a.m. | Nishant Subramani, Nivedita Suresh, Matthew E. Peters

cs.LG updates on arXiv.org

Prior work on controllable text generation has focused on learning how to
control language models through trainable decoding, smart-prompt design, or
fine-tuning based on a desired objective. We hypothesize that the information
needed to steer the model to generate a target sentence is already encoded
within the model. Accordingly, we explore a different approach altogether:
extracting latent vectors directly from pretrained language model decoders
without fine-tuning. Experiments show that there exist steering vectors, which,
when added to the hidden states …
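
As a rough illustration of what "added to the hidden states" could mean, here is a minimal sketch in plain Python. The abstract is cut off before it says which layers or positions are involved or how the vectors are extracted, so this sketch simply assumes one fixed vector added at every token position of a single layer. The function name `add_steering_vector`, the variable `z_steer`, and all numbers are hypothetical and are not taken from the paper.

```python
from typing import List

Vector = List[float]


def add_steering_vector(hidden_states: List[Vector], z_steer: Vector) -> List[Vector]:
    """Return a copy of hidden_states with z_steer added at every position.

    Assumption for illustration only: one steering vector is shared across
    all token positions of a single layer.
    """
    steered = []
    for h in hidden_states:
        if len(h) != len(z_steer):
            raise ValueError("steering vector must match the hidden size")
        # Element-wise addition; the original hidden state is left unmodified.
        steered.append([h_i + z_i for h_i, z_i in zip(h, z_steer)])
    return steered


# Toy example: 3 token positions, hidden size 4 (made-up numbers,
# not a vector extracted from any real model).
hidden = [
    [0.0, 1.0, 2.0, 3.0],
    [1.0, 1.0, 1.0, 1.0],
    [-1.0, 0.5, 0.0, 2.0],
]
z = [0.5, -0.5, 0.0, 1.0]
steered = add_steering_vector(hidden, z)
```

In a real decoder the hidden states would be tensors and the addition would happen inside the forward pass; the list-based version above only shows the arithmetic.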


