Towards Generating Informative Textual Description for Neurons in Language Models. (arXiv:2401.16731v1 [cs.CL])
cs.CL updates on arXiv.org
Recent developments in transformer-based language models have allowed them to
capture a wide variety of world knowledge that can be adapted to downstream
tasks with limited resources. However, it remains unclear what pieces of
information these models hold, and the contribution of individual neurons to
encoding that information is largely unknown. Conventional approaches to neuron
explainability either depend on a finite set of pre-defined descriptors or
require manual annotations to train a secondary model that can then explain
the neurons of the primary …
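The "pre-defined descriptor" baseline the abstract refers to can be sketched roughly as follows: collect the tokens that most strongly activate a neuron, then label the neuron with whichever descriptor's vocabulary best overlaps that set. This is an illustrative toy, not the paper's method; the function name, descriptor sets, and activation values below are all invented for the example.

```python
# Hypothetical sketch of descriptor-based neuron labeling: a neuron is
# described by the pre-defined concept whose token set best matches the
# neuron's top-activating tokens. All data here is illustrative.

def label_neuron(activations, descriptors, top_k=3):
    """activations: {token: activation strength} for one neuron.
    descriptors: {descriptor name: set of tokens}.
    Returns the best-matching descriptor name."""
    # Tokens that most strongly activate the neuron.
    top = sorted(activations, key=activations.get, reverse=True)[:top_k]
    # Score each descriptor by its overlap with the top-activating tokens.
    def overlap(tokens):
        return len(set(top) & tokens) / len(top)
    return max(descriptors, key=lambda name: overlap(descriptors[name]))

activations = {"paris": 4.2, "london": 3.9, "tokyo": 3.5, "run": 0.1, "blue": 0.2}
descriptors = {
    "city_names": {"paris", "london", "tokyo", "berlin"},
    "colors": {"blue", "red", "green"},
}
print(label_neuron(activations, descriptors))  # "city_names"
```

The limitation the abstract points at is visible even in this toy: the neuron can only ever be described by one of the fixed descriptor names, so any concept outside that finite set is invisible to the method.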