Web: https://www.reddit.com/r/MachineLearning/comments/xhxpsl/d_using_special_tokens_for_a_domainspecific/

Sept. 19, 2022, 1:10 a.m. | /u/McAvagr

Machine Learning reddit.com

Hi everyone

I've recently dived into ViTs, and a thought crossed my mind that I was surprised not to find many papers exploring. Special tokens are pretty common in transformer architectures, but they usually play a background role: either structural (like [BEG], [END], [SEP]) or as a placeholder of sorts ([CLS], [MASK]). But I feel like self-attention allows for far more intricate constructs, and theoretically one could create a whole "mini-language" of special tokens to influence the model's behaviour.
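To make the idea a bit more concrete, here is a minimal sketch of what such a token "mini-language" could look like at the sequence level. It is only an illustration: the control tokens (`[TASK=...]`, `[DOMAIN=...]`), the `build_vocab` and `encode` helpers, and the `patch_N` names are all made up for this example and do not come from any particular paper or library. In a real ViT the patch positions would hold patch embeddings rather than vocabulary ids; the ids here only show the sequence layout.

```python
# Hypothetical sketch: the usual structural/placeholder tokens plus
# invented "control" tokens that sit in the same input sequence.
STRUCTURAL = ["[CLS]", "[SEP]", "[MASK]"]
CONTROL = ["[TASK=CLASSIFY]", "[TASK=DETECT]", "[DOMAIN=MEDICAL]"]  # invented for illustration


def build_vocab(words):
    """Assign ids: special tokens first, then ordinary tokens in sorted order."""
    vocab = {}
    for tok in STRUCTURAL + CONTROL + sorted(set(words)):
        vocab.setdefault(tok, len(vocab))
    return vocab


def encode(tokens, vocab, controls=()):
    """Lay out [CLS] + control tokens + content tokens + [SEP] as a list of ids."""
    for c in controls:
        if c not in CONTROL:
            raise ValueError(f"unknown control token: {c}")
    sequence = ["[CLS]", *controls, *tokens, "[SEP]"]
    return [vocab[t] for t in sequence]


patches = ["patch_0", "patch_1", "patch_2"]  # stand-ins for image patches
vocab = build_vocab(patches)
ids = encode(patches, vocab, controls=["[DOMAIN=MEDICAL]", "[TASK=DETECT]"])
print(ids)
```

Because the control tokens occupy ordinary positions in the sequence, every other position can attend to them, which is what would let them condition the rest of the computation once their embeddings are trained.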

Is there a particular …


