In this work, we propose several attention formulations for multivariate
sequence data. We build on the recently introduced 2D-Attention and
reformulate how attention is learned: instead of learning the relevance of the
feature and temporal dimensions directly, we quantify it through latent spaces
based on self-attention. In addition, we propose a joint
feature-temporal attention mechanism that learns a single 2D attention mask
over features and time steps, highlighting relevant information without treating
the feature and temporal representations independently. The proposed approaches can be used in various
architectures …
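
The difference between independent and joint attention can be illustrated with a minimal NumPy sketch. It is not the formulation proposed here: the function names, the random placeholder scores, and the choice of an outer product to combine the two 1D attention vectors are all assumptions made for illustration. The sketch only shows that a mask built from separate temporal and feature weights is constrained to rank 1, whereas a mask normalized jointly over all (time step, feature) pairs is not.

```python
import numpy as np


def softmax(z, axis=None):
    """Numerically stable softmax over the given axis (all entries if None)."""
    z = z - np.max(z, axis=axis, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=axis, keepdims=True)


def separable_mask(temporal_scores, feature_scores):
    """Hypothetical independent attention: one weight per time step and one
    per feature, combined by an outer product (an assumption of this sketch)."""
    a_t = softmax(temporal_scores)  # shape (T,)
    a_f = softmax(feature_scores)   # shape (F,)
    return np.outer(a_t, a_f)       # shape (T, F), always rank 1


def joint_mask(scores_2d):
    """Hypothetical joint attention: a single softmax over all T*F entries,
    so every (time step, feature) pair gets its own weight."""
    flat = softmax(scores_2d.reshape(-1))
    return flat.reshape(scores_2d.shape)


rng = np.random.default_rng(0)
T, F = 6, 4                         # time steps, features (illustrative sizes)
x = rng.normal(size=(T, F))         # one multivariate sequence

# Placeholder scores; in a real model these would be produced by learned layers.
sep = separable_mask(rng.normal(size=T), rng.normal(size=F))
joint = joint_mask(rng.normal(size=(T, F)))

# Both masks are applied element-wise to the input.
attended_sep = sep * x
attended_joint = joint * x
```

Both masks sum to one over the whole T x F grid, but only the joint mask can emphasize, say, one feature at an early time step and a different feature at a later one.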

