June 30, 2022, 1:11 a.m. | Pu Wang, Hugo Van hamme

cs.CL updates on arXiv.org

End-to-end spoken language understanding (SLU) systems benefit from
pretraining on large corpora, followed by fine-tuning on application-specific
data. The resulting models are too large for on-edge applications. For
instance, BERT-based systems contain over 110M parameters. Observing that the
model is overparameterized, we propose a lean transformer structure in which
the dimension of the attention mechanism is automatically reduced using group
sparsity. We also propose a variant in which the learned attention subspace is
transferred to an attention bottleneck layer. In a low-resource setting and without …
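As a rough illustration of the group-sparsity idea described above, the sketch below (in PyTorch, not taken from the paper) applies a group-lasso penalty over the inner dimensions of a self-attention block, so whole attention dimensions can be driven toward zero and pruned afterwards. All class, parameter, and hyperparameter names here are illustrative assumptions.

import torch
import torch.nn as nn


class GroupSparseSelfAttention(nn.Module):
    """Single-head self-attention whose inner dimension can shrink via group sparsity.

    Minimal sketch only; the paper's actual architecture and training recipe may differ.
    """

    def __init__(self, d_model: int, d_attn: int):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_attn, bias=False)
        self.k_proj = nn.Linear(d_model, d_attn, bias=False)
        self.v_proj = nn.Linear(d_model, d_attn, bias=False)
        self.out_proj = nn.Linear(d_attn, d_model, bias=False)
        self.d_attn = d_attn

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, d_model)
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
        scores = q @ k.transpose(-2, -1) / self.d_attn ** 0.5
        attn = scores.softmax(dim=-1)
        return self.out_proj(attn @ v)

    def group_sparsity_penalty(self) -> torch.Tensor:
        # One "group" per attention dimension: the corresponding row of each
        # projection weight (shape: d_attn x d_model). Summing the L2 norms of
        # the groups (group lasso) pushes entire dimensions toward zero.
        penalty = torch.zeros((), device=self.q_proj.weight.device)
        for w in (self.q_proj.weight, self.k_proj.weight, self.v_proj.weight):
            penalty = penalty + w.norm(dim=1).sum()
        return penalty


# Illustrative training step: the penalty is added to the task loss with a
# small weight, so unused attention dimensions shrink and can later be removed.
model = GroupSparseSelfAttention(d_model=256, d_attn=128)
x = torch.randn(2, 50, 256)
loss = model(x).pow(2).mean() + 1e-3 * model.group_sparsity_penalty()
loss.backward()

After training, dimensions whose group norm falls below a threshold can be dropped from all four projections, yielding the smaller attention subspace that the abstract refers to.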

Tags: arxiv, language, spoken language understanding, transformers, understanding
