COST-EFF: Collaborative Optimization of Spatial and Temporal Efficiency with Slenderized Multi-exit Language Models. (arXiv:2210.15523v1 [cs.CL])
cs.CL updates on arXiv.org
Transformer-based pre-trained language models (PLMs) mostly suffer from
excessive overhead despite their advanced capacity. For resource-constrained
devices, there is an urgent need for a spatially and temporally efficient model
which retains the major capacity of PLMs. However, existing statically
compressed models are unaware of the diverse complexities across input
instances, potentially resulting in redundancy for simple inputs and
inadequacy for complex ones. Also, miniature models with early exiting face a
trade-off between exiting with a prediction at a shallow layer and serving the
deeper layers. …
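The early-exit trade-off the abstract describes can be illustrated with a minimal sketch of confidence-based early exiting: each layer gets its own classifier head, and inference stops at the first head whose prediction confidence clears a threshold. This is a generic toy illustration of the technique, not COST-EFF's actual architecture; all weights, dimensions, and the threshold below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

class MultiExitModel:
    """Toy multi-exit model: every layer has its own exit head, and
    inference stops at the first exit whose confidence clears a threshold."""

    def __init__(self, num_layers=6, hidden=8, num_classes=3):
        # Random stand-ins for trained weights (hypothetical values).
        self.layers = [rng.normal(size=(hidden, hidden)) * 0.1
                       for _ in range(num_layers)]
        self.heads = [rng.normal(size=(hidden, num_classes)) * 0.1
                      for _ in range(num_layers)]

    def predict(self, x, threshold=0.5):
        probs = None
        for depth, (W, head) in enumerate(zip(self.layers, self.heads), start=1):
            x = np.tanh(x @ W)            # one transformer-like block (stand-in)
            probs = softmax(x @ head)     # this layer's early-exit classifier
            if probs.max() >= threshold:  # confident enough: exit here
                return int(probs.argmax()), depth
        # Not confident at any intermediate exit: use the final layer.
        return int(probs.argmax()), depth

model = MultiExitModel()
label, depth = model.predict(rng.normal(size=8), threshold=0.4)
```

A simple input may clear the threshold at a shallow exit (low latency), while a hard input falls through to the last layer; lowering the threshold trades accuracy for speed, which is exactly the tension the abstract points to.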