all AI news
OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer. (arXiv:2401.16658v1 [cs.CL])
cs.CL updates on arXiv.org arxiv.org
Recent studies have advocated for fully open foundation models to promote
transparency and open science. As an initial step, the Open Whisper-style
Speech Model (OWSM) reproduced OpenAI's Whisper using publicly available data
and open-source toolkits. With the aim of reproducing Whisper, the previous
OWSM v1 through v3 models were still based on Transformer, which might lead to
inferior performance compared to other state-of-the-art speech encoders. In
this work, we aim to improve the performance and efficiency of OWSM without
extra …
aim arxiv cs.cl data faster foundation openai open science promote science speech studies style through transparency whisper