WebLLM supports Llama 2 70B now
Simon Willison's Weblog simonwillison.net
The WebLLM project from MLC uses WebGPU to run large language models entirely in the browser. They recently added support for Llama 2, including Llama 2 70B, the largest and most powerful model in that family.
To my astonishment, this worked! I used an M2 Mac with 64GB of RAM and Chrome Canary and it downloaded many GBs of data... but it worked, and spat out tokens at a slow but respectable rate of …
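
Since WebLLM depends on WebGPU, a page can check whether the browser exposes it before committing to a multi-gigabyte download. Here's a minimal sketch of that check. It is not WebLLM's own code: `hasWebGPU`, `GpuLike` and `NavigatorLike` are names I made up for illustration, while `navigator.gpu.requestAdapter()` is the standard WebGPU entry point.

```typescript
// Minimal structural types (hypothetical names) so the sketch compiles
// without pulling in the @webgpu/types package.
interface GpuLike {
  requestAdapter(): Promise<object | null>;
}

interface NavigatorLike {
  gpu?: GpuLike;
}

// Resolves to true only if WebGPU is exposed AND an adapter can be obtained.
async function hasWebGPU(nav: NavigatorLike): Promise<boolean> {
  if (!nav.gpu) {
    return false; // this browser does not expose WebGPU at all
  }
  const adapter = await nav.gpu.requestAdapter();
  return adapter !== null; // null means no suitable GPU adapter was found
}
```

In a browser this would be called as `hasWebGPU(navigator as NavigatorLike)`. Taking the navigator as a parameter keeps the function easy to exercise outside a browser.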