June 19, 2024, 1:01 a.m. | /u/ai-lover

machinelearningnews www.reddit.com

Researchers from DeepSeek-AI introduced DeepSeek-Coder-V2, a new open-source code language model. Built on the foundation of DeepSeek-V2, the model undergoes further pre-training on an additional 6 trillion tokens, enhancing its code and mathematical reasoning capabilities. DeepSeek-Coder-V2 aims to close the performance gap with closed-source models, offering an open-source alternative that delivers competitive results across a range of benchmarks.

DeepSeek-Coder-V2 employs a Mixture-of-Experts (MoE) framework, expands language support to 338 programming languages, and extends the context length from 16K to 128K tokens. The …
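For readers who want to try the released weights, here is a minimal sketch of loading and prompting the model with Hugging Face transformers. The repository id "deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct", the use of trust_remote_code, and the generation settings are assumptions about how the open weights are published, not details taken from the post.

```python
# Minimal sketch (assumed setup, not from the post): load DeepSeek-Coder-V2 open
# weights via Hugging Face transformers and generate a code completion.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # MoE weights are large; half precision saves memory
    device_map="auto",            # spread layers across available GPUs
    trust_remote_code=True,
)

# Prompt the model with a small coding task.
prompt = "Write a Python function that returns the n-th Fibonacci number."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```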

