March 16, 2024, 4:25 p.m. | /u/-x-Knight


Hi guys, I have made some modifications to the Llama2 repository so that it runs on TPU v3-8 hardware and can perform Llama2 7B (and even 13B) chat-completion inference without graph recompilation. It is still slower than an Nvidia P100 when generating text at batch size 1, so it is not suitable for real-time inference, but (TPU being TPU) it shines with batched text generation. I used it to generate a large amount of text for research purposes. Hope it benefits the community. A minimal sketch of the no-recompilation idea is below.
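The reason graph recompilation matters here is that XLA compiles one graph per distinct tensor shape, so any change in batch size or sequence length triggers a fresh (slow) compile. Below is a minimal sketch of the usual fix, keeping every shape constant by padding prompts to a fixed length, written with PyTorch/XLA. This is not the repository's actual code: the pad id, sizes, and toy embedding model are placeholders I made up, and a real port would also need a fixed-size KV cache so the decode loop keeps static shapes too.

```python
# Sketch only: fixed shapes -> XLA compiles the graph once and reuses it.
import torch
import torch_xla.core.xla_model as xm

PAD_ID = 0          # hypothetical pad token id
MAX_SEQ_LEN = 512   # every prompt is padded to this fixed length
BATCH_SIZE = 8      # fixed batch size

device = xm.xla_device()

def pad_batch(token_lists):
    """Right-pad every prompt so the input tensor shape never changes."""
    batch = torch.full((BATCH_SIZE, MAX_SEQ_LEN), PAD_ID, dtype=torch.long)
    for i, toks in enumerate(token_lists[:BATCH_SIZE]):
        n = min(len(toks), MAX_SEQ_LEN)
        batch[i, :n] = torch.tensor(toks[:n])
    return batch.to(device)

# Toy stand-in for the transformer; the real code would call the Llama2 model.
model = torch.nn.Embedding(32000, 64).to(device)

for prompts in ([[1, 2, 3]] * BATCH_SIZE, [[4, 5]] * BATCH_SIZE):
    ids = pad_batch(prompts)   # same (8, 512) shape every iteration
    out = model(ids)           # traced and compiled on the first call only
    xm.mark_step()             # cut the lazy graph and execute it on the TPU
```

The padding wastes some compute on short prompts, but amortized over large batches the one-time compile plus the TPU's throughput is what makes bulk text generation the win the post describes.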

Here's the …
