Feb. 14, 2024, 1:43 p.m. | /u/ivan_kudryavtsev

Computer Vision www.reddit.com

We created a benchmark comparing three approaches to serving the YOLOv8m model (640x640 input, batch size 1):

* PyTorch CUDA + OpenCV;
* PyTorch CUDA + Torchaudio (hardware decoding with NVDEC);
* Savant (TensorRT, hardware decoding with NVDEC).

Savant delivered about three times the performance of naive PyTorch CUDA + OpenCV and more than twice that of PyTorch CUDA + Torchaudio. The numbers were measured on a GeForce RTX 2080 with an Intel Core i5-8600K CPU @ 3.60GHz and 32 GB RAM.

|**Benchmark**|**FPS**|**Improvement**|
|:-|:-|:-|
|PyTorch CUDA + OpenCV|75|0.294| …
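
For readers who want to reproduce this kind of comparison, here is a minimal sketch of how an FPS figure and a relative-throughput ratio could be computed. It is not the harness used for the benchmark: `measure_fps`, `relative_throughput`, and the `process_frame` callable are hypothetical names, and reading the Improvement column as "FPS relative to a reference pipeline" is an assumption, since the table above is truncated.

```python
import time
from typing import Callable, Iterable


def measure_fps(process_frame: Callable[[object], object],
                frames: Iterable[object],
                warmup: int = 0) -> float:
    """Return frames per second achieved by process_frame over frames.

    Hypothetical helper: process_frame stands in for a whole
    decode + preprocess + inference step of one pipeline.
    """
    frames = list(frames)
    # Warm-up iterations are run but excluded from the timed section.
    for frame in frames[:warmup]:
        process_frame(frame)
    timed = frames[warmup:]
    if not timed:
        raise ValueError("no frames left to time after warm-up")
    start = time.perf_counter()
    for frame in timed:
        process_frame(frame)
    # Note: CUDA kernels run asynchronously, so a GPU pipeline should
    # synchronize the device inside process_frame (or here, before the
    # timer is read) for the elapsed time to be meaningful.
    elapsed = time.perf_counter() - start
    return len(timed) / elapsed


def relative_throughput(fps: float, reference_fps: float) -> float:
    """Ratio of one pipeline's FPS to a reference pipeline's FPS."""
    return fps / reference_fps
```

Timing the full loop rather than individual frames keeps decoding cost inside the measurement, which matters here because the three approaches differ mainly in how frames are decoded and handed to the model.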

