Distributed Out-of-Memory NMF on CPU/GPU Architectures. (arXiv:2202.09518v3 [cs.DC] UPDATED)
cs.LG updates on arXiv.org
We propose an efficient distributed out-of-memory implementation of the
Non-negative Matrix Factorization (NMF) algorithm for heterogeneous
high-performance-computing (HPC) systems. The proposed implementation is based
on prior work on NMFk, which can perform automatic model selection and extract
latent variables and patterns from data. In this work, we extend NMFk by adding
support for dense and sparse matrix operations on multi-node, multi-GPU systems.
The resulting algorithm is optimized for out-of-memory (OOM) problems where the
memory required to factorize a given matrix …
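
To make the out-of-memory idea concrete, here is a minimal single-process sketch in Python with NumPy. It is not the implementation described in the abstract: it assumes the classic multiplicative-update rule for the Frobenius objective and a simple row-block partition of the input, and the function name nmf_row_batched and its parameters are hypothetical. The point it illustrates is that both factor updates can be computed by visiting one block of rows of the matrix at a time, so the full matrix never has to be resident at once.

```python
import numpy as np


def nmf_row_batched(A, k, batch_rows, n_iter=200, eps=1e-9, seed=0):
    """Approximate A (m x n, non-negative) as W @ H with W (m x k), H (k x n).

    Illustrative sketch only: multiplicative updates, with A touched one
    row block at a time. In a real out-of-memory setting each block would
    be loaded from disk or host memory instead of sliced from an array.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps
    starts = range(0, m, batch_rows)

    for _ in range(n_iter):
        # W update: each row block of W depends only on the matching
        # row block of A and on the small k x k matrix H @ H.T.
        HHt = H @ H.T
        for s in starts:
            A_b = A[s:s + batch_rows]
            W_b = W[s:s + batch_rows]  # view, so the update is in place
            W_b *= (A_b @ H.T) / (W_b @ HHt + eps)

        # H update: W.T @ A and W.T @ W are sums over row blocks,
        # so they can be accumulated block by block.
        WtA = np.zeros((k, n))
        WtW = np.zeros((k, k))
        for s in starts:
            A_b = A[s:s + batch_rows]
            W_b = W[s:s + batch_rows]
            WtA += W_b.T @ A_b
            WtW += W_b.T @ W_b
        H *= WtA / (WtW @ H + eps)

    return W, H


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.random((12, 2)) @ rng.random((2, 8))  # non-negative, rank 2
    W, H = nmf_row_batched(A, k=2, batch_rows=3)
    print("relative error:", np.linalg.norm(A - W @ H) / np.linalg.norm(A))
```

In a distributed variant of this sketch, the per-block accumulations of W.T @ A and W.T @ W would presumably become reductions across nodes or GPUs; that mapping is an assumption here, not something the truncated abstract states.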