March 22, 2024, 4:43 a.m. | Romeo Kienzler, Leonardo Pondian Tizzei, Benedikt Blumenstiel, Zoltan Arnold Nagy, S. Karthik Mukkavilli, Johannes Schmude, Marcus Freitag, Michael Be

cs.LG updates on arXiv.org arxiv.org

arXiv:2309.02094v3 Announce Type: replace
Abstract: Storing and streaming high dimensional data for foundation model training became a critical requirement with the rise of foundation models beyond natural language. In this paper we introduce TensorBank, a petabyte scale tensor lakehouse capable of streaming tensors from Cloud Object Store (COS) to GPU memory at wire speed based on complex relational queries. We use Hierarchical Statistical Indices (HSI) for query acceleration. Our architecture allows to directly address tensors on block level using HTTP …

abstract arxiv beyond cloud cs.ai cs.db cs.ir cs.lg data foundation foundation model gpu lakehouse language memory natural natural language object object store paper scale store streaming tensor training type

Artificial Intelligence – Bioinformatic Expert

@ University of Texas Medical Branch | Galveston, TX

Lead Developer (AI)

@ Cere Network | San Francisco, US

Research Engineer

@ Allora Labs | Remote

Ecosystem Manager

@ Allora Labs | Remote

Founding AI Engineer, Agents

@ Occam AI | New York

AI Engineer Intern, Agents

@ Occam AI | US