Traditional systems weren’t built to handle the parallel data demands of modern AI, creating bottlenecks and delays. Cloudian, co-founded by MIT alumni, is solving this with a purpose-built AI data storage platform that enables high-speed, scalable access between storage and GPUs — transforming how data powers AI innovation.
The AI Data Storage Challenge in the Era of Large-Scale Models
Artificial intelligence is changing the way businesses store and access their data. That’s because traditional data storage systems were designed to handle simple commands from a handful of users at once, whereas today’s AI systems, with millions of agents, need to continuously access and process large amounts of data in parallel. Traditional storage systems have also accumulated layers of complexity, which slows AI systems down because data must pass through multiple tiers before reaching the graphics processing units (GPUs) that are the brain cells of AI.
Cloudian’s Breakthrough in Scalable AI Data Storage
Cloudian, co-founded by Michael Tso ’93, SM ’93 and Hiroshi Ohta, is helping storage keep up with the AI revolution. The company has developed a scalable storage system for businesses that helps data flow seamlessly between storage and AI models. The system reduces complexity by applying parallel computing to data storage itself: AI functions and data are consolidated onto a single parallel-processing platform that stores, retrieves, and processes scalable datasets, with direct, high-speed transfers between storage and GPUs and CPUs.
Cloudian’s integrated storage-computing platform simplifies the process of building commercial-scale AI tools and gives businesses a storage foundation that can keep up with the rise of AI.
From MIT Research to Industry Transformation
“One of the things people miss about AI is that it’s all about the data,” Tso says. “You can’t get a 10 percent improvement in AI performance with 10 percent more data or even 10 times more data — you need 1,000 times more data. Being able to store that data in a way that’s easy to manage, and in such a way that you can embed computations into it so you can run operations while the data is coming in without moving the data — that’s where this industry is going.”
As an undergraduate at MIT in the 1990s, Tso was introduced to parallel computing by Professor William Dally. As a graduate student, he worked with senior research scientist David Clark on large-scale distributed systems. Today, Tso sees strong parallels between his early academic work and Cloudian’s approach to scalable, high-speed AI data storage.
How Cloudian’s Object Storage Powers AI at the Edge
Cloudian’s platform uses an object storage architecture in which all kinds of data — documents, videos, sensor data — are stored as unique objects with metadata. Object storage can manage massive datasets in a flat file structure, making it ideal for unstructured data and AI systems. Traditionally, however, object storage couldn’t send data directly to AI models; the data had to be copied into memory first.
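The flat object model described above can be sketched as a simple key-value store in which each object bundles its payload with arbitrary metadata. This is a minimal illustration of the general idea, not Cloudian’s API; all class and method names here are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class StoredObject:
    # An object pairs raw bytes with descriptive metadata.
    data: bytes
    metadata: dict = field(default_factory=dict)

class FlatObjectStore:
    """Toy flat-namespace object store: no directory tree, just unique keys."""

    def __init__(self):
        self._objects = {}  # key -> StoredObject

    def put(self, key, data, **metadata):
        self._objects[key] = StoredObject(data, metadata)

    def get(self, key):
        return self._objects[key]

    def find(self, **criteria):
        # Metadata queries let a pipeline select relevant objects
        # without walking any directory hierarchy.
        return [key for key, obj in self._objects.items()
                if all(obj.metadata.get(k) == v for k, v in criteria.items())]

store = FlatObjectStore()
store.put("frames/0001.bin", b"...", sensor="lidar", labeled=True)
store.put("frames/0002.bin", b"...", sensor="camera", labeled=False)
print(store.find(sensor="lidar"))  # ['frames/0001.bin']
```

Because the namespace is flat, scaling to billions of objects is a matter of sharding one key space rather than deepening a directory tree, which is one reason object storage suits unstructured AI datasets.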
In July, Cloudian announced it had extended its object storage system with a vector database, which stores data in a form immediately usable by AI models, reducing latency and energy bottlenecks. Cloudian also partnered with NVIDIA to make its platform directly compatible with NVIDIA’s GPUs, enabling real-time, high-speed AI processing.
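The core operation of a vector database is similarity search over embedding vectors: data is stored in the numeric form models consume, so a query needs no copy or conversion step. A minimal brute-force sketch using cosine similarity (illustrative only; the article does not describe Cloudian’s internal implementation):

```python
import math

def cosine(a, b):
    # Cosine similarity: dot product over the product of vector norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

class VectorIndex:
    """Toy vector index: embeddings are kept model-ready, so retrieval
    is a similarity ranking rather than a load-and-convert step."""

    def __init__(self):
        self._entries = []  # (key, embedding vector)

    def add(self, key, vector):
        self._entries.append((key, vector))

    def nearest(self, query, k=1):
        ranked = sorted(self._entries,
                        key=lambda entry: cosine(query, entry[1]),
                        reverse=True)
        return [key for key, _ in ranked[:k]]

idx = VectorIndex()
idx.add("doc-a", [1.0, 0.0, 0.0])
idx.add("doc-b", [0.0, 1.0, 0.0])
idx.add("doc-c", [0.9, 0.1, 0.0])
print(idx.nearest([1.0, 0.05, 0.0], k=2))  # ['doc-a', 'doc-c']
```

Production vector databases replace this linear scan with approximate nearest-neighbor indexes, but the storage principle is the same: keep embeddings resident and queryable where the data lives.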
Enabling AI-First Architectures Across Industries
Cloudian is already helping around 1,000 companies across sectors including automotive, healthcare, government, and finance. Its platform supports large-scale AI use cases such as predictive maintenance in manufacturing, DNA sequence analysis in cancer research, and secure medical data archiving.
Cloudian’s AI data storage solutions are future-proofed for next-generation AI applications. As GPUs continue to outpace Moore’s Law through parallel processing, Cloudian’s architecture keeps these high-performance processors fully utilized by removing traditional data bottlenecks.
Future-Proofing AI Data Storage with Parallel Processing
“The only way to make GPUs work hard is to feed them data at the same speed that they compute,” Tso says. “And the only way to do that is to get rid of all the layers between them and your data.”
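Tso’s point is the classic producer-consumer balance: if storage delivers batches at least as fast as the processor consumes them, the compute side never stalls. A hypothetical sketch with a bounded queue standing in for the storage-to-GPU path (the names and batch counts are illustrative):

```python
import queue
import threading

BATCHES = 8
buffer = queue.Queue(maxsize=2)  # small buffer: the producer must keep pace

def storage_reader():
    # Producer: streams batches from storage with no intermediate staging copies.
    for i in range(BATCHES):
        buffer.put(f"batch-{i}")
    buffer.put(None)  # sentinel marking end of data

def gpu_worker(processed):
    # Consumer: stand-in for GPU compute; it stalls whenever the queue is empty,
    # which is exactly the idle time a fast storage path is meant to eliminate.
    while True:
        batch = buffer.get()
        if batch is None:
            break
        processed.append(batch)

processed = []
reader = threading.Thread(target=storage_reader)
worker = threading.Thread(target=gpu_worker, args=(processed,))
reader.start(); worker.start()
reader.join(); worker.join()
print(len(processed))  # 8
```

Every layer between storage and the consumer adds latency on the producer side of this loop, which is why collapsing those layers keeps the GPU busy.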
By embedding AI functions within storage and enabling parallel compute operations during data ingestion, Cloudian is building an ecosystem where AI data storage doesn’t just keep up with innovation — it powers it.