06 October 2025

Mark Klarzynski, Co-Founder & Chief Strategy Officer, PEAK:AIO
For decades, data architecture has been built around a simple assumption: storage is where data lives, and compute is where the work gets done. But as AI scales, this traditional separation is becoming a major obstacle. Networks are overloaded, power consumption is surging, and latency often leads to missed opportunities.
To move forward, we need more than incremental improvements. A smarter caching layer or better tiering may buy time, but they are not sustainable long-term solutions. What’s required is a fundamental shift in how we think about data and computation: not just where data is stored, but where computation actually happens.
Let’s pause on that. “Stop bringing the data to the job. Start bringing the job to the data.”
Imagine a common enterprise scenario: a simple SQL search across a multi-petabyte dataset. Traditionally, the storage system sends the data across the network to a compute server, which caches it and then processes the query. Today’s datasets are too large for that model, and the costs in bandwidth, time, and energy are too high.
Because businesses need to make decisions in real time, any friction in the data pipeline translates directly into delay, cost, and risk. What if we flipped the model, so the storage system didn’t just serve data, but actually processed it?
This isn’t a novelty, but a strategic rethinking. By enabling computation directly within the storage platform, organisations can significantly reduce latency, network usage, and infrastructure costs, all while meeting the demands of AI at scale.
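As a rough illustration of the difference, here is a minimal sketch in Python. The StorageNode class and its methods are hypothetical stand-ins rather than a real product API; they simply contrast a client that pulls every row across the network with a query pushed down to where the data lives.

```python
from dataclasses import dataclass

@dataclass
class Record:
    order_id: int
    region: str
    total: float

class StorageNode:
    """Stand-in for a storage system holding a large table."""

    def __init__(self, records):
        self._records = records

    # Traditional path: every record crosses the "network" to the client.
    def stream_all(self):
        for rec in self._records:
            yield rec  # each yield models bytes sent over the wire

    # Pushdown path: the predicate runs where the data lives,
    # so only matching rows leave the storage node.
    def query(self, predicate):
        return [rec for rec in self._records if predicate(rec)]

table = [Record(i, "emea" if i % 3 else "apac", float(i)) for i in range(1_000_000)]
node = StorageNode(table)

# Compute-side filtering: the client pulls the whole table, then filters locally.
pulled = [r for r in node.stream_all() if r.region == "apac" and r.total > 900_000]

# Storage-side filtering: only the result set is transferred.
pushed = node.query(lambda r: r.region == "apac" and r.total > 900_000)

assert pulled == pushed
print(f"rows transferred: {len(table):,} vs {len(pushed):,}")
```

The result set is identical either way; what changes is that the data crossing the network shrinks from the full table to only the matching rows.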
Compute at the edge of storage
This concept has implications far beyond SQL. Consider AI inference pipelines, real-time analytics, or machine learning workloads where response time matters. Many of these tasks are highly parallel and read-intensive, making them ideal candidates to run closer to the data.
Architectures that can place containerised inference models directly on the storage node allow operations such as vector similarity search, metadata filtering, or preprocessing to happen where the data resides, before a single byte is transmitted over the network. This shift not only improves performance but also cuts costs by reducing data-movement charges, network congestion, and energy usage.
When implemented properly, this doesn’t just reduce latency; it reshapes what’s possible.
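To make that concrete, here is a hypothetical sketch of a top-k vector similarity search running next to the data. The array sizes and the storage_side_topk function are illustrative assumptions; the point is that only the top-k IDs and scores leave the node, never the raw embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Embeddings held on the storage node's local shard (sizes are illustrative).
embeddings = rng.standard_normal((100_000, 768)).astype(np.float32)
embeddings /= np.linalg.norm(embeddings, axis=1, keepdims=True)

def storage_side_topk(query: np.ndarray, k: int = 10):
    """Runs next to the data; returns only ids and scores, not vectors."""
    q = query / np.linalg.norm(query)
    scores = embeddings @ q                    # cosine similarity over the local shard
    top = np.argpartition(scores, -k)[-k:]     # unordered top-k candidates
    top = top[np.argsort(scores[top])[::-1]]   # order candidates by score
    return [(int(i), float(scores[i])) for i in top]

query_vec = rng.standard_normal(768).astype(np.float32)
print(storage_side_topk(query_vec)[:3])
```

For the shard size above, the response is a few kilobytes instead of roughly 300 MB of raw vectors, which is where the bandwidth and energy savings come from.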
Intelligence isn’t just in the workload. It’s in the movement.
While this rethink of job placement is transformative, it’s only half of the equation. For this model to reach its full potential, data must already be where it needs to be, when it’s needed.
Modern architectures are now evolving to enable multi-tiered, intelligent data management that goes far beyond traditional tiering. This means seamlessly integrating diverse storage types, such as high-performance NVMe, QLC flash, archive-class media and even cloud storage, into a unified, transparent system.
However, it’s not just about unifying layers. Instead of relying solely on static policies or predefined rules, advanced systems now use AI-driven engines to monitor real-time usage, access frequency, and workload behaviour. Data is then automatically moved or replicated based on performance needs, relevance, or projected demand, without introducing latency or disrupting users.
As a result, the right data is placed in the right location at the right time, often before it’s even requested.
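As a simplified, hypothetical stand-in for such an engine, the sketch below scores objects by decayed access frequency and recommends a tier. A production system would feed far richer telemetry into a learned model; this only shows the control loop of observe, score, and place. All names and thresholds are assumptions for illustration.

```python
import time
from collections import defaultdict

TIERS = ["nvme", "qlc", "archive"]  # fastest to coldest

class PlacementEngine:
    def __init__(self, half_life_s: float = 3600.0):
        self.half_life_s = half_life_s
        self.score = defaultdict(float)   # exponentially decayed access count
        self.last_seen = {}

    def record_access(self, object_id: str, now: float | None = None):
        if now is None:
            now = time.time()
        prev = self.last_seen.get(object_id, now)
        decay = 0.5 ** ((now - prev) / self.half_life_s)
        self.score[object_id] = self.score[object_id] * decay + 1.0
        self.last_seen[object_id] = now

    def recommend_tier(self, object_id: str) -> str:
        s = self.score.get(object_id, 0.0)
        if s >= 10:      # hot: accessed many times within recent half-lives
            return TIERS[0]
        if s >= 1:       # warm: touched recently, but not heavily
            return TIERS[1]
        return TIERS[2]  # cold: little or no recent access

engine = PlacementEngine()
for _ in range(25):
    engine.record_access("dataset/embeddings.bin")   # hot object
engine.record_access("logs/2024-01-01.parquet")      # touched once

print(engine.recommend_tier("dataset/embeddings.bin"))    # -> nvme
print(engine.recommend_tier("logs/2024-01-01.parquet"))   # -> qlc
print(engine.recommend_tier("archive/old-model.ckpt"))    # -> archive
```

The exponential decay means a burst of recent reads outweighs a long history of occasional access, which keeps hot working sets on fast media and lets cold data drift toward archive tiers.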
From edge to exascale
This intelligent, active approach to data isn’t limited to large-scale cloud environments. In fact, it’s often most valuable in smaller, constrained settings, where power, space, and bandwidth are limited and latency can’t be masked by overprovisioning.
Whether at the edge, in healthcare deployments, or within large-scale research institutions, the principles remain the same: storage must be efficient, autonomous, and aware of the data it serves.
By combining highly efficient storage infrastructure with intelligent, workload-aware data management, organisations can extend the benefits of AI-driven architectures from edge devices to exascale systems.
Time to rethink storage’s role in AI
As AI workloads grow, the underlying infrastructure philosophy must evolve too. Simply adding more GPUs or scaling flash storage is no longer a sustainable strategy, either financially or environmentally.
The real opportunity lies in a systemic rethink: building infrastructure that is not just faster, but smarter, making storage an active part of the AI pipeline, rather than a bottleneck to work around.
By combining intelligent data movement, in-place job execution, and dynamic tiering into one cohesive system, infrastructure can become more autonomous, efficient, and responsive to the growing demands of workloads.
This isn’t just about keeping up; it’s about reimagining what infrastructure can be with AI.