Building the trusted backbone for the AI era

09 December 2025

AI workloads will be a tough test of enterprise infrastructure performance and security rigour. The proliferation of AI, from large language models (LLMs) to specialised deep learning applications, is putting unprecedented strain on enterprise infrastructure. This is not a future challenge but a present one. AI workloads are not merely heavier versions of traditional applications; they are fundamentally different, requiring an immediate redesign of network capacity, security segmentation and operational visibility.

The network backbone: requirements of AI traffic

AI workloads are extremely compute- and data-intensive, placing uncompromising demands on network performance. Any shortfall leads to GPU starvation, driving up training costs and delaying outcomes. Conventional data centre networks, for example, often built on 25 Gigabit Ethernet (GbE), quickly become a bottleneck for serious AI training.

AI workloads differ fundamentally from traditional enterprise applications in several ways. First, training large models involves moving terabytes or petabytes of data between data lakes, compute clusters and GPUs. Second, inference workloads are elastic, creating sudden traffic surges as applications scale to meet real-time demand. Third, AI workloads are distributed by nature and increasingly span on-prem, edge and multiple clouds. As a result, traditional networks optimised for north-south traffic and predictable client-server flows struggle under the east-west, multi-directional patterns introduced by AI. Leaders must prioritise a clear migration path to high-speed, low-latency fabrics, with quality of service (QoS) rules that protect the lower-volume, user-facing inference traffic from the crushing demands of training cycles.
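To make this concrete, the sketch below shows one way inference traffic can be marked at the host so that switch and router QoS policies can prioritise it over bulk training transfers. It is a minimal illustration: the DSCP values, the class assignments and the model-serving endpoint name are assumptions, not a prescribed design.

    import socket

    # Illustrative DSCP plan (an assumption, not a mandated standard):
    # latency-sensitive inference calls are marked EF, while bulk training
    # and data-lake transfers are marked AF11 so QoS can deprioritise them.
    DSCP_EF = 46
    DSCP_AF11 = 10

    def marked_socket(dscp: int) -> socket.socket:
        """Return a TCP socket whose outgoing packets carry the given DSCP value."""
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        # DSCP occupies the upper six bits of the IP TOS byte.
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp << 2)
        return sock

    # An inference client would mark its traffic before connecting to a
    # (hypothetical) model-serving endpoint:
    inference_sock = marked_socket(DSCP_EF)
    # inference_sock.connect(("model-serving.internal.example", 8443))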

Secure multi-cloud and edge connectivity

Modern AI ecosystems are multi-cloud by design – training in one cloud, inferencing at the edge, and storage on-prem. This diversification enhances performance and resilience but amplifies security complexity. To avoid the pitfalls of a fragmented security and networking environment, leaders need to adopt the following design principles:

  • Unified fabric: Deploy SD-WAN or SASE architectures that enforce consistent policies across AWS, Azure, GCP, and private clouds.
  • End-to-end segmentation: Extend identity-based controls across cloud-native service meshes and virtual networks.
  • Encrypted overlays: Use secure tunnels (IPsec or QUIC) for inter-cloud AI data exchange, and monitor tunnel throughput for anomalies (a monitoring sketch follows this list).
  • Edge integration: Connect inference nodes securely to cloud models via lightweight, encrypted links, ensuring zero data exposure at the edge.
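As a minimal sketch of the monitoring point above, the snippet below tracks per-interval throughput on an inter-cloud tunnel and flags readings that deviate sharply from recent history. The window size, threshold and synthetic samples are assumptions for illustration; in practice the readings would come from tunnel or interface counters.

    from collections import deque
    from statistics import mean, stdev

    class TunnelMonitor:
        """Flags throughput samples that deviate sharply from recent history."""

        def __init__(self, window: int = 60, z_threshold: float = 3.0):
            self.samples = deque(maxlen=window)   # recent throughput samples (bytes/s)
            self.z_threshold = z_threshold

        def observe(self, bytes_per_sec: float) -> bool:
            """Record a sample and return True if it looks anomalous."""
            anomalous = False
            if len(self.samples) >= 3:            # require a short history before judging
                mu, sigma = mean(self.samples), stdev(self.samples)
                if sigma > 0 and abs(bytes_per_sec - mu) / sigma > self.z_threshold:
                    anomalous = True
            self.samples.append(bytes_per_sec)
            return anomalous

    monitor = TunnelMonitor()
    for reading in (9.8e8, 1.0e9, 9.9e8, 2.1e7):  # synthetic samples, with a sudden drop at the end
        if monitor.observe(reading):
            print("throughput anomaly on inter-cloud tunnel")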

AI-ready networking must make data mobility secure by default while maintaining deterministic performance across heterogeneous environments.

Operational readiness: deep observability

An AI application's complexity, especially in a distributed environment, makes traditional monitoring tools inadequate. Observability for AI is more than just measuring bandwidth; it requires context. Leaders need to move beyond simple network flow monitoring to understand application-layer behaviours. AI observability will be built on three pillars:

  • Data path visibility: Monitor packet loss, latency and throughput in GPU and storage networks.
  • Inference traffic analytics: Identify patterns that deviate from expected AI service behaviour, such as abnormal model call frequencies or payload sizes (see the sketch after this list).
  • Model-aware telemetry: Integrate observability with MLOps pipelines to correlate network conditions with model accuracy or drift.
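As a minimal sketch of the inference traffic analytics pillar, the snippet below compares per-model call rates and payload sizes in a monitoring window against an expected baseline. The model names and baseline figures are illustrative assumptions, not data from any real deployment.

    from dataclasses import dataclass

    @dataclass
    class Baseline:
        max_calls_per_min: int       # expected ceiling for call frequency
        max_payload_bytes: int       # expected ceiling for request payload size

    # Hypothetical baselines, e.g. learned from historical traffic or set by the MLOps team.
    BASELINES = {
        "fraud-scoring-v3": Baseline(max_calls_per_min=2_000, max_payload_bytes=64_000),
        "support-chat-llm": Baseline(max_calls_per_min=500, max_payload_bytes=512_000),
    }

    def check_window(model: str, calls_per_min: int, largest_payload: int) -> list:
        """Return deviations from the model's expected service behaviour."""
        baseline = BASELINES.get(model)
        if baseline is None:
            return [f"unknown model '{model}' observed on the inference path"]
        findings = []
        if calls_per_min > baseline.max_calls_per_min:
            findings.append(f"{model}: abnormal call frequency ({calls_per_min}/min)")
        if largest_payload > baseline.max_payload_bytes:
            findings.append(f"{model}: oversized payload ({largest_payload} bytes)")
        return findings

    print(check_window("support-chat-llm", calls_per_min=4_800, largest_payload=40_000))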

Next-generation network detection and response (NDR) platforms and observability tools must integrate AI workload context, such as model identifiers, tenant labels and training jobs, if they are to deliver real insight rather than raw metrics.
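A minimal sketch of that enrichment step follows, assuming a hypothetical inventory that maps workload addresses to MLOps metadata; the field names and lookup source are illustrative, and a real pipeline would pull this context from a service registry or the MLOps platform itself.

    from dataclasses import dataclass, asdict
    from typing import Optional

    @dataclass
    class FlowRecord:
        src_ip: str
        dst_ip: str
        bytes_sent: int

    @dataclass
    class EnrichedFlow(FlowRecord):
        model_id: str
        tenant: str
        training_job: Optional[str]

    # Hypothetical inventory mapping workload IPs to AI context.
    WORKLOAD_CONTEXT = {
        "10.20.1.15": {"model_id": "support-chat-llm", "tenant": "cx-team", "training_job": None},
    }

    def enrich(flow: FlowRecord) -> EnrichedFlow:
        """Attach model, tenant and training-job context to a raw flow record."""
        ctx = WORKLOAD_CONTEXT.get(
            flow.src_ip,
            {"model_id": "unknown", "tenant": "unknown", "training_job": None},
        )
        return EnrichedFlow(**asdict(flow), **ctx)

    print(enrich(FlowRecord(src_ip="10.20.1.15", dst_ip="10.30.2.8", bytes_sent=1_048_576)))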

The new perimeter: segmentation and data protection

AI models and their training datasets are among the most valuable intellectual assets an organisation owns. Ensuring their confidentiality without compromising performance is paramount. AI introduces new attack surfaces, particularly in multi-tenant or shared GPU environments. Leaders should adopt the following strategies to reduce risk:

  • Segment AI workloads by function – training, inference and storage – and apply identity-aware access policies. Isolate development, testing and production down to the container or pod level to contain incidents. Use software-defined overlays to separate GPU tenants and manage east-west traffic (see the policy sketch after this list). Extend segmentation consistently across clouds with centrally enforced policies.
  • Encrypt all data – especially training datasets – both at rest and in transit, using strong TLS for data in motion and storage-level encryption at rest. Tag and control data lineage to ensure sensitive flows stay within authorised zones. Use hardware-accelerated or inline encryption to preserve ultra-low latency during training. Encryption must remain transparent to operations while maintaining strong confidentiality, performance and energy efficiency.
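As a minimal sketch of the function-based segmentation described above, the snippet below treats training, inference and storage as separate segments with a default-deny stance, permitting only explicitly listed flows. Segment names, ports and the rule format are assumptions for illustration rather than any specific vendor's policy model.

    # Segments by function; dev/test/prod isolation would follow the same pattern.
    SEGMENTS = {"training", "inference", "storage"}

    # (source segment, destination segment, port) tuples that are permitted;
    # all other east-west traffic between segments is denied.
    ALLOWED_FLOWS = {
        ("training", "storage", 443),    # training jobs read datasets over TLS
        ("inference", "storage", 443),   # inference nodes fetch model artifacts
    }

    def is_allowed(src: str, dst: str, port: int) -> bool:
        """Default-deny check for east-west traffic between AI segments."""
        if src not in SEGMENTS or dst not in SEGMENTS:
            return False                 # unknown segments are never allowed
        if src == dst:
            return True                  # intra-segment traffic is governed separately
        return (src, dst, port) in ALLOWED_FLOWS

    assert is_allowed("training", "storage", 443)
    assert not is_allowed("training", "inference", 22)   # lateral movement blocked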

Conclusion

AI is redefining not just how organisations process data but how their networks and security systems must evolve. From handling vast east-west traffic to enforcing zero-trust across multi-cloud environments, every layer of the enterprise stack is being rewritten for intelligence, scalability and trust. Networking and security leaders who modernise their architectures through segmentation, encryption, observability and automation will not only secure AI but also enable it. Resilience and intelligence begin at the network level in the era of AI.