05 June 2024
Paul Scott-Murphy, CTO, Cirata
Disaster recovery and business continuity must function now in a far more risk-laden and complex environment. While ransomware threats continue to plague organisations, those executives responsible for ensuring their company meets compliance and business continuity goals face several added challenges.
CIOs and their teams are refining their business continuity strategies to address the magnitude of unstructured data. They are seeking practices that can efficiently manage data analytics fed by large language models (LLMs) while providing the replication safety net that supports business continuity.
The GenAI juggernaut
Huawei estimates global data volume will reach 180 zettabytes by 2025, 80% of which will be unstructured. By 2030, some 25% of unstructured data will be used for production and decision-making, a share eventually expected to reach 80%. GenAI is driving this volume by aggregating text, voice, documents, videos, emails and messaging platforms.
While GenAI is beginning to exhibit value in various use cases, its rapid development, broad use of large data sets, potential for misapplication and lack of robust security controls make it a rich target for cyber criminals. As Gartner says, ‘enterprises must prepare for malicious actors’ use of generative AI systems for cyber and fraud attacks, such as those that use deep fakes for social engineering of personnel and ensure mitigating controls are put in place.’
Protecting large data sets
Executives are not yet confident that GenAI can help protect against data breaches or support business continuity, even while it contributes to the explosive growth of unstructured data and more possibilities for cyber attacks. As businesses continue to assemble large data sets using GenAI-powered processes, and transform this data into applications, there is an immediate need to ensure these data sets are protected and adhere to business continuity standards.
Contributing to the security challenges is the fact that different functional teams are using and generating unstructured data in their own fashion and with varying degrees of security in place. IDC warns that application sprawl and fragmentation of unstructured data, ‘often with diverse sets of identity and authentication models and different administrative features,’ contribute to more potential attack surfaces. Analysing the cost of a data breach, IDC estimates that greater fragmentation roughly doubles the annual cost of security breaches: $4.5 million versus $2.2 million.
To avoid fragmentation, businesses must set enterprise-wide standards on practices such as authentication and policy controls, and work with functional teams to accommodate specific needs such as recovery point objectives. Like any other data, unstructured data must adhere to security compliance standards. IT security teams must also conduct regular audits to ensure standards and policies are being followed.
Integrating GenAI into business continuity
Large language models offer the potential of significant business value but still must be subject to the same security and data protection practices as any other application or data asset. To ensure recovery and business continuity, there are several immediate considerations:
Visibility is a priority. The principle that you can’t manage assets you don’t know about holds true for unstructured data. Functional teams, or lines of business, must have visibility into the unstructured data in their environment. It is a fundamental practice for avoiding cyber attacks, data privacy breaches and budget impact. With visibility, teams can judge which GenAI data and/or models are critical and categorise them as such to support continuity.
The cloud is king. Large language models are built on data that typically lives in the cloud, which demands best practice in storing that data securely and executing recovery as needed. The expense and lack of hardware support for the datasets needed to train large language models make on-premises storage prohibitive. If a business has adopted a multi-cloud strategy, it needs a solution that can support large data set migration across multiple cloud providers.
Recovery is the point. GenAI has changed the amount of data flowing through an organisation and to the cloud. In refining a business continuity strategy to integrate GenAI, functional teams need to review their recovery time objectives (RTO) and recovery point objectives (RPO). These standards ensure that backup and recovery processes are in place to recover any critical GenAI data sets or applications.
Replication is imperative. To support near-zero RTOs and RPOs, replication technology can enable compliance and fast data recovery by providing real-time cloud replication of actively used GenAI data. This approach reduces costs and helps ensure continuous data accuracy should recovery be necessary.
No downtime is a must. Any data movement can hamper business continuity if it requires application downtime. Data migration solutions that can facilitate large-scale data changes and migration to the cloud will help minimise disruption.
Automation is the answer. In the event of a system failure, IT teams can use active-active replication over multi-cloud environments, as necessary, to ensure automatic failover and recovery, minimising data loss and downtime.
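As a minimal illustration of how the considerations above fit together, the sketch below shows how RPO standards can drive an automated failover decision: a standby is only promoted when the primary is down and replication lag is within the objective. The thresholds and function names are hypothetical, not the API of any specific replication product.

```python
from datetime import datetime, timedelta

# Hypothetical objectives; real values come from each team's continuity plan.
RPO = timedelta(minutes=5)   # maximum tolerable data loss
RTO = timedelta(minutes=15)  # maximum tolerable downtime

def replication_lag(last_replicated: datetime, now: datetime) -> timedelta:
    """Time elapsed since the last successful replication to the standby site."""
    return now - last_replicated

def should_fail_over(primary_healthy: bool, lag: timedelta) -> bool:
    """Promote the standby only if the primary is down AND the standby is
    within the RPO, so failing over would not lose more data than allowed."""
    return (not primary_healthy) and lag <= RPO

# Example: primary is down, standby replicated 3 minutes ago.
now = datetime(2024, 6, 5, 12, 0)
lag = replication_lag(datetime(2024, 6, 5, 11, 57), now)
print(should_fail_over(primary_healthy=False, lag=lag))  # True: lag is within the 5-minute RPO
```

With active-active replication the lag stays near zero, which is what makes automatic failover with minimal data loss feasible; if lag exceeds the RPO, the decision escalates to operators rather than proceeding automatically.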
Practice secure GenAI use
Now is the time to review business continuity practices.
Updating security practices will help avoid the costs associated with a data breach – both direct monetary costs and reputational costs. Stringent compliance requirements regarding data protection, privacy, and continuity of operations are an important cost and trust factor. It is too easy to introduce confidential data into the training of large language models, perhaps unwittingly.
Security and functional teams will need to work together to set limits on unstructured data that poses a privacy threat. IT teams must also avoid fragmentation of policy controls, update recovery practices, and use replication technology to take the lead in ensuring business continuity.