Three smart cost-cutting techniques for data storage

30 May 2023

Adrian Moir, senior product management consultant & technology strategist, Quest Software


Most enterprises are adopting a cloud or multi-cloud strategy, and Gartner predicts that by 2025, 85% of infrastructure strategies will integrate on-premises, colocation, cloud and edge delivery options. At the same time, inflation is squeezing IT budgets and driving cost increases in areas including servers, storage and professional services, so companies are eager to cut costs.

Organisations are under growing pressure to back up and recover data quickly, accurately and within budget, while data growth in the cloud makes backup and recovery ever more complex. Enterprise data is expanding rapidly, with IDC predicting that the sum of the world’s data will reach 175ZB by 2025. So how can organisations address the natural impact of data growth while meeting their backup and recovery demands?

1. Data deduplication

The process of data deduplication is based on finding and eliminating blocks of duplicated data, and it can achieve storage savings well into the 90%+ range. Eliminating duplicated data matters because blocks of storage, whether in the cloud or on-premises, cost money. Deduplication works by examining the data within files and data streams and storing only the unique blocks that differ from anything already written by previous backup operations.
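To make the mechanics concrete, the minimal Python sketch below shows one common form of block-level deduplication: the backup stream is split into fixed-size blocks, each block is fingerprinted with a hash, and only blocks whose fingerprints have not been seen before are kept. The 4KB block size, in-memory index and manifest are illustrative assumptions, not a description of any particular product.

    import hashlib

    CHUNK_SIZE = 4096      # assumed fixed block size, for illustration only
    block_index = {}       # fingerprint -> unique block already stored
    backup_manifest = []   # ordered fingerprints needed to rebuild the stream

    def ingest(stream):
        """Split a binary backup stream into blocks and store only unseen ones."""
        stored = duplicates = 0
        while True:
            block = stream.read(CHUNK_SIZE)
            if not block:
                break
            fingerprint = hashlib.sha256(block).hexdigest()
            if fingerprint not in block_index:
                block_index[fingerprint] = block   # unique block: keep the data
                stored += 1
            else:
                duplicates += 1                    # duplicate: keep only a pointer
            backup_manifest.append(fingerprint)    # reference to the stored block
        return stored, duplicates

Restore is simply the reverse: walk the manifest and concatenate the blocks it references. The more backups share identical blocks, the higher the duplicate count and the greater the storage saving.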

But blocks of storage are not the only resources that are sensitive to waste from redundant data. When blocks of redundant data are needlessly sent from devices to backup servers and storage, network paths become saturated at multiple points, with no corresponding improvement in data protection. Additionally, as companies count on having their applications and data available around the clock, any performance hit from backing up is unwelcome. That is why IT admins plan for a backup window in which the impact of the backup on system performance will be minimal — usually at night. Redundant data takes up precious time in that window.

Data deduplication can be processor-intensive, so IT teams must consider each step in the process and decide what fits where:

  • When - backup timing
  • How - source-side deduplication
  • Where - deduplication target deployment options


2. Data compression

Data compression is all about compacting the stored data blocks, importantly after any deduplication has occurred. This significantly reduces the amount of data moving between primary locations and cloud storage. Because compression is processor-intensive, it is usually kept away from production workloads and performed on the data during backup, once it has been directed at a deduplication entity. To remain efficient, the deduplication solution must apply compression in an orderly fashion so that it does not affect the initial backup and further squeeze the backup window.

The processor-intensive nature of compression should raise the question of whether you should compress at all. The short answer is yes, provided the benefit does not come at the cost of slowing down your backup and recovery.

Deduplication and compression have the same goal: to reduce the quantity of backup data stored.

Deduplication uses algorithms to identify data that matches already stored data. If the algorithms find duplicated data, they replace it with a pointer to the already stored data. Then, they send only the unique data to be stored. Data compression algorithms, on the other hand, do not take already stored data into account. They ingest a specified file or set of files and compact them by replacing any repeated sequences with tokens and removing unneeded fillers and spaces in the data.
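A minimal sketch can make the ordering explicit: blocks are deduplicated first, and compression is applied only to the unique blocks that are actually kept, so the CPU cost is paid once per unique block. Python's zlib stands in here for whichever compression scheme a given backup product actually uses, and the functions and storage layout are illustrative assumptions.

    import hashlib
    import zlib

    stored_blocks = {}   # fingerprint -> compressed unique block

    def store_block(block):
        """Deduplicate first, then compress only the blocks actually kept."""
        fingerprint = hashlib.sha256(block).hexdigest()
        if fingerprint in stored_blocks:
            return fingerprint                        # duplicate: nothing new to compress
        stored_blocks[fingerprint] = zlib.compress(block, 6)
        return fingerprint

    def read_block(fingerprint):
        """Decompress a stored block on restore."""
        return zlib.decompress(stored_blocks[fingerprint])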

3. Storage tiering

Data deduplication and data compression work well with both on-premises and cloud sources and targets. Cloud storage, however, offers another option for reducing costs: storage tiering. Cloud providers offer multiple tiers of storage that trade performance and speed of access for cost, with ‘colder’ tiers being less expensive per GB but much slower. The best way to optimise backup costs in the cloud is with an automated, policy-based approach that moves retained backups to colder tiers of storage over time. To maximise the benefits of tiered storage, data should move between tiers according to a policy you establish. For example, six-month-old backups do not need to stay in more costly cloud storage.
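As an illustration of what such a policy can look like in practice, the hedged sketch below uses AWS S3 lifecycle rules via the boto3 SDK to transition aging backups to colder storage classes automatically. The bucket name, prefix and age thresholds are assumptions for the example, and other cloud providers offer equivalent lifecycle mechanisms.

    import boto3

    s3 = boto3.client("s3")

    # Hypothetical bucket and prefix; the day thresholds are example policy values.
    s3.put_bucket_lifecycle_configuration(
        Bucket="example-backup-bucket",
        LifecycleConfiguration={
            "Rules": [
                {
                    "ID": "age-out-backups",
                    "Filter": {"Prefix": "backups/"},
                    "Status": "Enabled",
                    "Transitions": [
                        {"Days": 180, "StorageClass": "GLACIER"},       # roughly six months old
                        {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},  # roughly one year old
                    ],
                }
            ]
        },
    )

Once a rule like this is in place, the provider moves objects between tiers on its own schedule, so no manual migration jobs are needed.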

Using all three data optimisation strategies to achieve the best results

If you are looking to dramatically reduce storage costs and accelerate backup and recovery, implementing all three data optimisation methods is the best approach. The trick is to find a strategy that works seamlessly with your existing backup solution to handle compression, deduplication and storage management. This ensures backup and recovery efforts are as efficient and accurate as possible. Investing in a third-party strategy spares your IT team from adding overhead to the backup solution, and it will quickly pay off as you watch your storage costs decline.