19 January 2022
Even the most stable and reliable storage systems are generally expected to require some degree of maintenance. Many believe that firmware updates, software updates and the replacement of failed storage components can never be avoided. Really? It does not have to be so: here is an example of a storage system that has been running for four years without any need for unplanned maintenance or unexpected downtime.
Back in 2017, Toshiba installed a ZFS storage system to support the storage infrastructure needs of the “Technology Experience Lab” at NTT Global Data Centers. Since then, the storage system has proven its outstanding reliability: except for a scheduled 10 minutes of downtime to install some additional ZFS features, the system has run steadily without a failure in any of its 60 HDDs, SSDs, controllers, power supplies, fans, cables or other components.
TECH LAB EXPLORING NEW ARCHITECTURES
The Technology Experience Lab provides data center managers and their teams with the room and infrastructure to explore innovative approaches for architecting their servers and storage. It enables businesses to evaluate the efficacy of private or hybrid clouds, distributed architectures, and alternative approaches to deliver IT services in a low-risk environment. In addition, its community of users provides support and ideas through meetups, boot camps, webinars, and hackathons.
In total, 102 TB of net storage was desired, requiring 240 TB of raw storage and utilising the ZFS-based JovianDSS software of Toshiba’s partner Open-E. This software had proven to offer high availability with no single point of failure, along with high flexibility, providing consistent snapshots and instant restore when required. The hardware to support it would need to be reliable and high-performance to take full advantage of the software, supporting several iSCSI block storage targets ranging in size from 10 TB to 40 TB, plus some shared file folders.
EXTENSIVE PLANNING WAS KEY FACTOR
The planning stage was central to the long-term success of the final storage implementation. Toshiba often undertakes research into server implementation in its laboratories. This has resulted in close relationships with a broad range of component and software suppliers, coupled with a deep understanding of what works well. Leveraging this knowledge, the team was able to recommend a hardware architecture that worked with Open-E JovianDSS and had proven itself in other projects.
RELIABILITY & OUTSTANDING PERFORMANCE
To form the 102 TB of net storage, the team selected Toshiba’s 4 TB, 3.5" SAS Enterprise Capacity Drives (MG04SCA40EA). With an MTTF (mean time to failure) of 1,400,000 hours and a non-recoverable error rate of just ten errors per 10^16 bits read, they were ideal for achieving the reliability required. Performance was not ignored either. The 7,200 rpm drives achieved a Zpool read performance of 12.9 times that of a single disk, and 8.5 times that of a single disk for writes. For the ZFS write logs and read caches, reliable 10 DWPD SAS Enterprise SSDs with 1.6 TB of storage from KIOXIA (formerly Toshiba Memory) were selected.
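To put the quoted drive figures into perspective, the following Python sketch works through the arithmetic behind them. It is an illustration only and not part of the original project documentation. It assumes a constant failure rate (an exponential lifetime model), which is a common simplification and not something the drive specification states, and it treats the MTTF as applying equally to every drive. All function and constant names are hypothetical.

```python
import math

HOURS_PER_YEAR = 8760  # 365 days, leap days ignored

# Figures quoted in the article
MTTF_HOURS = 1_400_000       # per-drive mean time to failure
DRIVE_COUNT = 60             # HDDs in the JBOD
DRIVE_TB = 4                 # capacity per drive
NET_TB = 102                 # usable capacity requested
YEARS_IN_SERVICE = 4
ERRORS_PER_BIT = 10 / 1e16   # ten non-recoverable errors per 10^16 bits read


def annualized_failure_rate(mttf_hours: float) -> float:
    """Chance that one drive fails within a year (assumed constant failure rate)."""
    return 1.0 - math.exp(-HOURS_PER_YEAR / mttf_hours)


def expected_failures(drives: int, years: float, mttf_hours: float) -> float:
    """Expected number of drive failures across the whole population."""
    return drives * years * HOURS_PER_YEAR / mttf_hours


def probability_of_no_failures(drives: int, years: float, mttf_hours: float) -> float:
    """Chance that no drive fails at all, under the same assumed model."""
    return math.exp(-expected_failures(drives, years, mttf_hours))


def expected_read_errors(terabytes_read: float) -> float:
    """Expected non-recoverable read errors for a given volume of data read."""
    bits_read = terabytes_read * 1e12 * 8  # decimal terabytes to bits
    return bits_read * ERRORS_PER_BIT


raw_tb = DRIVE_COUNT * DRIVE_TB
print(f"Raw capacity: {raw_tb} TB, net/raw ratio: {NET_TB / raw_tb:.1%}")
print(f"Annualized failure rate per drive: {annualized_failure_rate(MTTF_HOURS):.2%}")
print(f"Expected failures in {YEARS_IN_SERVICE} years: "
      f"{expected_failures(DRIVE_COUNT, YEARS_IN_SERVICE, MTTF_HOURS):.2f}")
print(f"Probability of zero failures: "
      f"{probability_of_no_failures(DRIVE_COUNT, YEARS_IN_SERVICE, MTTF_HOURS):.1%}")
print(f"Expected read errors per full 4 TB drive read: {expected_read_errors(DRIVE_TB):.3f}")
```

The 60 drives of 4 TB each account for the 240 TB of raw storage, of which 102 TB (42.5%) is usable. Under the assumed model, the specified MTTF corresponds to an annualized failure rate of roughly 0.6% per drive. Real-world failure behaviour also depends on factors the model ignores, such as operating temperature and workload.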
A significant factor in HDD failures is heat, so the HDD enclosure chassis had to be closely reviewed before selection. The team selected AIC’s J4060-01 Dual Expander, 12 Gb/s, 60-bay toploader JBOD. The 1400 W 1+1 hot-swap redundant 80+ Platinum power supply ensured electrical efficiency while also fulfilling the reliability requirements. The JBOD also features four 80 x 38 mm hot-swap fans. System testing showed that the temperature difference between the coolest and the warmest drive was just 4°C, confirming that the correct JBOD had been selected. Connectivity to the JBOD was provided by the Microchip Adaptec® RAID Adapter ASR-8885 with 8 internal and 8 external ports, run in HBA mode. This model was highlighted as a top choice back in 2017.