IT Hero: Running Azure Stack HCI on a DataON Integrated System with All-NVMe Flash

Charbel Nemnom is a Microsoft MVP & MCT, Swiss Certified ICT Security Expert, and Certified Cloud Security Professional (CCSP). He works as a Cloud and Security Architect for itnetX, a cloud service provider and consulting company in Switzerland.

In Windows Server 2016, Microsoft introduced a new type of virtualized storage called Storage Spaces Direct, which is now part of Azure Stack HCI. It enables you to build highly available storage systems from locally attached disks, without any external SAS fabric such as shared JBODs or enclosures. This was the first true software-defined storage (SDS) from Microsoft, meaning data is stored without dedicated storage hardware.

In Windows Server 2019, Microsoft added many improvements to Azure Stack HCI (formerly known as Windows Server Software-Defined, a.k.a. WSSD).

Fast forward to 2020, Microsoft introduced a new operating system dedicated to the hyper-converged deployment model where the innovation continues at a faster cadence compared to Windows Server. Azure Stack HCI was re-introduced as a new hyper-converged infrastructure (HCI) operating system delivered as an Azure service that provides the latest security, performance, and feature updates.

DataON AZS-216 Integrated System for Azure Stack HCI

Introduction

With Azure Stack HCI, you can deploy and run Windows Server and Linux virtual machines (VMs) in your datacenter or at the edge using your existing tools, processes, and skillsets. Additionally, you can extend the datacenter to the cloud with Azure Backup, Azure Site Recovery, Azure File Sync, Azure Monitor, Azure Arc, and Azure Security Center.

On September 1st, 2021, Microsoft announced the GA release of Windows Server 2022 with no major improvement for Storage Spaces Direct. As noted earlier, all future innovations will go into Azure Stack HCI to run hyper-converged infrastructure; however, Windows Server will continue to benefit from improvements to existing features. Windows Server 2022 still lacks advanced features such as stretched clusters but has been given a new repair option for Storage Spaces Direct, called ‘adjustable storage repair speed’. System admins can use it to control how resources are split between repairing data copies and running active workloads.

I recently did a 3-node Azure Stack HCI hyper-converged deployment on top of a DataON AZS-216 Integrated System with all-NVMe flash and hit over 2.5 million IOPS.

A DataON AZS-216 Integrated System hit over 2.5 million IOPS

In this article, I would like to share with you my experience and performance results.

3-Node DataON Integrated System

For this deployment, I used the following hardware configuration:

  • DataON AZS-216 Integrated System with Azure Stack HCI OS
  • Supports dual Intel® Xeon® Scalable Gen 2 Processor series & 24x DDR4 DIMMs
  • Drive bay: 16x NVMe U.2 2.5″ hot-swappable
  • PCIe slot: 7x PCIe 3.0 x8
  • Onboard NIC: 2x built-in 10GbE RJ45
  • 1300W (1+1) 110V hot-swappable redundant PSU with NEMA 5-15 power cords
  • Intel® Remote Management Module 4
  • Intel® Xeon® Scalable Gen 2 Gold 5218R 2.1GHz, 20-core, 27.5MB cache
  • 384GB (12x 32GB) Samsung® DDR4 2933MHz ECC-Register RDIMM
  • 2x Intel® S4510 480GB SATA M.2 boot drive for OS
  • 10x Intel® DC P5510 NVMe 3.8TB 2.5″ 144L 3D TLC SSDs
  • 2x NVIDIA/Mellanox® ConnectX-4 Lx EN dual port SFP+ 10/25GbE RDMA cards
  • 2x NVIDIA/Mellanox® LinkX™ passive copper cables, ETH, up to 25Gb/s, SFP28, 30 AWG
  • 2x NVIDIA/Mellanox® Spectrum™ switches, each with 18x 10/25GbE ports + 4x 100GbE ports (RDMA/RoCEv2)

DataON AZS-216 Integrated Systems for Azure Stack HCI are pre-configured nodes with certified components, tested and validated by DataON and Microsoft to help build Azure Stack HCI clusters with ease.

In this configuration, all NVMe disks are used as capacity (all-flash) as shown in the inventory below.

DataON AZS-216 Integrated System for Azure Stack HCI | Drives Inventory

Resiliency

The cluster shared volumes are configured with three-way mirror, which provides the maximum resiliency within a single site. With three-way mirror, you can sustain two failures at the same time while your workloads remain online.

You could test the following four different scenarios:

  1. Physical drive pull
  2. Reboot a node (observe failover)
  3. Physical power pull of a node
  4. Shut down one node and pull a single drive from one of the remaining nodes that are still up.
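
The four scenarios can be reasoned about with a toy model. The sketch below assumes that every slab of a three-way mirror volume keeps one copy on each of the three nodes, and that data stays readable while at least one copy sits on a healthy drive in a running node. The node names and the placement model are illustrative assumptions. The model deliberately ignores cluster and pool quorum, repair jobs, and the way Storage Spaces Direct actually places data.

```python
from itertools import product

NODES = ("node1", "node2", "node3")  # hypothetical node names
DRIVES_PER_NODE = 10                 # capacity drives per node in this build


def slab_available(copies, down_nodes, failed_drives):
    """True while at least one copy is on a healthy drive in a running node."""
    return any(
        node not in down_nodes and (node, drive) not in failed_drives
        for node, drive in copies
    )


def volume_online(down_nodes=(), failed_drives=()):
    """Check every possible placement of one copy per node."""
    per_node = [[(n, d) for d in range(DRIVES_PER_NODE)] for n in NODES]
    return all(
        slab_available(copies, set(down_nodes), set(failed_drives))
        for copies in product(*per_node)
    )


scenarios = {
    "1. drive pull": dict(failed_drives=[("node1", 0)]),
    "2/3. node reboot or power pull": dict(down_nodes=["node1"]),
    "4. node down + drive pull elsewhere": dict(
        down_nodes=["node1"], failed_drives=[("node2", 0)]
    ),
    "three failures (beyond tolerance)": dict(
        down_nodes=["node1", "node2"], failed_drives=[("node3", 0)]
    ),
}

for name, failure in scenarios.items():
    print(f"{name}: online={volume_online(**failure)}")
```

In this simplified model, the four listed scenarios keep every slab readable, while a third simultaneous failure does not. A real cluster also needs quorum to stay online, so treat this only as an aid for planning the tests.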

Software Configuration

  • Host: Azure Stack HCI OS, version 20H2 (OS build 17784.1884)
  • Single storage pool (117 TB)
  • 3x 10.3TB (three-way mirror)
  • ReFS/CSVFS file system
  • 60 Virtual machines (20 VMs per node)
  • 2 Virtual processors and 8GB RAM per VM
  • VM: Windows Server 2019 Datacenter Core Edition with August 2021 update
  • Jumbo frame enabled
  • CSV cache is disabled for benchmarking purposes only. For real-world workloads, CSV cache is enabled with 16GB
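
As a rough check on the capacity figures in this list, the short sketch below works out how much of the pool the three mirrored volumes consume. It assumes the pool and volume sizes are reported in the same units and ignores metadata and any other overhead, so the numbers are approximate.

```python
POOL_TB = 117.0      # storage pool size reported above
VOLUME_TB = 10.3     # size of each volume
VOLUME_COUNT = 3
MIRROR_COPIES = 3    # three-way mirror keeps three copies of all data


def mirror_footprint_tb(volume_tb, volume_count, copies):
    """Raw pool capacity consumed by mirrored volumes."""
    return volume_tb * volume_count * copies


usable_tb = VOLUME_TB * VOLUME_COUNT
footprint_tb = mirror_footprint_tb(VOLUME_TB, VOLUME_COUNT, MIRROR_COPIES)

print(f"Usable capacity:    {usable_tb:.1f} TB")
print(f"Pool footprint:     {footprint_tb:.1f} TB")
print(f"Left unallocated:   {POOL_TB - footprint_tb:.1f} TB")
print(f"Storage efficiency: {usable_tb / footprint_tb:.1%}")
```

Under these assumptions, the three volumes provide about 30.9 TB of usable space, consume about 92.7 TB of the pool, and leave roughly 24.3 TB unallocated, which is the headroom available for repairs after a drive failure.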

Workload Configuration

DISKSPD version 2.0.21a workload generator

VM Fleet workload orchestrator

Test 1 – Random 4K, 8 Threads, 8 Outstanding I/O, 100% Read

Total 2.5M IOPS – Read/Write Latency @ 0.1/0.6 (ms)

Each VM is configured with:

  • 4K I/O size
  • 10GB working set
  • 100% read and 0% write
  • No storage QoS
  • RDMA enabled RoCE v2

4KB Block size, 8 Threads, 8 Outstanding I/O (100% Read)

Please note that the 100% read result is somewhat skewed because all reads are served locally. Using the same number of threads for any workload that involves writes would drastically increase latency and reduce IOPS, as shown in the subsequent tests.
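
For readers who want to try a similar profile by hand, the sketch below builds a DISKSPD argument list for the Test 1 settings. The switches shown (`-b`, `-r`, `-t`, `-o`, `-w`, `-d`, `-Sh`, `-L`) are standard DISKSPD options, but the duration, the caching and latency switches, and the target file path are assumptions for illustration. The tests in this article were driven through VM Fleet, which uses its own scripts, so this is not the exact command that was run.

```python
def diskspd_args(block, threads, outstanding, write_pct, random_io,
                 duration_s=60, target="C:\\run\\testfile.dat"):
    """Build a DISKSPD argument list for one workload profile.

    duration_s and target are hypothetical defaults, not values from the tests.
    """
    args = [f"-b{block}"]           # I/O block size, e.g. 4K or 512K
    if random_io:
        args.append("-r")           # random I/O; sequential is the default
    args += [
        f"-t{threads}",             # threads per target
        f"-o{outstanding}",         # outstanding I/Os per thread
        f"-w{write_pct}",           # percentage of writes (0 = 100% read)
        f"-d{duration_s}",          # test duration in seconds
        "-Sh",                      # disable software and hardware caching (assumed)
        "-L",                       # collect latency statistics (assumed)
        target,
    ]
    return args


# Test 1: random 4K, 8 threads, 8 outstanding I/O, 100% read
test1 = diskspd_args("4K", threads=8, outstanding=8, write_pct=0, random_io=True)
print("diskspd.exe " + " ".join(test1))
```

The later tests map onto the same helper by changing the arguments, for example `write_pct=30` with `threads=4` for the 70/30 mix, or `block="512K"` with `random_io=False` for the sequential runs.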

Test 2 – Random 4K, 4 Threads, 8 Outstanding I/O, 100% Write

Total 460K IOPS – Read/Write Latency @ 0.02/2.5 (ms)

Each VM is configured with:

  • 4K I/O size
  • 10GB working set
  • 0% read and 100% write
  • No storage QoS
  • RDMA enabled RoCE v2

4KB Block size, 4 Threads, 8 Outstanding I/O (100% Write)

Test 3 – Random 4K, 4 Threads, 8 Outstanding I/O, 70% Read / 30% Write

Total 1M IOPS – Read/Write Latency @ 0.01/0.4 (ms)

Each VM is configured with:

  • 4K I/O size
  • 10GB working set
  • 70% read and 30% write
  • No storage QoS
  • RDMA enabled RoCE v2

4KB Block size, 4 Threads, 8 Outstanding I/O (70% Read / 30% Write)

Test 4 – Random 4K, 4 Threads, 8 Outstanding I/O, 50% Read / 50% Write

Total 785K IOPS – Read/Write Latency @ 0.1/0.7 (ms)

Each VM is configured with:

  • 4K I/O size
  • 10GB working set
  • 50% read and 50% write
  • No storage QoS
  • RDMA enabled RoCE v2

4KB Block size, 4 Threads, 8 Outstanding I/O (50% Read / 50% Write)

Test 5 – Sequential 512K, 1 Thread, 1 Outstanding I/O, 100% Read

Total 72K IOPS – Read/Write Latency @ 0.7/0.3 (ms)

Each VM is configured with:

  • 512K I/O size
  • 10GB working set
  • 100% read and 0% write
  • No storage QoS
  • RDMA enabled RoCE v2

512KB Block size, 1 Thread, 1 Outstanding I/O (100% Read)

Test 6 – Sequential 512K, 1 Thread, 1 Outstanding I/O, 100% Write

Total 17K IOPS – Read/Write Latency @ 0.00/3.3 (ms)

Each VM is configured with:

  • 512K I/O size
  • 10GB working set
  • 0% read and 100% write
  • No storage QoS
  • RDMA enabled RoCE v2
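
To put the six results on a common footing, the sketch below converts the reported IOPS totals into approximate throughput, and into per-node and per-VM averages. It assumes that 4K and 512K mean 4 KiB and 512 KiB, and that load is spread evenly across the 3 nodes and 60 VMs. The IOPS totals are the rounded figures quoted above, so the output is an estimate rather than a measured value.

```python
VM_COUNT = 60
NODE_COUNT = 3

# (test name, block size in KiB, total IOPS as reported above)
RESULTS = [
    ("Test 1 (4K random, 100% read)", 4, 2_500_000),
    ("Test 2 (4K random, 100% write)", 4, 460_000),
    ("Test 3 (4K random, 70/30)", 4, 1_000_000),
    ("Test 4 (4K random, 50/50)", 4, 785_000),
    ("Test 5 (512K sequential read)", 512, 72_000),
    ("Test 6 (512K sequential write)", 512, 17_000),
]


def throughput_gb_s(iops, block_kib):
    """Throughput in GB/s (10^9 bytes per second) for a given IOPS and block size."""
    return iops * block_kib * 1024 / 1e9


for name, block_kib, iops in RESULTS:
    print(
        f"{name}: {throughput_gb_s(iops, block_kib):6.2f} GB/s, "
        f"{iops / NODE_COUNT:>9,.0f} IOPS/node, "
        f"{iops / VM_COUNT:>7,.0f} IOPS/VM"
    )
```

Under these assumptions, the 2.5M IOPS of Test 1 correspond to roughly 10.2 GB/s, while the 72K IOPS of Test 5 correspond to roughly 37.7 GB/s. This shows why large-block sequential tests are better read as throughput than as IOPS.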

DataON MUST and Windows Admin Center Integration

DataON MUST is a hybrid-cloud infrastructure monitoring and management tool. It is designed to integrate seamlessly with Windows Admin Center, providing a single pane of glass that consolidates all aspects of local server, remote server, cluster, and Azure Stack HCI monitoring and management.

DataON MUST integration with Windows Admin Center

The second integration is with DataON MUST Pro, which integrates with Windows Admin Center’s cluster creation and cluster-aware updating (CAU) functionality to simplify deployment and updates to Microsoft Azure Stack HCI with minimal disruption to your infrastructure.

DataON MUST Pro automatically compares your DataON Integrated System for Azure Stack HCI against DataON’s latest quarterly validated server component image baseline. It also ensures that servers have the same OS version, drivers, firmware, BIOS, and BMC, and checks the drivers and firmware for network cards, host bus adapters, and SSD and HDD drives.

Summary

In this article, I shared my experience and showed you the performance results with three-way mirror resiliency on a 3-node DataON AZS-216 Integrated System. For more information about Azure Stack HCI, please check the Microsoft documentation.

Always remember that storage is cheap, but downtime is expensive!!!

This article was originally posted on CharbelNemnom.com and is reposted with permission from Charbel Nemnom.