
Sep 29

Best Practices for VM Fleet Performance Testing

This document will cover what VM Fleet is, how to set it up, and how to use VM Fleet for performance benchmarking.

VM Fleet is used for performance analysis work: it lets analysts make real-time changes to simulate real-world situations. It is a set of scripts, found on GitHub, that uses DiskSpd to test and validate performance on HCI clusters. DiskSpd is a free and open-source tool used for storage benchmarking in Microsoft Windows Server environments. VM Fleet was introduced with Windows Server 2016 TP2 in August 2015.

The prerequisites to use VM Fleet are:

  1. Make sure you meet the bare minimum requirements for an HCI cluster, such as having the cluster, storage volumes, and domain set up for each node.
  2. Create a new cluster shared volume for the VM Fleet files, results, and VHDX.

    [code language="powershell"]New-Volume -StoragePoolFriendlyName S2D* -FriendlyName collect -FileSystem CSVFS_ReFS -Size 50GB -ResiliencySettingName "Mirror" -PhysicalDiskRedundancy 1[/code]

    Remember to adjust the physical disk redundancy to 1 or 2 depending on your resiliency setting (i.e., two-way mirror or three-way mirror).

  3. Download the necessary scripts for VM Fleet from GitHub:
    https://github.com/microsoft/diskspd/tree/master/Frameworks/VMFleet

The fastest way to get all the scripts is to clone or download as a .zip file and extract once downloaded.
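If Git is available on the node, cloning is equally quick. A minimal sketch, assuming you want the scripts in the 'C:\VMfleetws19\control' folder used later in this post (the intermediate clone path is illustrative):

[code language="powershell"]# Clone the diskspd repository, which contains VM Fleet under Frameworks\VMFleet
git clone https://github.com/microsoft/diskspd.git C:\diskspd

# Copy the VM Fleet scripts to the working folder used in the rest of this post
Copy-Item -Recurse C:\diskspd\Frameworks\VMFleet C:\VMfleetws19\control[/code]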

  4. Create the VHDX by creating a new VM and installing an operating system on it. You can use Windows Server 2016 or 2019, but it is important that you install the Core version. Set your Administrator password and shut down the VM. There is no need to sysprep, since you only need the VHDX file and the VM can be deleted soon after.

Once all the necessary files are obtained, you are ready to install VM Fleet.

Since the scripts are not digitally signed, you need to allow PowerShell to run them.

[code language="powershell"]Set-ExecutionPolicy -ExecutionPolicy Bypass -Scope Process[/code]

Install VM Fleet by going to the folder and running install-vmfleet.ps1. The example below uses the 'C:\VMfleetws19\control' folder, where we store all the VM Fleet scripts downloaded from GitHub.

[code language="powershell"]cd 'C:\VMfleetws19\control'
.\install-vmfleet.ps1 -source .\[/code]

Copy the VHDX to C:\ClusterStorage\Collect and delete the VM once complete.

[code language="powershell"]Copy-Item 'C:\VMfleetws19\Gold.vhdx' C:\ClusterStorage\collect[/code]

Create a tools folder and place the diskspd.exe executable in it:

C:\ClusterStorage\collect\control\tools
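A minimal sketch of this step, assuming diskspd.exe was extracted to C:\VMfleetws19 (the source path is an assumption; adjust it to wherever you placed the DiskSpd download):

[code language="powershell"]# Create the tools folder on the collect volume (no-op if it already exists)
New-Item -ItemType Directory -Path C:\ClusterStorage\collect\control\tools -Force

# Copy the DiskSpd executable into it
Copy-Item 'C:\VMfleetws19\diskspd.exe' C:\ClusterStorage\collect\control\tools[/code]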

For micro-benchmarking purposes, we set the CSV cache to 0. Since we are not using a real workload, the in-memory cache is not effective and only adds overhead.

[code language="powershell"](Get-Cluster).BlockCacheSize = 0[/code]

For the change to take effect immediately, pause and resume your CSV volumes or move them between servers. Use the PowerShell command below:

[code language="powershell"]Get-ClusterSharedVolume | ForEach-Object {
    $Owner = $_.OwnerNode
    $_ | Move-ClusterSharedVolume
    $_ | Move-ClusterSharedVolume -Node $Owner
}[/code]

Create the VMs based on the number of cores on the system, using the variable below. You can skip this step if you want to set your own number of VMs, but the number of VMs should not exceed the number of logical processors. Below, we base the VM count on how many physical cores we have (logical processors divided by two, assuming hyper-threading):

[code language="powershell"]$t = ((gwmi win32_computersystem).NumberOfLogicalProcessors / 2)[/code]

Now you are ready to create your VMs for testing. Make sure that -adminpass is the local administrator password of the VM image. The -connectuser and -connectpass parameters are the credentials used to create a loopback connection to the host. The process may take some time depending on the number of VMs and the relative performance of the system.

[code language="powershell"].\create-vmfleet.ps1 -basevhd C:\ClusterStorage\collect\Gold.vhdx -vms $t -adminpass 'DataOn20' -connectuser 'Anahiem1247' -connectpass 'Tomburger24!'[/code]

The number of VMs created with the command above is equal to the number of physical cores. You can change the number of VMs created by replacing $t with any number that does not exceed the number of logical processors you have.
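For example, to create a fixed count of 20 VMs instead of one per physical core, pass an explicit value (the count of 20 is illustrative; the other parameters match the earlier example):

[code language="powershell"].\create-vmfleet.ps1 -basevhd C:\ClusterStorage\collect\Gold.vhdx -vms 20 -adminpass 'DataOn20' -connectuser 'Anahiem1247' -connectpass 'Tomburger24!'[/code]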

Once you finish creating the VMs, you are ready to set the hardware specifications for the VMs. In the example below we are setting each VM to have one virtual processor with 2GB of memory.

[code language=”powershell”].\set-vmfleet.ps1 -ProcessorCount 1 -MemoryStartupBytes 2gb -MemoryMaximumBytes 2gb -MemoryMinimumBytes 2gb[/code]

Remember, you cannot set more hardware specifications for each of your VMs than what is in your cluster. For example, if your cluster has 32 virtual processors and you have 32 VMs, you can give each VM one processor. If you set each VM to 4 processors, then you would have insufficient resources and some VMs would not be able to start. This applies to memory as well.
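As a quick sanity check (a sketch, not part of the VM Fleet scripts; the counts below are illustrative), you can compare the total vCPUs you are about to assign against the logical processors on the host:

[code language="powershell"]# Logical processors on this host
$lp = (Get-WmiObject win32_computersystem).NumberOfLogicalProcessors

# Planned fleet size and vCPUs per VM (illustrative values)
$vms = 32
$vcpusPerVM = 4

if ($vms * $vcpusPerVM -gt $lp) {
    Write-Host "Oversubscribed: $($vms * $vcpusPerVM) vCPUs requested vs $lp logical processors"
}[/code]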

Starting VM Fleet

You are all set to have some fun with VM Fleet. First, run the performance counter to view the I/O output of all the VMs. The performance counter can show you a lot of information, such as the I/O of each node, read or write latency, etc.

[code language="powershell"]start-process powershell.exe -argument 'C:\ClusterStorage\collect\control\watch-cluster.ps1 -sets *'[/code]

If you have issues with the font size in the performance counter, you can run the 'watch-cluster.ps1' script from an administrator PowerShell window. This should reduce the font size so you can easily view all your performance counters.

Start VM Fleet by starting all the VMs in the cluster. You can either select all the VMs in Failover Cluster Manager and click the 'Start' button, or run a simple PowerShell command:

[code language=”powershell”].\start-vmfleet.ps1[/code]

Now you are almost ready to run some tests but first let’s review the parameters:

  • B: Block size (KiB)
  • T: Thread count
  • O: Outstanding I/O count
  • W: Write ratio (percent)
  • P: Patterns (r = random, s = sequential, si = sequential interlocked)
  • Warm: Duration of pre-measurement warmup (seconds)
  • D: Duration of measured intervals (seconds)
  • Cool: Duration of post-measurement cooldown (seconds)

Now that we understand the parameters, we can set up for testing. To start, let’s run a simple sweep test for 100 percent read performance:

[code language="powershell"].\start-sweep.ps1 -b 4 -t 8 -o 8 -w 0 -d 3600 -p r[/code]

With this sweep test, based on 20 virtual machines spread across 3 nodes, we get an I/O performance number in the millions. Each node produces about 600,000 I/O with a read latency under 1 millisecond.

When running VM Fleet, keep in mind the performance numbers will not always be constant. They will fluctuate, but you will see the range of I/O from its lowest to its highest point. Each of your cluster nodes should be within the same range as the others, and if your numbers are extremely off, you will know that you have some issue(s) to resolve. For example, if rose-n2 were to produce I/O around 400,000 while the rest are around 600,000, you could have a bottleneck in your network hardware or even metadata on your drives. VM Fleet can help you determine if there is an issue to resolve before you put your cluster into production.

As you play around with the parameters, you can set them up to mimic real workloads. For example, you can run multiple sweeps in one command by giving the write parameter a comma-separated list. The example below performs a 100 percent read test for a duration of 6 minutes; after 6 minutes, it changes to 100 percent write, and then it changes again (to 30 and then 50 percent write) at every 6-minute interval.

[code language="powershell"].\start-sweep.ps1 -b 4 -t 1 -o 8 -w 0,100,30,50 -d 360 -p r[/code]

Once you’re done with VM Fleet, you can stop the fleet (shutting down all the VMs) with:

[code language="powershell"].\stop-vmfleet.ps1[/code]

If you’re not done with VM Fleet and wish to start another sweep, you can pause the fleet instead:

[code language="powershell"].\set-pause.ps1[/code]

VM Fleet is a powerful tool for determining what kind of performance you should expect out of your cluster. Remember that the performance numbers reflect raw performance, not what to expect from every application you run. As always, it is best to stress-test your cluster before putting it into production.