I'm trying to run an Ethereum full node (geth v1.10) in archive mode for big data analysis. The storage requirements are expected to be 8 TB once fully synced.

I have a Synology NAS serving an 8 TB LUN over iSCSI (it maxes out at around 120 MB/s throughput). My Hyper-V host (Windows Server 2019, version 1809) has a 1 TB locally attached NVMe drive. I combined the 1 TB NVMe drive with the 8 TB iSCSI LUN using Storage Spaces (NOT Storage Spaces Direct) with tiering, and without mirroring or any other data redundancy. With the two disks (8 TB iSCSI HDD + 1 TB NVMe) in a tiered storage pool, I created an 8 TB virtual disk and attached it to the Linux VM.

My goal was for the NVMe tier to accelerate I/O for recent, hot data, while the 8 TB HDD tier provides capacity for colder data.

Looking at Windows Performance Monitor, I noticed that my SSDTier doesn't seem to be read from or written to. Also, the Storage Spaces write cache is 1 GB with >90% read/write bypass. I'm not sure what exactly that cache is (the SSD tier, or some other internal cache?).

A test with dd inside the VM shows around 380 MB/s, but I assume a cache is somehow being hit.

    user1@ether:/eth1/bin$ dd if=/dev/zero of=/eth1/temp oflag=direct bs=128k count=32k
    32768+0 records in
    32768+0 records out
    4294967296 bytes (4.3 GB, 4.0 GiB) copied, 11.3336 s, 379 MB/s

The same test, but reading from /dev/urandom:

    user1@ether:/eth1$ dd if=/dev/urandom of=/eth1/temp oflag=direct bs=128k count=32k
    32768+0 records in
    32768+0 records out
    4294967296 bytes (4.3 GB, 4.0 GiB) copied, 64.8097 s, 66.3 MB/s

Is the cache not being used here?
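
A caveat on the /dev/urandom run: dd reads each block from the kernel RNG before writing it, so the 66.3 MB/s figure may reflect how fast random data can be generated rather than how fast the disk accepts it. A minimal sketch for separating the two, assuming GNU dd and a RAM-backed /dev/shm; the function names and the staging path are hypothetical and were not part of the tests above:

```shell
#!/bin/sh
# Sketch only: function names, paths and sizes are illustrative assumptions.

# Rate at which /dev/urandom alone produces data (no disk involved).
# $1 = number of 128k blocks
urandom_rate() {
    dd if=/dev/urandom of=/dev/null bs=128k count="$1" 2>&1 | tail -n 1
}

# Stage random data in a staging file first, then write it to the target,
# so the second dd measures the target rather than the RNG.
# $1 = staging file (ideally RAM-backed), $2 = target file,
# $3 = number of 128k blocks, $4 = dd oflag value (e.g. direct)
staged_write() {
    dd if=/dev/urandom of="$1" bs=128k count="$3" 2>/dev/null || return 1
    dd if="$1" of="$2" bs=128k count="$3" oflag="$4" 2>&1 | tail -n 1
    rm -f "$1"
}

# Example invocation (hypothetical staging path, 8192 x 128k = 1 GiB):
#   urandom_rate 8192
#   staged_write /dev/shm/random.bin /eth1/temp 8192 direct
```

If `urandom_rate` reports a figure close to 66 MB/s, the second dd test above was probably limited by the RNG and says little about the tiers or the cache.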

Below is the script I ran on the Hyper-V host to set up the storage pool and storage tiers and to create the virtual disk:

    # RUN AS ADMINISTRATOR
    # https://nils.schimmelmann.us/post/153541254987/intel-smart-response-technology-vs-windows-10
    #Tested with one SSD and two HDD
    #
    #Pool that will suck in all drives
    $StoragePoolName = "ETH-Data-Pool"
    #Tiers in the storage pool
    $SSDTierName = "SSDTier"
    $HDDTierName = "HDDTier"
    #Virtual Disk Name made up of disks in both tiers
    $TieredDiskName = "ETH-Data-Disk"
    
    #Simple = striped.  Mirror only works if both can mirror AFAIK
    #https://docs.microsoft.com/en-us/previous-versions/windows/it-pro/windows-server-2012-R2-and-2012/dn387076(v=ws.11)
    $DriveTierResiliency = "Simple"
    
    #Change to suit - drive letter and the label name
    $TieredDriveLetter = "K"
    $TieredDriveLabel = "Data_Tiered"
    
    #Override the default sizing here - useful if have two different size SSDs or HDDs - set to smallest of pair
    #These must be Equal or smaller than the disk size available in that tier SSD and HDD
    #SSD:cache  -    HDD:data
    #set to null so copy/paste to command prompt doesn't have previous run values
    $SSDTierSize = $null
    $HDDTierSize = $null
    #Drives cannot always be fully allocated - probably broken for drives < 10GB
    $UsableSpace = 0.99
    
    #Set this if your HDDs show up as Unspecified (instead of HDD) in "Get-PhysicalDisk -CanPool $True"
    #    Set to $null to leave Unspecified drives untouched
    $UseUnspecifiedDriveIsHDD = "Yes"
    
    #List all disks that can be pooled and output in table format (format-table)
    Get-PhysicalDisk -CanPool $True | ft FriendlyName, OperationalStatus, Size, MediaType
    
    #Store all physical disks that can be pooled into a variable, $PhysicalDisks
    #    This assumes you want all raw / unpartitioned disks to end up in your pool - 
    #    Add a clause like the example with your drive name to stop that drive from being included
    #    Example:  | Where FriendlyName -NE "ATA LITEONIT LCS-256"
    if ($UseUnspecifiedDriveIsHDD -ne $null){
        $DisksToChange = (Get-PhysicalDisk -CanPool $True | where MediaType -eq Unspecified)
        Get-PhysicalDisk -CanPool $True | where MediaType -eq Unspecified | Set-PhysicalDisk -MediaType HDD
        # show the type changed
        Get-PhysicalDisk -CanPool $True | ft FriendlyName, OperationalStatus, Size, MediaType
    }
    $PhysicalDisks = (Get-PhysicalDisk -CanPool $True | Where MediaType -NE UnSpecified)
    if ($PhysicalDisks -eq $null){
        throw "Abort! No physical Disks available"
    }       
    
    #Create a new Storage Pool named $StoragePoolName using the disks in variable $PhysicalDisks
    $SubSysName = (Get-StorageSubSystem).FriendlyName
    New-StoragePool -PhysicalDisks $PhysicalDisks -StorageSubSystemFriendlyName $SubSysName -FriendlyName $StoragePoolName
    #View the disks in the Storage Pool just created
    Get-StoragePool -FriendlyName $StoragePoolName | Get-PhysicalDisk | Select FriendlyName, MediaType
    
    #Set the number of columns used for each resiliency - This setting assumes you have at least 2-SSD and 2-HDD
    # Get-StoragePool $StoragePoolName | Set-ResiliencySetting -Name Simple -NumberOfColumnsDefault 2
    # Get-StoragePool $StoragePoolName | Set-ResiliencySetting -Name Mirror -NumberOfColumnsDefault 1
    
    #Create two tiers in the Storage Pool created. One for SSD disks and one for HDD disks
    $SSDTier = New-StorageTier -StoragePoolFriendlyName $StoragePoolName -FriendlyName $SSDTierName -MediaType SSD
    $HDDTier = New-StorageTier -StoragePoolFriendlyName $StoragePoolName -FriendlyName $HDDTierName -MediaType HDD
    
    #Calculate tier sizes within this storage pool
    #Can override by setting sizes at top
    if ($SSDTierSize -eq $null){
        $SSDTierSize = (Get-StorageTierSupportedSize -FriendlyName $SSDTierName -ResiliencySettingName $DriveTierResiliency).TierSizeMax
        $SSDTierSize = [int64]($SSDTierSize * $UsableSpace)
    }
    if ($HDDTierSize -eq $null){
        $HDDTierSize = (Get-StorageTierSupportedSize -FriendlyName $HDDTierName -ResiliencySettingName $DriveTierResiliency).TierSizeMax 
        $HDDTierSize = [int64]($HDDTierSize * $UsableSpace)
    }
    Write-Output "TierSizes: ( $SSDTierSize , $HDDTierSize )"
    
    # you can end up with different number of columns in SSD - Ex: With Simple 1SSD and 2HDD could end up with SSD-1Col, HDD-2Col
    New-VirtualDisk -StoragePoolFriendlyName $StoragePoolName -FriendlyName $TieredDiskName -StorageTiers @($SSDTier, $HDDTier) -StorageTierSizes @($SSDTierSize, $HDDTierSize) -ResiliencySettingName $DriveTierResiliency -AutoWriteCacheSize -AutoNumberOfColumns
    
    # initialize the disk, format and mount as a single volume
    Write-Output "preparing volume"
    Get-VirtualDisk $TieredDiskName | Get-Disk | Initialize-Disk -PartitionStyle GPT
    # This will be Partition 2.  Storage pool metadata is in Partition 1
    Get-VirtualDisk $TieredDiskName | Get-Disk | New-Partition -DriveLetter $TieredDriveLetter -UseMaximumSize
    Initialize-Volume -DriveLetter $TieredDriveLetter -FileSystem NTFS -Confirm:$false -NewFileSystemLabel $TieredDriveLabel
    Get-Volume -DriveLetter $TieredDriveLetter
    
    Write-Output "Operation complete"
Hipster Cat

1 Answer

The issue is related to the virtual disk being formatted with NTFS. I've run some tests, and on WS2019 tiered Storage Spaces doesn't use the hot tier when the volume is NTFS. You should format it with ReFS or use Windows Server 2016.

Stuka
  • Interesting, is that a documented issue or by design? I also tried presenting the raw tiered virtual disk ("Physical hard disk") to the VM and formatting it inside the VM as ext4, but it had the same effect. – TheUniquePaulSmith May 16 '21 at 14:39
  • I haven't found any documentation related to this issue. I found it during my tests. – Stuka May 17 '21 at 09:24