It's for a 200GB drive that holds only virtual machine images. The only other files may be a couple of XML files. The images themselves range from 5GB to 40GB in size.

Is 2MB a good idea or a terrible one? Logic dictates that larger units mean fewer units to keep track of, but are there any performance hits I should be aware of?

I'm referring particularly to performance hits inside VMs, due to fragmentation when using a smaller AUS. I was hoping that a larger AUS would mean less seek time and slightly better performance.

I'm using Win10 and 2M is the max available AUS option.

Sorry if this question already exists. I couldn't find many that discuss units larger than 64K, and since mine is in megabytes, I really needed to confirm before proceeding.

Glitch
  • Possible duplicate of [What "allocation unit size" should I use for a drive with a single NTFS partition?](https://superuser.com/questions/31682/what-allocation-unit-size-should-i-use-for-a-drive-with-a-single-ntfs-partitio) – Ramhound Feb 20 '19 at 20:01
  • Also see [Downsides of a small allocation unit size](https://superuser.com/questions/465615/downsides-of-a-small-allocation-unit-size). But you've taken the opposite position to an extreme/excessive value! – sawdust Feb 20 '19 at 20:44
  • I imagine that the problem you'll see is that when data is written within a VM, even small writes will require larger blocks. Sure, it's all part of a larger file, but I assume that reads and writes are handled intelligently by the container, so while it might be good for large file copies and cloning at the host level, it'll increase writes for individual changes in each VM. Of course, this will all depend on the specific system in use and would require testing to validate. – shawn Feb 20 '19 at 22:51

2 Answers

Larger cluster sizes mean that the $Bitmap file, which NTFS uses to track cluster allocation, is smaller, and that fewer entries are needed to track data across the volume.

This translates to a saving in disk space, but since this isn't the mid-90s anymore, it's probably not worth it, at least not for a 200GB drive.
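
To put rough numbers on that: NTFS tracks used and free clusters with one bit per cluster, so the size of that allocation bitmap follows directly from the cluster count. A minimal sketch, assuming a 200 GiB volume (the function name is only for illustration):

```python
def allocation_bitmap_bytes(volume_bytes, cluster_bytes):
    """Bytes needed for a one-bit-per-cluster allocation bitmap."""
    clusters = volume_bytes // cluster_bytes
    return (clusters + 7) // 8  # round up to whole bytes

VOLUME = 200 * 1024**3  # assumed: 200 GiB volume

for label, cluster in [("4K", 4096), ("64K", 64 * 1024), ("2M", 2 * 1024**2)]:
    clusters = VOLUME // cluster
    size_kib = allocation_bitmap_bytes(VOLUME, cluster) / 1024
    print(f"{label:>3}: {clusters:>10,} clusters, bitmap ~{size_kib:,.1f} KiB")
```

Going from 4K to 2M clusters shrinks the bitmap from about 6.25 MiB to about 12.5 KiB, which is negligible either way on a drive this size.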

Regarding performance, it may be improved slightly or at least not affected if data tends to be accessed sequentially (like playing video/music) or in chunks around the size of the cluster.

This may not apply to VMs. If the VMDKs are not also using a 2M cluster size and aligned with the disk's clusters, I would think random access might be hurt by large cluster sizes, as you're asking NTFS to load 2M of data when just one 4096- or 512-byte block in that cluster may be needed.
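
As a rough upper bound on that concern, here is the ratio of data touched to data wanted if every small request really did pull in a whole cluster. That premise is my guess from above, not confirmed NTFS behavior, and the function name is made up for illustration:

```python
def worst_case_amplification(cluster_bytes, request_bytes):
    """Ratio of bytes touched to bytes requested if a whole cluster is read.

    Assumption: each request falls inside a single cluster and forces the
    full cluster to be transferred. Real file systems may not behave this way.
    """
    if request_bytes <= 0 or cluster_bytes <= 0:
        raise ValueError("sizes must be positive")
    return max(cluster_bytes, request_bytes) / request_bytes

TWO_MIB = 2 * 1024**2
print(worst_case_amplification(TWO_MIB, 4096))  # 512.0
print(worst_case_amplification(TWO_MIB, 512))   # 4096.0
print(worst_case_amplification(4096, 4096))     # 1.0
```

Whether anything close to this happens in practice depends on the hypervisor and the host's I/O path, so it would need to be measured.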

You're in a similar situation with newer "4K" format hard drives: they internally read/write in 4096-byte chunks, but still allow the OS to request 512-byte chunks. If your data is not 4096-byte aligned, many requests will cause double reads. OSes align the data now, so it's probably not something you need to think about, but I am saying your situation above with VMs and changed cluster sizes could be creating a similar problem.
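
The double-read effect is simple arithmetic: count how many physical sectors a request overlaps. A minimal sketch, assuming 4096-byte physical sectors (the function name is only for illustration):

```python
def physical_sectors_touched(offset, length, sector_bytes=4096):
    """Count physical sectors overlapped by `length` bytes starting at byte `offset`."""
    if length <= 0:
        return 0
    first = offset // sector_bytes
    last = (offset + length - 1) // sector_bytes
    return last - first + 1

# A 4096-byte request on a 4096-byte boundary maps to one physical sector.
print(physical_sectors_touched(0, 4096))         # 1
# The same request starting at byte 63 * 512 (the partition start used by
# older partitioning tools) straddles two physical sectors.
print(physical_sectors_touched(63 * 512, 4096))  # 2
```

The same reasoning applies one level up: a guest block that is not aligned to the host's cluster boundaries can straddle two host clusters.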

LawrenceC
  • I would be shocked if it made any difference inside a VM, since the real performance hits would primarily exist within the host OS itself and depend on the size of the MFT on the actual physical storage device. – Ramhound Feb 20 '19 at 20:41

Running benchmarks with different allocation unit sizes showed me empirically that 4K gives the best performance. This is because the actual sector size on modern HDDs is 4K. I ran the test on a 5TB drive using files of at least 1GB.
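
For anyone who wants to repeat this kind of comparison, here is a minimal sketch of one way to time random reads against a large file. It is not the exact method behind the numbers above, it assumes Python is available, and the OS cache will skew results unless the file is much larger than RAM or the cache is cleared between runs:

```python
import os
import random
import time

def time_random_reads(path, io_size=4096, count=1000, seed=0):
    """Return the seconds taken for `count` random reads of `io_size` bytes."""
    blocks = os.path.getsize(path) // io_size
    if blocks == 0:
        raise ValueError("file is smaller than one I/O")
    rng = random.Random(seed)
    # Offsets are multiples of io_size, so each read is one whole block.
    offsets = [rng.randrange(blocks) * io_size for _ in range(count)]
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:  # unbuffered at the Python level
        for off in offsets:
            f.seek(off)
            f.read(io_size)
    return time.perf_counter() - start
```

Copy the same large file to volumes formatted with different allocation unit sizes, call the function on each copy with the same `seed`, and compare the timings.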

In general, if the cluster size exceeds the size of the IO, certain workflows can trigger unintended IOs. Even if no extra IO occurs, you don't benefit from any speedup, since in the end it's translated to 4K reads or writes on the actual disk.

You might get less fragmentation over time, but I'm assuming your drive is an SSD given that it's only 200 GB, so fragmentation isn't really an issue.

  • I suppose 4k to 4k is much more preferable performance-wise. 200GB was the partition-size on the hard disk. But yes I am using an SSD now. – Glitch Mar 07 '21 at 14:23