8

I was reading Hadoop: The Definitive Guide and the following paragraph came up.

A disk has a block size, which is the minimum amount of data that it can read or write. Filesystems for a single disk build on this by dealing with data in blocks, which are an integral multiple of the disk block size. Filesystem blocks are typically a few kilobytes in size, whereas disk blocks are normally 512 bytes.

My understanding is that the disk block size is limited by hardware (it is the amount of data that can be read from or written to the disk in one operation). The operating system creates an abstraction called a file system, which has its own block size that is a multiple of the disk block size. Like the disk, the operating system reads and writes data in units of the file system block size. For a single read or write of a filesystem block, multiple disk block operations will be performed. Is my understanding correct?
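To make the question concrete, here is a minimal sketch of that mapping. It assumes 512-byte disk blocks (from the quoted paragraph) and a 4096-byte filesystem block, which is only an example value. The function name `disk_blocks_for_fs_block` is made up for illustration, and the sketch assumes filesystem blocks are laid out contiguously from disk block 0, which a real filesystem on a partition would not do.

```python
DISK_BLOCK_SIZE = 512   # bytes, as in the quoted paragraph
FS_BLOCK_SIZE = 4096    # bytes, assumed example value ("a few kilobytes")


def disk_blocks_for_fs_block(fs_block_number,
                             fs_block_size=FS_BLOCK_SIZE,
                             disk_block_size=DISK_BLOCK_SIZE):
    """Return the range of disk block numbers backing one filesystem block.

    Hypothetical helper: assumes a simple contiguous layout starting at
    disk block 0, with no partition offset.
    """
    if fs_block_size % disk_block_size != 0:
        raise ValueError("filesystem block must be a multiple of the disk block")
    per_fs_block = fs_block_size // disk_block_size
    first = fs_block_number * per_fs_block
    return range(first, first + per_fs_block)


blocks = disk_blocks_for_fs_block(3)
print(len(blocks), blocks[0], blocks[-1])  # 8 24 31
```

Under these assumptions, touching filesystem block 3 means touching 8 disk blocks, numbered 24 through 31.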

OneCricketeer
Farsan Rashid

2 Answers

1

This depends on the hardware.

An SD device will typically rewrite a comparatively large amount of data even if you only want to change one bit, but it can typically read smaller amounts of data in a single read. An SD device may also physically move the data during a write for "wear leveling", so that it does not write to the same place repeatedly and wear it out.

I don't think you can presume much about how much will be physically read or written based on the block size of an HD. The device has a controller that tries to optimize, using code that is not publicly available and that takes into consideration things like rotation speed, read head position, chip layout, known bad blocks, etc.

HD blocks are really just the smallest referenceable data block the device exposes. File system blocks are just the smallest referenceable block the FS code exposes.
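If you want to see what block size the FS code exposes on your own machine, something like the following should work. It is a rough sketch that assumes a Unix-like system where Python's `os.statvfs` is available. The helper name `filesystem_block_size` is made up, and the value it returns says nothing about what the device physically reads or writes.

```python
import os


def filesystem_block_size(path="."):
    """Return the block size the filesystem reports for `path`.

    Returns None where os.statvfs is unavailable (e.g. on Windows).
    """
    if not hasattr(os, "statvfs"):
        return None
    # f_bsize is the filesystem block size field of the statvfs result.
    return os.statvfs(path).f_bsize


print(filesystem_block_size("."))
```

The printed number varies by system and filesystem, so treat it as something to inspect rather than rely on.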

In times gone by there might have been a direct relationship, but I would not presume that now.

teknopaul
1

Your understanding is correct. But be aware that in different contexts, "block" may refer to different things.

In general, for a magnetic disk, a sector is the smallest unit of information that can be read or written. Sector sizes are typically 512 bytes. For an SSD, the smallest unit is often called a page, whose size is commonly 4096 bytes. Here, both sector and page are physical units, similar to the disk block in your context.

However, in some contexts "disk block" may refer to the logical unit of storage allocation and retrieval used by file systems or database systems, and block sizes today typically range from 4 to 16 kilobytes. In that sense, disk block is identical to the filesystem block in your context.
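As a rough illustration of a block being the unit of allocation, the arithmetic looks like this. The sketch assumes a 4096-byte block and a made-up helper `blocks_allocated`. Real file systems may deviate from it, for example by storing tiny files inline or leaving holes in sparse files.

```python
def blocks_allocated(file_size, block_size=4096):
    """Return (blocks, slack_bytes) for a file under whole-block allocation.

    Hypothetical helper: assumes every file occupies a whole number of
    blocks, so the last block may be only partly used.
    """
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size > 0")
    blocks = (file_size + block_size - 1) // block_size  # ceiling division
    slack = blocks * block_size - file_size              # unused bytes in last block
    return blocks, slack


print(blocks_allocated(10000))  # (3, 2288)
print(blocks_allocated(1))      # (1, 4095)
```

So under this assumption a 10,000-byte file takes three 4 KB blocks, and even a 1-byte file takes a full block.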

chenzhongpu