
I usually work on software and am weak on storage. I'm reading about block vs file vs object storage, and everything I find treats storage as an entirely isolated subject rather than including what the operating system would see, which is what I'm really interested in.

What confuses me is the idea that block and file storage are completely separate. For example, in LVM, you have to give it block devices to manage and then on top of that you have the respective VGs and LVs with a file system on top of that. Are these block devices not considered block storage?

When a storage person says file storage, doesn't that file storage at some point necessarily have block storage underneath it? Conversely, would block storage not require something on top of it to be useful? Either a file system or some sort of object/proprietary/special sauce storage?

For example, if I understand correctly, a SAN has some sort of controller which then presents LUNs (typically) which then get mounted by servers (seen as another hard drive in the OS) which then put file systems on them. Or is that incorrect?

It just seems odd because most of the reading material presents these as seemingly mutually exclusive options and I'm not sure if I am misunderstanding something fundamental.

Grant Curell

2 Answers


Simplifying:

  • block storage advertises blocks (LBA) and read/write block commands. An appliance exporting block storage lets another machine access it much like a local disk and create a filesystem on top of it. The client machine accesses raw blocks. For example, it can run something like mkfs.xfs /dev/sdX, where sdX is really an iSCSI device;

  • file storage exports directories (NFS) or shares (SMB). A client machine sees the remote storage as a network-attached directory, putting files into it and retrieving files from it. Reads and writes are issued against files, not raw blocks. A client machine can access files, but cannot "format" the directory/share itself;

  • object storage exports "objects", binary blobs of data which can be read/written with a simplified GET/PUT API. Think of a webserver accepting GET/PUT requests for objects, which are retrieved from or stored on underlying traditional (file or block based) devices. This is basically what Amazon S3 does: it exports a bucket you can read objects from or write objects to.
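To make the three interfaces above concrete, here is a minimal Python sketch of what a client sees in each case. Everything in it is an illustrative assumption: `BlockDevice` and `ObjectStore` are hypothetical toy classes (a regular file and a dict stand in for an iSCSI LUN and an S3 bucket), and a local directory stands in for an NFS/SMB share.

```python
import os
import tempfile

BLOCK_SIZE = 512  # assumed sector size for this sketch


class BlockDevice:
    """Block interface: fixed-size blocks addressed by LBA, no notion of files."""

    def __init__(self, path, num_blocks):
        self.path = path
        # A zero-filled regular file stands in for the raw device.
        with open(path, "wb") as f:
            f.truncate(num_blocks * BLOCK_SIZE)

    def write_block(self, lba, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("writes must be exactly one block")
        with open(self.path, "r+b") as f:
            f.seek(lba * BLOCK_SIZE)
            f.write(data)

    def read_block(self, lba):
        with open(self.path, "rb") as f:
            f.seek(lba * BLOCK_SIZE)
            return f.read(BLOCK_SIZE)


class ObjectStore:
    """Object interface: whole blobs stored and fetched by key."""

    def __init__(self):
        self._objects = {}

    def put(self, key, blob):
        self._objects[key] = bytes(blob)

    def get(self, key):
        return self._objects[key]


workdir = tempfile.mkdtemp()

# Block: the client addresses raw blocks by number.
dev = BlockDevice(os.path.join(workdir, "disk.img"), num_blocks=8)
dev.write_block(3, b"A" * BLOCK_SIZE)

# File: the client addresses named files inside a directory it cannot "format".
share = os.path.join(workdir, "share")
os.mkdir(share)
with open(os.path.join(share, "notes.txt"), "w") as f:
    f.write("hello")

# Object: the client PUTs and GETs whole blobs by key.
store = ObjectStore()
store.put("bucket/notes", b"hello")
```

The point of the sketch is only the shape of each API: block numbers, file paths, and object keys.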

What makes the distinction more complicated is that you can layer each storage type on top of another, or emulate one via another. For example, using file storage one can create a filesystem inside a file and mount the resulting filesystem on the client. Or one can use a GET/PUT interface to simulate block or file storage.

That said, the underlying storage is generally block based (even for the few HDDs supporting object storage, GET/PUT requests are internally translated to LBA or some other block addressing scheme).
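As a rough illustration of emulating one storage type via another, the sketch below builds a block-style interface on top of a GET/PUT object interface by storing each block as its own object. The class names, the key format, and the 4096-byte block size are all assumptions made for this example; it is not how any particular product does it.

```python
BLOCK_SIZE = 4096  # assumed block size for this sketch


class DictObjectStore:
    """Hypothetical stand-in for an object store exposing only put/get."""

    def __init__(self):
        self.objects = {}

    def put(self, key, blob):
        self.objects[key] = bytes(blob)

    def get(self, key):
        # Returns None when the key does not exist.
        return self.objects.get(key)


class ObjectBackedBlockDevice:
    """Block interface emulated by storing one object per block."""

    def __init__(self, store, name):
        self.store = store
        self.name = name

    def _key(self, lba):
        # Hypothetical key layout: "<device name>/block-<zero-padded LBA>"
        return f"{self.name}/block-{lba:08d}"

    def write_block(self, lba, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("writes must be exactly one block")
        self.store.put(self._key(lba), data)

    def read_block(self, lba):
        blob = self.store.get(self._key(lba))
        # Blocks that were never written read back as zeros, like a fresh disk.
        return blob if blob is not None else bytes(BLOCK_SIZE)


store = DictObjectStore()
disk = ObjectBackedBlockDevice(store, "vol0")
disk.write_block(7, b"x" * BLOCK_SIZE)
```

A filesystem could then be placed on `disk` exactly as it would be on any other block device, which is why the categories describe interfaces rather than mutually exclusive hardware.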

shodanshok

I am not a storage professional, but I will try to answer your question.

Block storage is storage that you read and write in blocks, hence the name. Traditionally a block was a 512-byte hard drive sector; modern hard drives can use 4096-byte sectors. Filesystem block sizes can be larger, for example up to 64 KiB for XFS.
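The arithmetic behind block addressing is simple, and a short Python sketch may help. The function names are made up for illustration; the sizes are the ones mentioned above.

```python
def lba_to_byte_offset(lba, sector_size):
    """Byte offset on the device where the given logical block starts."""
    return lba * sector_size


def sectors_per_fs_block(fs_block_size, sector_size):
    """How many device sectors one filesystem block spans."""
    if fs_block_size % sector_size != 0:
        raise ValueError("filesystem block must be a multiple of the sector size")
    return fs_block_size // sector_size


# LBA 2048 on a 512-byte-sector disk starts exactly 1 MiB into the device.
offset = lba_to_byte_offset(2048, 512)

# A 4096-byte filesystem block covers eight 512-byte sectors.
count = sectors_per_fs_block(4096, 512)
```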

Block storage can be divided into partitions. That happens at the operating system layer, where you carve your block storage up (for example, one hard drive into several logical pieces): one for boot, one for games, one for data, one for work. Partitioning is optional, though; with the LVM you mentioned, you can add a whole block device as a physical volume without partitioning it.

File storage is the notion of a filesystem on that block device. It is a convenient interface for arranging, saving, and reading data on the device. Some blocks sit outside any filesystem, such as the MBR, which the BIOS reads to find out how to boot the operating system. Beyond that, a filesystem is just a logical way for the operating system's kernel to arrange data on your block device while giving users a nice file-based hierarchy on that storage (directories in Unix, folders in Windows).

You mentioned LVM: that is just a tool that lets you manage your storage more efficiently (add, remove, mirror, and snapshot logical disks). It is an abstraction layer in a particular operating system that makes the underlying hardware storage devices simpler for a user to work with, similar to ZFS volumes, Solaris Volume Manager, or Linux software RAID (mdraid). All of these were created for convenience.
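To show what such an abstraction layer does conceptually, here is a toy Python model of a logical volume as a list of extents borrowed from physical volumes. This is a simplified sketch, not LVM's actual implementation: the `LogicalVolume` class and the device names are hypothetical, the on-disk metadata area at the start of each physical volume is ignored, and the 4 MiB extent size is LVM's default.

```python
EXTENT_SIZE = 4 * 1024 * 1024  # 4 MiB, LVM's default extent size


class LogicalVolume:
    """Toy model: a logical volume is an ordered list of physical extents."""

    def __init__(self, extents):
        # Each entry is (physical volume name, physical extent index).
        self.extents = list(extents)

    @property
    def size(self):
        return len(self.extents) * EXTENT_SIZE

    def locate(self, offset):
        """Translate a byte offset in the LV to (PV name, byte offset on that PV)."""
        if not 0 <= offset < self.size:
            raise ValueError("offset outside the logical volume")
        logical_extent, within = divmod(offset, EXTENT_SIZE)
        pv_name, physical_extent = self.extents[logical_extent]
        return pv_name, physical_extent * EXTENT_SIZE + within


# A 12 MiB logical volume spread over two hypothetical block devices.
lv = LogicalVolume([("/dev/sdb", 10), ("/dev/sdb", 11), ("/dev/sdc", 0)])
where = lv.locate(2 * EXTENT_SIZE + 5)
```

The filesystem on top only ever sees one contiguous block device of `lv.size` bytes; the mapping to the real disks stays hidden underneath.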

Object storage is similar to file storage, but the data is stored differently: it comes with metadata indexes and other mechanisms for distributing and storing data across different hardware (examples: git, AWS S3).
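A minimal Python sketch of the "blob plus metadata index" idea follows. It is a toy in-memory model with hypothetical names; keying objects by a SHA-256 hash of their content is an assumption chosen for this example (git identifies objects by content hash, whereas S3 keys are chosen by the user).

```python
import hashlib


class ContentAddressedStore:
    """Toy object store: blobs keyed by content hash, with a metadata index."""

    def __init__(self):
        self._blobs = {}
        self._metadata = {}

    def put(self, blob, **metadata):
        # The key is derived from the content, so identical blobs share one key.
        key = hashlib.sha256(blob).hexdigest()
        self._blobs[key] = bytes(blob)
        self._metadata[key] = {"size": len(blob), **metadata}
        return key

    def get(self, key):
        return self._blobs[key]

    def head(self, key):
        # Metadata can be inspected without fetching the blob itself.
        return dict(self._metadata[key])


store = ContentAddressedStore()
key = store.put(b"hello world", content_type="text/plain")
info = store.head(key)
```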

I hope this helps.

Danila Ladner