
I want to know how a file system writes to and reads from a storage device.

I think this is how it works:

A file system doesn't access the storage device directly, but rather the storage device is presented (by the device driver of the storage device) to the file system as a (very large) byte array.

For example, if the file system wants to access a hard disk, it will simply access the byte array representing the hard disk.

This way a file system can work with any type of storage device (traditional hard disk, SSD, USB flash drive, etc.), and only the device driver for the storage device is changed.

This image shows what I have just explained:

[image: diagram of the file system accessing the storage device through its device driver]

Am I correct in my understanding?

Tulains Córdova
joseph_m

2 Answers


On Linux (and on 1980s-era Unixes), a storage device (quite often a disk partition on some hard disk or SSD) is a block device (see this), so it is a [sub-]sequence of blocks, the block being the basic unit of physical I/O. The physical block size depends on the hardware (old IDE disks had a block size of 512 bytes, new large SATA disks have a block size of 4 KiB; read the Advanced Format wikipage), and when you make a file system (with e.g. mkfs, see mke2fs(8)) you can specify a logical block size which is a multiple (often a small power of two, including 1) of that physical block size. Read also about logical block addressing.
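To make the block addressing concrete, here is a minimal sketch in Python. It is not how any real kernel does it: the "device" is just an in-memory `io.BytesIO`, and the names `read_logical_block`, `PHYSICAL_BLOCK_SIZE` and `LOGICAL_BLOCK_SIZE`, as well as the chosen sizes, are assumptions made up for the illustration.

```python
import io

PHYSICAL_BLOCK_SIZE = 512                     # assumed hardware block size, in bytes
LOGICAL_BLOCK_SIZE = 4 * PHYSICAL_BLOCK_SIZE  # assumed file system block size (a multiple)


def read_logical_block(device, block_number, block_size=LOGICAL_BLOCK_SIZE):
    """Return one whole block; the byte offset is block_number * block_size."""
    device.seek(block_number * block_size)
    data = device.read(block_size)
    if len(data) != block_size:
        raise ValueError(f"block {block_number} is past the end of the device")
    return data


# In-memory stand-in for a tiny device: 8 logical blocks,
# each one filled with its own block number as a byte value.
fake_device = io.BytesIO(
    b"".join(bytes([i]) * LOGICAL_BLOCK_SIZE for i in range(8))
)

block = read_logical_block(fake_device, 3)
```

The file system only ever names a block number; turning that into an offset (and, further down, into something the hardware understands) is left to the layers below.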

In the past (think of Sun3 workstations of the late 1980s and early 1990s) a disk was addressed by cylinder, head, and sector (read the CHS wikipage), with each sector containing a block. Today, this geometry still remains, but it is an artificial artefact provided by the hard disk controller (the circuit on the disk itself). In some OSes the block device driver rescheduled and reordered I/O requests to minimize disk head movement and rotational latency.
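The relation between the old cylinder/head/sector addressing and a linear block number can be sketched with the conventional CHS-to-LBA formula. The function names and the 16-head, 63-sector geometry below are assumptions chosen for illustration, not values taken from any particular disk.

```python
def chs_to_lba(cylinder, head, sector, heads_per_cylinder, sectors_per_track):
    """Map a (cylinder, head, sector) triple to a linear block address.

    Sectors are conventionally numbered from 1; cylinders and heads from 0.
    """
    if cylinder < 0:
        raise ValueError("cylinder must be non-negative")
    if not 0 <= head < heads_per_cylinder:
        raise ValueError("head out of range")
    if not 1 <= sector <= sectors_per_track:
        raise ValueError("sector out of range")
    return (cylinder * heads_per_cylinder + head) * sectors_per_track + (sector - 1)


def lba_to_chs(lba, heads_per_cylinder, sectors_per_track):
    """Inverse mapping: linear block address back to (cylinder, head, sector)."""
    cylinder, rest = divmod(lba, heads_per_cylinder * sectors_per_track)
    head, sector_index = divmod(rest, sectors_per_track)
    return cylinder, head, sector_index + 1


# Assumed (fictional) geometry: 16 heads per cylinder, 63 sectors per track.
lba = chs_to_lba(2, 3, 4, heads_per_cylinder=16, sectors_per_track=63)
chs = lba_to_chs(lba, heads_per_cylinder=16, sectors_per_track=63)
```

With logical block addressing the OS only deals with the linear number, and whatever geometry the controller reports is just a translation like this one.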

This way a file system can work with any type of storage device (traditional hard disk, SSD, USB flash drive, etc.), and only the device driver for the storage device is changed.

Yes, but the devil is in the details (e.g. read about TRIM and write amplification, both specific to SSDs). The details matter, so the actual implementation is less simple than your figure suggests. Read more about file systems (and think of clustered and remote file systems, including SMB and NFS; read also about the Logical Volume Manager).

Read Operating Systems: Three Easy Pieces (and its persistence part).

Notice that separate block devices are gone in FreeBSD (it actually provides a common abstraction for character and block devices). I suspect that even on Windows the OS knows about partitions, block size, etc. (but you should check).

Basile Starynkevitch
  • Windows also handles partitions on disks, and each partition may use a different file system (perhaps FAT-32 on one and NTFS on another). The file system may access the partitions at the block level, or the blocks can be combined into "clusters". On any given partition, the cluster size would be fixed. – Simon B Jun 23 '17 at 12:10
  • I sort of knew that, because MS-DOS mostly did. – Basile Starynkevitch Jun 23 '17 at 12:26
  • Actually, FreeBSD still uses what you call "block devices". There is no way to access a hard disk other than as a large array of blocks. They are just called character/raw devices on FreeBSD. They still access the device as a large array of blocks. The only thing missing in FreeBSD is the caching at the device level, which isn't needed as the filesystem already provides caching. – juhist Jun 29 '17 at 19:21

Everything Basile Starynkevitch says is correct. I will add a bit more. Indeed disk drives were "block" drives, but block devices (and many other devices) were presented in two forms: "raw" and "cooked". Raw devices could be addressed only in chunks that were multiples of their native storage chunk size. So a raw disk device could only be read or written one or many blocks at a time, not just a byte or two. Cooked devices added a layer that would allow such smaller operations, as well as various other features.

File systems worked with raw devices, and thus saw them not as a big array of bytes, but rather a big array of blocks, as B.S. explained.
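As a rough illustration of the raw/cooked distinction, the sketch below models a raw device that only transfers whole blocks, plus a "cooked" helper that serves an arbitrary byte range by reading the covering blocks and slicing. `RawBlockDevice` and `read_bytes` are hypothetical names invented for this example; no real driver interface is implied.

```python
class RawBlockDevice:
    """Hypothetical raw device: it can only transfer whole blocks."""

    def __init__(self, data, block_size=512):
        if len(data) % block_size != 0:
            raise ValueError("device size must be a multiple of the block size")
        self._data = bytes(data)
        self.block_size = block_size

    def read_blocks(self, first_block, count):
        """Return `count` consecutive blocks starting at `first_block`."""
        start = first_block * self.block_size
        end = start + count * self.block_size
        if first_block < 0 or count < 0 or end > len(self._data):
            raise ValueError("request outside the device")
        return self._data[start:end]


def read_bytes(device, offset, length):
    """'Cooked' read: fetch the blocks covering the range, then slice."""
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")
    if length == 0:
        return b""
    first_block = offset // device.block_size
    last_block = (offset + length - 1) // device.block_size
    blocks = device.read_blocks(first_block, last_block - first_block + 1)
    skip = offset - first_block * device.block_size
    return blocks[skip:skip + length]


# A 2048-byte stand-in device (4 blocks of 512 bytes).
device = RawBlockDevice(bytes(range(256)) * 8, block_size=512)

# Four bytes straddling the boundary between block 0 and block 1:
# the raw device has to hand over both blocks to satisfy this.
chunk = read_bytes(device, 510, 4)
```

Even a two-byte request costs at least one full block transfer on the raw device, which is why a file system thinks in blocks rather than bytes.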

Topher