The promise of network block storage is wonderful: Take a familiar abstraction (the disk), sprinkle on some magic cloud pixie dust so that it’s completely reliable and available over the same cheap network you’re using for app traffic, map it to any instance in a datacenter regardless of network topology, make it so cheap it’s practically free, and voila, we can have our cake and eat it too! It’s the holy grail that many a storage vendor, most of them with decades of experience in storage systems and engineering teams thousands strong, has chased for a long, long time. The disk that never dies. The disk that’s not a disk.
The reality, however, is that the disk has never been a great abstraction, and the long history of crappy implementations has meant that many behavioral workarounds have found their way far up the stack. The best-case scenario is that a disk device breaks and it’s immediately catastrophic, taking your entire operating system with it. Failure modes go downhill from there. Networks have their own set of special failure modes too. When you combine the two, and that disk you depend on is sitting on the far side of the network from where your operating system is, you get a combinatorial explosion of complexity.
Fascinating piece on the perils of disk abstraction. Raises a very good question: Why do we worry about disks at all in the cloud? I wonder how many folks would just be tossing data into the cloud without the comfy metaphor of disk and machine to lean on?