I was reading something along these lines yesterday about OS X and its file system compression. Basically the answer revolves around what you want to compress: in this example he's talking about the "FAT"-style data, i.e. file structures, properties, metadata and so on, which, when stored together, can be compressed to save space and read into the CPU faster than seeking the head all over the disk to find each file's data...
Anyway, it's worth a read if you're thinking about this sort of thing :-p
But compression isn't just about saving disk space. It's also a classic example of trading CPU cycles for decreased I/O latency and bandwidth. Over the past few decades, CPU performance has gotten better (and computing resources more plentiful—more on that later) at a much faster rate than disk performance has increased. Modern hard disk seek times and rotational delays are still measured in milliseconds. In one millisecond, a 2 GHz CPU goes through two million cycles. And then, of course, there's still the actual data transfer time to consider.
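To put a rough number on that gap, here is a back-of-envelope sketch; the seek and rotational figures are assumed typical hard disk values, not measurements:

```python
# Back-of-envelope: CPU cycles spent waiting on one random disk access.
# The 2 GHz clock matches the text above; the seek and rotational
# figures are assumed typical hard disk values, not measurements.
cpu_hz = 2e9           # 2 GHz CPU
seek_ms = 8.5          # assumed average seek time
rotational_ms = 4.2    # assumed average rotational delay (7200 rpm / 2)

wait_s = (seek_ms + rotational_ms) / 1000
print(f"~{cpu_hz * wait_s / 1e6:.0f} million cycles per random access")
# -> roughly 25 million cycles pass before the first byte even arrives
```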
Granted, several levels of caching throughout the OS and hardware work mightily to hide these delays. But those bits have to come off the disk at some point to fill those caches. Compression means that fewer bits have to be transferred. Given the almost comical glut of CPU resources on a modern multi-core Mac under normal use, the total time needed to transfer a compressed payload from the disk and use the CPU to decompress its contents into memory will still usually be far less than the time it'd take to transfer the data in uncompressed form.
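As a concrete sketch of that trade-off, the snippet below models the disk read at an assumed 100 MB/s and measures real decompression time; zlib stands in for the on-disk codec here (HFS+ compression is itself zlib-based):

```python
import time
import zlib

# Compare a modeled uncompressed read with a compressed read plus
# decompression. The 100 MB/s disk bandwidth is an assumption; zlib
# stands in for the on-disk codec.
DISK_BYTES_PER_SEC = 100 * 1024 * 1024

payload = b"fairly repetitive data, like text or property lists. " * 200_000
compressed = zlib.compress(payload)

t_plain = len(payload) / DISK_BYTES_PER_SEC

start = time.perf_counter()
zlib.decompress(compressed)
t_comp = len(compressed) / DISK_BYTES_PER_SEC + (time.perf_counter() - start)

print(f"uncompressed read: {t_plain * 1000:6.1f} ms")
print(f"compressed read  : {t_comp * 1000:6.1f} ms "
      f"(payload shrank to {len(compressed) / len(payload):.1%})")
```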
That explains the potential performance benefits of transferring less data, but the use of extended attributes to store file contents can actually make things faster, as well. It all has to do with data locality.
If there's one thing that slows down a hard disk more than transferring a large amount of data, it's moving its heads from one part of the disk to another. Every move means time for the head to start moving, then stop, then ensure that it's correctly positioned over the desired location, then wait for the spinning disk to put the desired bits beneath it. These are all real, physical, moving parts, and it's amazing that they do their dance as quickly and efficiently as they do, but physics has its limits. These motions are the real performance killers for rotational storage like hard disks.
The HFS+ volume format stores all its information about files—metadata—in two primary locations on disk: the Catalog File, which stores file dates, permissions, ownership, and a host of other things, and the Attributes File, which stores "named forks."
Extended attributes in HFS+ are implemented as named forks in the Attributes File. But unlike resource forks, which can be very large (up to the maximum file size supported by the file system), extended attributes in HFS+ are stored "inline" in the Attributes File. In practice, this means a limit of about 128 bytes per attribute. But it also means that the disk head doesn't need to take a trip to another part of the disk to get the actual data.
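Extended attributes are visible from user space: macOS ships an `xattr` command-line tool, and a minimal sketch for inspecting a file with it might look like the following (the path is a placeholder). Note that the compression fork itself, com.apple.decmpfs, is filtered out of normal listings by the kernel, so you won't see it this way:

```python
import subprocess
import sys

# Minimal sketch: list a file's extended attributes using the macOS
# `xattr` tool; -l prints each attribute's name and a dump of its value.
# The default path below is a placeholder for the file to inspect.
path = sys.argv[1] if len(sys.argv) > 1 else "/path/to/some/file"
result = subprocess.run(["xattr", "-l", path], capture_output=True, text=True)
print(result.stdout or "(no extended attributes)")
```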
As you can imagine, the disk blocks that make up the Catalog and Attributes files are frequently accessed, and therefore more likely than most to be in a cache somewhere. All of this conspires to make the complete storage of a file, including both its metadata and its data, within the B-tree-structured Catalog and Attributes files an overall performance win. Even an eight-byte payload that balloons to 25 bytes is not a concern, as long as it's still less than the allocation block size for normal data storage, and as long as it all fits within a B-tree node in the Attributes File that the OS has to read in its entirety anyway.
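That ballooning effect on tiny payloads is easy to reproduce; using zlib as a stand-in codec (the real on-disk format adds its own header, so exact sizes differ), small inputs grow while larger repetitive ones shrink:

```python
import zlib

# Tiny payloads grow under compression because the codec's fixed header
# and checksum overhead dominates; larger repetitive payloads shrink.
# zlib is a stand-in here, so exact byte counts differ from decmpfs.
for payload in (b"12345678", b"x" * 100, b"x" * 4096):
    out = zlib.compress(payload)
    print(f"{len(payload):5d} bytes -> {len(out):3d} bytes compressed")
```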
There are other significant contributions to Snow Leopard's reduced disk footprint (e.g., the removal of unnecessary localizations and "designable.nib" files) but HFS+ compression is by far the most technically interesting.
From:
link