There seem to be two general approaches (at least in the MooseFS and XtreemFS worlds):
The drive-at-a-time approach
For MooseFS, the best way is to use each HDD as a single XFS partition connected to the chunkserver. We don't recommend using any RAID or LVM configuration.
Why?
The first reason is HDD errors. If one of your hard drives starts slowing down, it is hard to find which one it is under LVM. With MFS you can find it very quickly, even from the MFS master web interface. Second, adding or removing a hard drive in MooseFS is easier than adding or removing one in an LVM group. Just add the HDD to the chunkserver, format it as XFS, reload the chunkserver, and you have extra space on your instance. Third, MooseFS has much better sorting algorithms for placing chunks across many hard disks, so all hard drives get balanced traffic; LVM doesn't do this.
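To make the third point concrete, here is a toy sketch of what balancing by per-disk visibility can look like. It is not MooseFS's actual placement algorithm, which the quote does not describe; the function name `place_chunks`, the mount points, and the "lowest fill ratio first" rule are all assumptions chosen for illustration.

```python
def place_chunks(disks, chunk_size, count):
    """Assign `count` chunks of `chunk_size` bytes to disks, one at a time.

    `disks` maps a mount point to a (used_bytes, total_bytes) tuple.
    Each chunk goes to the disk with the lowest fill ratio at that moment.
    This is an illustrative rule, not the MooseFS implementation.
    """
    # Work on a mutable copy so the caller's data is left untouched.
    usage = {mount: [used, total] for mount, (used, total) in disks.items()}
    placement = []
    for _ in range(count):
        # Pick the least-full disk, measured as used / total.
        target = min(usage, key=lambda m: usage[m][0] / usage[m][1])
        usage[target][0] += chunk_size
        placement.append(target)
    return placement


if __name__ == "__main__":
    # Hypothetical mount points: one half-full disk and one freshly added disk.
    disks = {"/mnt/hdd1": (50, 100), "/mnt/hdd2": (0, 100)}
    print(place_chunks(disks, chunk_size=10, count=5))
```

A placement layer that sees every disk separately can steer new chunks toward a freshly added, empty drive like this. Behind a single LVM volume the file system sees only one block device, so a chunk-level decision of this kind is not available.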
The volume-at-a-time approach
The XtreemFS OSD (and also the other services) rely on a local file system for data and metadata storage. Thus, on a machine with multiple disks, you have two possibilities. First, you can combine multiple disks on one machine into a single file system, e.g. by using RAID, LVM, or a ZFS pool. Second, each disk (including SSDs, etc.) holds its own local file system and is exported by its own XtreemFS OSD service.
Both possibilities have their advantages and disadvantages, and I cannot make a general recommendation. The first option brings flexibility in terms of the RAID level used or possibly attached SSD caches. Furthermore, it might be easier to maintain and monitor one OSD process per machine than one process per disk.
Using one OSD server per local disk might result in better performance. When running a RAID of fast SSDs, the XtreemFS OSD might become a bottleneck. You could also share the load of multiple OSDs on one machine over multiple network interfaces. For replicated files, you have to take care of replica placement and avoid placing multiple replicas of one file on OSDs running on the same hardware. You may have to write a custom OSD selection policy. XtreemFS offers an interface for this.
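The replica-placement concern can be sketched in a few lines. This is not the XtreemFS policy interface, which the quote only mentions without describing; it is a standalone illustration of the rule "at most one replica per physical host". The `Osd` record, its fields, and the `select_osds` function are hypothetical names.

```python
from collections import namedtuple

# Hypothetical record for one OSD process: an identifier, the physical host
# it runs on, and the free space on the disk it exports.
Osd = namedtuple("Osd", ["uuid", "host", "free_bytes"])


def select_osds(osds, replicas):
    """Pick `replicas` OSDs so that no two of them share a host.

    OSDs with more free space are preferred. Raises ValueError when there
    are fewer distinct hosts than requested replicas.
    """
    chosen = []
    used_hosts = set()
    for osd in sorted(osds, key=lambda o: o.free_bytes, reverse=True):
        if osd.host in used_hosts:
            # Another OSD on the same hardware already holds a replica.
            continue
        chosen.append(osd)
        used_hosts.add(osd.host)
        if len(chosen) == replicas:
            return chosen
    raise ValueError("not enough distinct hosts for the requested replicas")


if __name__ == "__main__":
    osds = [
        Osd("osd-a1", "host-a", 900),
        Osd("osd-a2", "host-a", 800),
        Osd("osd-b1", "host-b", 500),
    ]
    print([o.uuid for o in select_osds(osds, replicas=2)])
```

With one OSD per disk, the two emptiest OSDs in this example sit on the same machine, so a policy that looks only at free space would put both replicas on `host-a`. Filtering by host is what keeps a single machine failure from taking out every copy.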
Which seems better
Based on the XtreemFS response, it appears that MooseFS could benefit from the volume-at-a-time approach, but only if you mitigate possible drive failures very well.
Drive-at-a-time has the benefit that, in the event of a single drive failure (which seems to be the most worrying physical error that can occur), MooseFS's sorting algorithms and recovery systems can replicate the data that is no longer replicated and "ignore" the failed drive.
Volume-at-a-time has the benefit of forcing replicated data onto different servers, but it gives no guarantees at the level of individual drives.
These answers come from the respective mailing lists for MooseFS and XtreemFS; only grammar and readability have been improved. Links to the original threads are provided.