A provider has given us 500 IOPS/TB as their SLA standard for disk performance in a VMware and RAID5 SAN environment. This is apparently measured with:
- 16 kB average transfer block size
- 3:1 read:write ratio
- Multithreaded I/O operations
- 80% random I/O pattern
- 20% cache hit rate
What I want to do is determine whether a specific Linux VM is actually getting this performance, and then run the same benchmark with other providers so that I can compare them.
From looking around, fio seems to be the most configurable tool for measuring the above. The configuration I have so far is:
[global]
blocksize=16k
rwmixread=75 # 3:1 read:write ratio
ramp_time=30
runtime=600
time_based
buffered=1
# size = free-ram * 80% / 5
# so we get a ~20% cache hit across the 5x processes
# this is for an 8GB ram host with 7.3GB free after buffers/cache
size=1180m
# create a mix to get to 80% random reads
# also means we'll be doing at least 5x IO operations in parallel
[sla-0]
readwrite=randrw:2
[sla-1]
readwrite=randrw:2
[sla-2]
readwrite=randrw
[sla-3]
readwrite=randrw
[sla-4]
readwrite=randrw
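As a sanity check on the job file above, here is a minimal Python sketch of the arithmetic behind it. It rests on assumptions that are mine, not the provider's: that `randrw:2` picks a new random offset only every second I/O (so that job is about half random), that plain `randrw` is fully random, and that all five jobs issue a similar number of I/Os. The variable names are purely illustrative.

```python
# Rough check of the job mix in the fio config above (illustrative only).
# Assumption: "randrw:2" seeks to a new random offset every 2 I/Os, so
# about half of that job's I/Os are random; plain "randrw" is 100% random.
# Assumption: each job contributes roughly the same number of I/Os.
jobs_random_fraction = {
    "sla-0": 0.5,  # readwrite=randrw:2
    "sla-1": 0.5,  # readwrite=randrw:2
    "sla-2": 1.0,  # readwrite=randrw
    "sla-3": 1.0,  # readwrite=randrw
    "sla-4": 1.0,  # readwrite=randrw
}

# Overall share of random I/O across the five jobs.
overall_random = sum(jobs_random_fraction.values()) / len(jobs_random_fraction)

# Per-job file size from the comment in the config: free RAM * 80% / 5.
free_ram_gb = 7.3
per_job_size_gb = free_ram_gb * 0.80 / len(jobs_random_fraction)

# rwmixread=75 means 75 reads for every 25 writes.
rwmixread = 75
read_write_ratio = rwmixread / (100 - rwmixread)

print(f"overall random fraction: {overall_random:.0%}")
print(f"per-job file size: {per_job_size_gb:.3f} GB")
print(f"read:write ratio: {read_write_ratio:.0f}:1")
```

Under those assumptions the mix comes out at 80% random with a 3:1 read:write ratio, and the per-job size lands at roughly 1.17 GB, in the same range as the `size=1180m` used above.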
Any suggestions for improvements? Is using buffered I/O and the default ioengine the best way to go?
If I run this on an idle VM with 4 virtual cores, 8 GB of RAM and 470 GB of allocated storage, I would expect to get 235 IOPS according to the SLA above (500 * 0.47). The results I get are:
sla-0: (g=0): rw=randrw, bs=16K-16K/16K-16K, ioengine=sync, iodepth=2
sla-1: (g=0): rw=randrw, bs=16K-16K/16K-16K, ioengine=sync, iodepth=2
sla-2: (g=0): rw=randrw, bs=16K-16K/16K-16K, ioengine=sync, iodepth=2
sla-3: (g=0): rw=randrw, bs=16K-16K/16K-16K, ioengine=sync, iodepth=2
sla-4: (g=0): rw=randrw, bs=16K-16K/16K-16K, ioengine=sync, iodepth=2
Starting 5 processes
sla-0: Laying out IO file(s) (1 file(s) / 1180MB)
sla-1: Laying out IO file(s) (1 file(s) / 1180MB)
sla-2: Laying out IO file(s) (1 file(s) / 1180MB)
sla-3: Laying out IO file(s) (1 file(s) / 1180MB)
sla-4: Laying out IO file(s) (1 file(s) / 1180MB)
Jobs: 5 (f=5): [mmmmm] [100.0% done] [5931K/1966K /s] [362/120 iops] [eta 00m:00s]
sla-0: (groupid=0, jobs=1): err= 0: pid=16701
read : io=1086MB, bw=1853KB/s, iops=115, runt=600003msec
clat (usec): min=4, max=1771K, avg=8607.53, stdev=22114.44
bw (KB/s) : min= 0, max= 4087, per=24.44%, avg=1914.96, stdev=1130.29
write: io=372416KB, bw=635586B/s, iops=38, runt=600003msec
clat (usec): min=6, max=2574, avg=57.38, stdev=79.65
bw (KB/s) : min= 0, max=11119, per=26.07%, avg=679.63, stdev=517.84
cpu : usr=0.08%, sys=0.63%, ctx=64513, majf=0, minf=109
IO depths : 1=107.4%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued r/w: total=69474/23276, short=0/0
lat (usec): 10=10.23%, 20=8.89%, 50=4.15%, 100=11.66%, 250=0.83%
lat (usec): 500=1.48%, 750=1.41%, 1000=0.82%
lat (msec): 2=0.83%, 4=1.56%, 10=47.07%, 20=5.91%, 50=4.24%
lat (msec): 100=0.55%, 250=0.29%, 500=0.06%, 750=0.01%, 1000=0.01%
lat (msec): 2000=0.01%
sla-1: (groupid=0, jobs=1): err= 0: pid=16702
read : io=963360KB, bw=1605KB/s, iops=100, runt=600180msec
clat (usec): min=4, max=2396K, avg=9934.23, stdev=30986.37
bw (KB/s) : min= 0, max= 4657, per=21.64%, avg=1695.89, stdev=1273.00
write: io=326000KB, bw=556206B/s, iops=33, runt=600180msec
clat (usec): min=6, max=3882, avg=55.07, stdev=77.92
bw (KB/s) : min= 0, max=10708, per=23.74%, avg=618.92, stdev=559.01
cpu : usr=0.08%, sys=0.53%, ctx=55500, majf=0, minf=129
IO depths : 1=108.5%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued r/w: total=60210/20375, short=0/0
lat (usec): 10=11.36%, 20=9.63%, 50=3.56%, 100=11.97%, 250=0.81%
lat (usec): 500=0.66%, 750=0.50%, 1000=0.37%
lat (msec): 2=0.33%, 4=0.74%, 10=49.56%, 20=3.78%, 50=5.48%
lat (msec): 100=0.60%, 250=0.43%, 500=0.16%, 750=0.04%, 1000=0.01%
lat (msec): 2000=0.01%, >=2000=0.01%
sla-2: (groupid=0, jobs=1): err= 0: pid=16703
read : io=827584KB, bw=1379KB/s, iops=86, runt=600012msec
clat (usec): min=397, max=2396K, avg=11569.59, stdev=31237.03
bw (KB/s) : min= 0, max= 4237, per=18.60%, avg=1457.59, stdev=1113.89
write: io=276192KB, bw=471358B/s, iops=28, runt=600012msec
clat (usec): min=8, max=8339, avg=63.95, stdev=121.52
bw (KB/s) : min= 0, max= 8531, per=20.52%, avg=534.85, stdev=478.91
cpu : usr=0.07%, sys=0.54%, ctx=57019, majf=0, minf=89
IO depths : 1=109.9%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued r/w: total=51724/17262, short=0/0
lat (usec): 10=0.98%, 20=5.38%, 50=3.53%, 100=13.68%, 250=0.92%
lat (usec): 500=0.60%, 750=0.39%, 1000=0.22%
lat (msec): 2=0.24%, 4=2.26%, 10=59.15%, 20=4.90%, 50=6.28%
lat (msec): 100=0.78%, 250=0.48%, 500=0.18%, 750=0.03%, 1000=0.01%
lat (msec): 2000=0.01%, >=2000=0.01%
sla-3: (groupid=0, jobs=1): err= 0: pid=16704
read : io=865920KB, bw=1443KB/s, iops=90, runt=600005msec
clat (usec): min=369, max=2396K, avg=11052.97, stdev=32396.85
bw (KB/s) : min= 0, max= 5984, per=19.47%, avg=1525.97, stdev=1164.42
write: io=285568KB, bw=487365B/s, iops=29, runt=600005msec
clat (usec): min=7, max=11910, avg=65.72, stdev=154.09
bw (KB/s) : min= 0, max=11064, per=21.38%, avg=557.30, stdev=534.59
cpu : usr=0.07%, sys=0.57%, ctx=59458, majf=0, minf=109
IO depths : 1=109.5%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued r/w: total=54120/17848, short=0/0
lat (usec): 10=0.99%, 20=5.11%, 50=3.58%, 100=13.64%, 250=0.89%
lat (usec): 500=0.71%, 750=0.48%, 1000=0.30%
lat (msec): 2=0.70%, 4=4.00%, 10=57.63%, 20=5.21%, 50=5.40%
lat (msec): 100=0.70%, 250=0.43%, 500=0.16%, 750=0.03%, 1000=0.01%
lat (msec): 2000=0.01%, >=2000=0.01%
sla-4: (groupid=0, jobs=1): err= 0: pid=16705
read : io=934752KB, bw=1558KB/s, iops=97, runt=600007msec
clat (usec): min=187, max=2396K, avg=10236.87, stdev=26080.98
bw (KB/s) : min= 0, max=11419, per=20.74%, avg=1625.28, stdev=1338.26
write: io=304528KB, bw=519721B/s, iops=31, runt=600007msec
clat (usec): min=7, max=7572, avg=67.29, stdev=117.27
bw (KB/s) : min= 0, max=10772, per=22.06%, avg=575.17, stdev=560.68
cpu : usr=0.08%, sys=0.60%, ctx=63685, majf=0, minf=129
IO depths : 1=108.7%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued r/w: total=58422/19033, short=0/0
lat (usec): 10=0.81%, 20=4.77%, 50=3.62%, 100=13.77%, 250=0.97%
lat (usec): 500=1.45%, 750=0.64%, 1000=0.53%
lat (msec): 2=1.75%, 4=4.71%, 10=53.48%, 20=6.92%, 50=5.53%
lat (msec): 100=0.56%, 250=0.37%, 500=0.08%, 750=0.02%, 1000=0.01%
lat (msec): 2000=0.01%, >=2000=0.01%
Run status group 0 (all jobs):
READ: io=4593MB, aggrb=7836KB/s, minb=1412KB/s, maxb=1897KB/s, mint=600003msec, maxt=600180msec
WRITE: io=1528MB, aggrb=2607KB/s, minb=471KB/s, maxb=635KB/s, mint=600003msec, maxt=600180msec
Disk stats (read/write):
dm-0: ios=298995/596154, merge=0/0, ticks=3107720/433061790, in_queue=436170340, util=99.68%, aggrios=0/0, aggrmerge=0/0, aggrticks=0/0, aggrin_queue=0, aggrutil=0.00%
sdb: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=-nan%
Summing the read and write IOPS for each job (why doesn't fio include this in its summary?), I get 647, which appears to exceed the specified service level. Is there anything obvious I'm missing, or are their metrics massively skewed for some workloads? I'm specifically interested in PostgreSQL with data warehouse workloads.
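For reference, the sum can be reproduced with a short Python sketch like the one below. It simply pulls the `iops=` value out of each per-job `read`/`write` line of the output above and compares the total with the 235 IOPS expected from the SLA. The regular expression is written against this particular (older) fio text format and is an assumption; newer fio versions print these lines differently. The function and variable names are hypothetical.

```python
import re

# Per-job read/write lines copied from the fio output above.
FIO_OUTPUT = """\
read : io=1086MB, bw=1853KB/s, iops=115, runt=600003msec
write: io=372416KB, bw=635586B/s, iops=38, runt=600003msec
read : io=963360KB, bw=1605KB/s, iops=100, runt=600180msec
write: io=326000KB, bw=556206B/s, iops=33, runt=600180msec
read : io=827584KB, bw=1379KB/s, iops=86, runt=600012msec
write: io=276192KB, bw=471358B/s, iops=28, runt=600012msec
read : io=865920KB, bw=1443KB/s, iops=90, runt=600005msec
write: io=285568KB, bw=487365B/s, iops=29, runt=600005msec
read : io=934752KB, bw=1558KB/s, iops=97, runt=600007msec
write: io=304528KB, bw=519721B/s, iops=31, runt=600007msec
"""

# Matches lines that start with "read" or "write" and carry an iops= field.
IOPS_LINE = re.compile(r"^\s*(read|write)\s*:.*\biops=(\d+)", re.MULTILINE)


def sum_iops(text):
    """Add up the per-job IOPS, separately for reads and writes."""
    totals = {"read": 0, "write": 0}
    for direction, iops in IOPS_LINE.findall(text):
        totals[direction] += int(iops)
    return totals


totals = sum_iops(FIO_OUTPUT)
total_iops = totals["read"] + totals["write"]

# Expected IOPS from the SLA: 500 IOPS per TB on 470 GB (0.47 TB) allocated.
expected_iops = 500 * 0.47

print(f"read IOPS:  {totals['read']}")
print(f"write IOPS: {totals['write']}")
print(f"total IOPS: {total_iops} (SLA expectation: {expected_iops:.0f})")
```

Note that this adds up the rounded per-job figures fio reports, so the total is approximate rather than an exact aggregate rate.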