XenServer 6.2: Read-Only Filesystems and Slow Disk I/O


I virtualized a datacenter a few months ago, and we have a pool of three HP DL360 G5 servers, each with 32 GB of memory and two Intel Xeons. Recently we have been having two problems. The first is that disk read speed has become extremely slow: typing "ls" in a Linux VM that holds only a few files takes several seconds to return a listing. The second is that VMs in the cluster are sometimes remounted as read-only filesystems on their own. Running dmesg on the hosts produces a multitude of "DRDY ERR" errors. The main storage repositories we use are on a Drobo B800i, shared over iSCSI. I have posted iostat output and a grep of the DRDY errors from dmesg below. These are corporate servers and they are going down intermittently, which is never good.

Here is iostat output from one of the servers:

[root@XenServer-1 tmp]# iostat
Linux 2.6.32.43-0.4.1.xs1.8.0.835.170778xen (XenServer-1.ethoplex.com) 31/07/2014

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.42    0.00    0.46    3.51    0.40   95.21

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
cciss/c0d0       17.30        76.54       304.24  893755376 3552874247
cciss/c0d0p1      1.04         0.27        22.82    3169526  266433488
cciss/c0d0p2      0.00         0.01         0.00      73890          0
cciss/c0d0p3     16.25        76.24       281.43  890365720 3286440759
sda              76.84        59.78        87.32  698047689 1019733585
dm-0              0.68         0.95         0.28   11071656    3217737
sdb               3.44       177.64        37.74 2074378210  440737634
dm-2              0.00         0.01         0.00     135808       2216
dm-3             12.23       361.61       131.55 4222728781 1536204287
sdc               4.05        27.93       328.02  326147810 3830552980
sdd               6.23       101.72       113.03 1187808537 1319897350
tda               1.61         9.74        40.01  113749658  467248640
dm-28             0.84        36.78        23.11  429521222  269838659
dm-14             0.24        56.24         0.00  656723598          0
dm-21             0.08        18.17         0.00  212172507          0
tdb               0.08         0.12         1.44    1384368   16853616
dm-5              0.38         4.03        36.17   47063052  422416430
tdc               0.61         4.03        36.10   47062722  421602000
dm-7              1.26        17.74         5.51  207110960   64292628
tde               1.22        17.64         5.49  206019946   64129696
dm-30             0.03         0.01         0.60      61956    6979438
dm-4              0.02         0.00         8.85       1014  103326613
tdd               0.11         0.00         8.82       1264  103049216
dm-9              0.00         0.02         0.05     175978     591472
tdg               0.00         0.02         0.05     175950     590704
dm-10             0.01         0.09         0.21    1104226    2488947
tdf               0.01         0.09         0.21    1105562    2472346
dm-6              0.00         0.00         0.04       1568     419135
dm-16             0.00         0.01         0.00     132105          0
dm-17             0.03         0.05         0.76     625890    8867990
dm-8              0.00         0.06         0.10     752923    1226072
tdh               0.00         0.07         0.10     788356    1218922
tdi               0.00         0.00         0.00        884          0
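
To make a table like the one above easier to read, the device rows can be ranked by combined read and write rate. The following is only a minimal sketch: the function names `parse_iostat` and `busiest` are hypothetical helpers, the sample rows are copied from the output above, and it assumes the plain six-column `iostat` layout shown here. Note that a plain `iostat` call with no interval reports averages since boot, so these figures may not reflect the load at the moment the slowdown occurs.

```python
def parse_iostat(text):
    """Parse the device table of plain `iostat` output into a list of dicts."""
    rows = []
    in_table = False
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            # A blank line ends the current table.
            in_table = False
            continue
        if fields[0] == "Device:":
            in_table = True
            continue
        if in_table and len(fields) == 6:
            rows.append({
                "device": fields[0],
                "tps": float(fields[1]),
                "read_per_s": float(fields[2]),   # Blk_read/s
                "write_per_s": float(fields[3]),  # Blk_wrtn/s
            })
    return rows


def busiest(rows, n=3):
    """Return the n devices with the highest combined block rate."""
    return sorted(
        rows,
        key=lambda r: r["read_per_s"] + r["write_per_s"],
        reverse=True,
    )[:n]


# A few rows copied from the iostat output in this post.
SAMPLE = """\
Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
cciss/c0d0       17.30        76.54       304.24  893755376 3552874247
sda              76.84        59.78        87.32  698047689 1019733585
dm-3             12.23       361.61       131.55 4222728781 1536204287
sdc               4.05        27.93       328.02  326147810 3830552980
"""

if __name__ == "__main__":
    for row in busiest(parse_iostat(SAMPLE)):
        total = row["read_per_s"] + row["write_per_s"]
        print(f'{row["device"]:<12} tps={row["tps"]:.2f} blocks/s={total:.2f}')
```

Running `iostat` with an interval (for example `iostat 5`) and feeding the later reports to the same parser would give a view of current activity instead of the since-boot average.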

dmesg, grepped for DRDY:

[11645348.631020] ata1.00: status: { DRDY ERR }
[11646434.714902] ata1.00: status: { DRDY ERR }
[11648427.773389] ata1.00: status: { DRDY ERR }
[11648950.139954] ata1.00: status: { DRDY ERR }
[11649612.475350] ata1.00: status: { DRDY ERR }
[11650177.522603] ata1.00: status: { DRDY ERR }
[11650649.818020] ata1.00: status: { DRDY }
[11651837.989833] ata1.00: status: { DRDY ERR }
[11654729.414605] ata1.00: status: { DRDY ERR }
[11655685.782290] ata1.00: status: { DRDY ERR }
[11657120.774143] ata1.00: status: { DRDY ERR }
[11659704.724995] ata1.00: status: { DRDY }
[11661322.210812] ata1.00: status: { DRDY ERR }
[11662029.088563] ata1.00: status: { DRDY ERR }
[11663314.187972] ata1.00: status: { DRDY ERR }
[11667978.796829] ata1.00: status: { DRDY ERR }
[11670487.088008] ata1.00: status: { DRDY ERR }
[11671800.577054] ata1.00: status: { DRDY ERR }
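
The status lines above can be tallied per device and flag set to see which device is reporting the errors and how often. This is a rough sketch under stated assumptions: `summarize` is a hypothetical helper, the regular expression only matches lines shaped exactly like the ones above, and the sample lines are copied from this post.

```python
import re
from collections import Counter

# Matches lines such as: [11645348.631020] ata1.00: status: { DRDY ERR }
LINE_RE = re.compile(
    r"^\[\s*(\d+\.\d+)\]\s+(ata\d+\.\d+): status: \{ ([A-Z ]+?) \}$"
)


def summarize(lines):
    """Count status lines per (device, flags) and compute the mean gap in seconds."""
    counts = Counter()
    stamps = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # ignore anything that is not a status line
        ts, dev, flags = m.groups()
        counts[(dev, flags)] += 1
        stamps.append(float(ts))
    gaps = [b - a for a, b in zip(stamps, stamps[1:])]
    mean_gap = sum(gaps) / len(gaps) if gaps else None
    return counts, mean_gap


# Three lines copied from the grep output in this post.
SAMPLE = [
    "[11645348.631020] ata1.00: status: { DRDY ERR }",
    "[11646434.714902] ata1.00: status: { DRDY ERR }",
    "[11650649.818020] ata1.00: status: { DRDY }",
]

if __name__ == "__main__":
    counts, mean_gap = summarize(SAMPLE)
    for (dev, flags), n in counts.items():
        print(f"{dev} {{ {flags} }}: {n}")
    print(f"mean gap between status lines: {mean_gap:.1f} s")
```

In the excerpt above every line comes from the same device, `ata1.00`, so the per-device count mainly confirms that a single ATA port is involved.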

Dmesg:

[11464689.083861] sr 1:0:0:0: CDB: Get event status notification: 4a 01 00 00 10 00 00 00 08 00
[11464689.083875] ata1.00: cmd a0/00:00:00:08:00/00:00:00:00:00/a0 tag 0 pio 16392 in
[11464689.083876]          res 40/00:03:00:00:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
[11464689.083896] ata1.00: status: { DRDY }
[11464694.133755] ata1: link is slow to respond, please be patient (ready=0)
[11464699.123711] ata1: device not ready (errno=-16), forcing hardreset
[11464699.123727] ata1: soft resetting link
[11464699.344063] ata1.00: configured for PIO0
[11464699.348375] ata1: EH complete
[11464706.383733] ata1.00: qc timeout (cmd 0xa0)
[11464706.383766] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[11464706.383782] sr 1:0:0:0: CDB: Test Unit Ready: 00 00 00 00 00 00
[11464706.383794] ata1.00: cmd a0/00:00:00:00:00/00:00:00:00:00/a0 tag 0
[11464706.383795]          res 51/20:03:00:00:00/00:00:00:00:00/a0 Emask 0x5 (timeout)
[11464706.383806] ata1.00: status: { DRDY ERR }
[11464711.433625] ata1: link is slow to respond, please be patient (ready=0)
[11464716.433591] ata1: device not ready (errno=-16), forcing hardreset
    
by Riley, 01.08.2014 / 00:26

0 answers