Overall Ceph performance problems with QEMU


I'm having some performance problems with my QEMU KVM guests on my Ceph cluster. The cluster has 4 nodes with 4x1TB drives each, 48/64GB of RAM, and Intel Xeon and AMD Opteron CPUs. The nodes are interconnected via 3x1 GBit interfaces configured as a bonded interface. Overall network traffic is very high right now. From time to time there are blocked IOs and I don't know exactly why. The OSD and KVM hosts run Ubuntu 14.04 LTS with kernel 3.13.0. Is there a switch I forgot to flip? Maybe you can help me with this, because I'm at my wits' end.

A snippet of the log with blocked IOs:

2015-11-10 08:03:52.597054 mon.0 10.14.0.6:6789/0 546966 : cluster [INF] HEALTH_WARN; 1 requests are blocked > 32 sec
2015-11-10 08:04:41.993675 osd.13 10.14.0.76:6814/5175 106 : cluster [WRN] 30 slow requests, 30 included below; oldest blocked for > 30.207798 secs
2015-11-10 08:04:42.993975 osd.13 10.14.0.76:6814/5175 112 : cluster [WRN] 32 slow requests, 27 included below; oldest blocked for > 31.208280 secs
2015-11-10 08:04:43.994367 osd.13 10.14.0.76:6814/5175 118 : cluster [WRN] 35 slow requests, 25 included below; oldest blocked for > 32.208673 secs
2015-11-10 08:04:44.994712 osd.13 10.14.0.76:6814/5175 124 : cluster [WRN] 25 slow requests, 16 included below; oldest blocked for > 33.205598 secs
2015-11-10 08:04:45.995052 osd.13 10.14.0.76:6814/5175 130 : cluster [WRN] 26 slow requests, 15 included below; oldest blocked for > 34.124413 secs
2015-11-10 08:04:46.995360 osd.13 10.14.0.76:6814/5175 136 : cluster [WRN] 24 slow requests, 11 included below; oldest blocked for > 35.124517 secs
2015-11-10 08:04:47.995689 osd.13 10.14.0.76:6814/5175 142 : cluster [WRN] 22 slow requests, 6 included below; oldest blocked for > 36.124712 secs
2015-11-10 08:04:48.996059 osd.13 10.14.0.76:6814/5175 148 : cluster [WRN] 9 slow requests, 1 included below; oldest blocked for > 37.122843 secs
2015-11-10 08:05:05.238556 osd.13 10.14.0.76:6814/5175 150 : cluster [WRN] 12 slow requests, 3 included below; oldest blocked for > 53.365283 secs
2015-11-10 08:05:09.683333 osd.13 10.14.0.76:6814/5175 154 : cluster [WRN] 16 slow requests, 4 included below; oldest blocked for > 57.809976 secs
2015-11-10 08:05:11.895482 osd.13 10.14.0.76:6814/5175 159 : cluster [WRN] 18 slow requests, 11 included below; oldest blocked for > 60.022206 secs
2015-11-10 08:05:13.730638 osd.13 10.14.0.76:6814/5175 165 : cluster [WRN] 21 slow requests, 8 included below; oldest blocked for > 61.857323 secs
2015-11-10 08:05:14.731015 osd.13 10.14.0.76:6814/5175 171 : cluster [WRN] 24 slow requests, 6 included below; oldest blocked for > 62.857742 secs
2015-11-10 08:05:15.731261 osd.13 10.14.0.76:6814/5175 177 : cluster [WRN] 35 slow requests, 12 included below; oldest blocked for > 63.857998 secs
2015-11-10 08:05:17.028076 osd.13 10.14.0.76:6814/5175 183 : cluster [WRN] 43 slow requests, 15 included below; oldest blocked for > 65.154773 secs
2015-11-10 08:05:18.127205 osd.13 10.14.0.76:6814/5175 189 : cluster [WRN] 45 slow requests, 12 included below; oldest blocked for > 66.253932 secs
2015-11-10 08:05:19.127468 osd.13 10.14.0.76:6814/5175 195 : cluster [WRN] 48 slow requests, 14 included below; oldest blocked for > 67.254104 secs
2015-11-10 08:05:20.127937 osd.13 10.14.0.76:6814/5175 201 : cluster [WRN] 52 slow requests, 14 included below; oldest blocked for > 68.254581 secs
2015-11-10 08:05:22.065629 osd.13 10.14.0.76:6814/5175 207 : cluster [WRN] 53 slow requests, 14 included below; oldest blocked for > 70.192250 secs
2015-11-10 08:05:23.065965 osd.13 10.14.0.76:6814/5175 213 : cluster [WRN] 57 slow requests, 13 included below; oldest blocked for > 71.192553 secs
2015-11-10 08:05:24.066355 osd.13 10.14.0.76:6814/5175 219 : cluster [WRN] 58 slow requests, 9 included below; oldest blocked for > 72.192932 secs
2015-11-10 08:05:25.066731 osd.13 10.14.0.76:6814/5175 225 : cluster [WRN] 61 slow requests, 7 included below; oldest blocked for > 73.193356 secs
2015-11-10 08:05:26.067590 osd.13 10.14.0.76:6814/5175 231 : cluster [WRN] 62 slow requests, 3 included below; oldest blocked for > 74.193947 secs
2015-11-10 08:05:27.067844 osd.13 10.14.0.76:6814/5175 235 : cluster [WRN] 63 slow requests, 1 included below; oldest blocked for > 75.194501 secs
2015-11-10 08:05:32.306675 osd.13 10.14.0.76:6814/5175 237 : cluster [WRN] 59 slow requests, 1 included below; oldest blocked for > 80.433195 secs
2015-11-10 09:13:46.210699 osd.2 10.14.0.75:6804/29163 46 : cluster [WRN] 34 slow requests, 34 included below; oldest blocked for > 30.810297 secs
2015-11-10 09:13:47.211462 osd.2 10.14.0.75:6804/29163 52 : cluster [WRN] 38 slow requests, 33 included below; oldest blocked for > 31.811420 secs
2015-11-10 09:13:48.211718 osd.2 10.14.0.75:6804/29163 58 : cluster [WRN] 40 slow requests, 30 included below; oldest blocked for > 32.811678 secs
2015-11-10 09:13:49.212002 osd.2 10.14.0.75:6804/29163 64 : cluster [WRN] 43 slow requests, 28 included below; oldest blocked for > 33.811957 secs
2015-11-10 09:13:50.213554 osd.2 10.14.0.75:6804/29163 70 : cluster [WRN] 45 slow requests, 25 included below; oldest blocked for > 34.812999 secs
2015-11-10 09:13:51.214046 osd.2 10.14.0.75:6804/29163 76 : cluster [WRN] 50 slow requests, 25 included below; oldest blocked for > 35.813991 secs
2015-11-10 09:13:52.215101 osd.2 10.14.0.75:6804/29163 82 : cluster [WRN] 49 slow requests, 21 included below; oldest blocked for > 36.813431 secs
2015-11-10 09:13:53.215519 osd.2 10.14.0.75:6804/29163 88 : cluster [WRN] 43 slow requests, 19 included below; oldest blocked for > 37.810298 secs
2015-11-10 09:13:54.215797 osd.2 10.14.0.75:6804/29163 94 : cluster [WRN] 19 slow requests, 7 included below; oldest blocked for > 37.922869 secs
2015-11-10 09:13:55.216838 osd.2 10.14.0.75:6804/29163 100 : cluster [WRN] 6 slow requests, 1 included below; oldest blocked for > 37.592385 secs
2015-11-10 09:13:56.217302 osd.2 10.14.0.75:6804/29163 102 : cluster [WRN] 1 slow requests, 1 included below; oldest blocked for > 30.036856 secs
2015-11-10 10:18:00.293677 osd.0 10.14.0.75:6800/28850 109 : cluster [WRN] 5 slow requests, 5 included below; oldest blocked for > 30.137196 secs
2015-11-10 10:18:02.295197 osd.0 10.14.0.75:6800/28850 115 : cluster [WRN] 3 slow requests, 3 included below; oldest blocked for > 30.225206 secs
2015-11-10 10:18:03.296209 osd.0 10.14.0.75:6800/28850 119 : cluster [WRN] 1 slow requests, 1 included below; oldest blocked for > 30.640530 secs
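
To see which OSDs the warnings cluster on, the log can be summarized per OSD. This is a minimal sketch, assuming every line follows the format of the snippet above; the function name `summarize_slow_requests` and the summary keys are made up for illustration and are not part of Ceph.

```python
import re
from collections import defaultdict

# Matches "slow requests" warnings in the format shown in the log snippet, e.g.
# 2015-11-10 08:04:41.993675 osd.13 10.14.0.76:6814/5175 106 : cluster [WRN] 30 slow requests, ...
SLOW_RE = re.compile(
    r"^\S+ \S+ (?P<osd>osd\.\d+) \S+ \d+ : cluster \[WRN\] "
    r"(?P<count>\d+) slow requests, \d+ included below; "
    r"oldest blocked for > (?P<secs>[\d.]+) secs"
)


def summarize_slow_requests(lines):
    """Group slow-request warnings by OSD (hypothetical helper, not a Ceph API).

    Returns {osd_name: {"warnings", "peak_requests", "max_blocked_secs"}}.
    Lines that do not match the pattern (e.g. mon.0 health lines) are skipped.
    """
    summary = defaultdict(
        lambda: {"warnings": 0, "peak_requests": 0, "max_blocked_secs": 0.0}
    )
    for line in lines:
        match = SLOW_RE.match(line.strip())
        if match is None:
            continue
        entry = summary[match.group("osd")]
        entry["warnings"] += 1
        entry["peak_requests"] = max(entry["peak_requests"], int(match.group("count")))
        entry["max_blocked_secs"] = max(
            entry["max_blocked_secs"], float(match.group("secs"))
        )
    return dict(summary)


if __name__ == "__main__":
    import sys

    # Usage (assumed): python slow_requests.py < ceph.log
    for osd, stats in sorted(summarize_slow_requests(sys.stdin).items()):
        print(osd, stats)
```

If the warnings concentrate on a few OSDs, comparing those OSDs with the fullest ones in the `osd df` output is a reasonable next step.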

Here is our current ceph.conf:

[global]
fsid = xxx
mon_initial_members = mon1 mon2 mon3
mon_host = 10.14.0.6
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
filestore_xattr_use_omap = true
osd pool default size = 3
public network = 10.14.0.0/24
cluster network = 10.14.0.0/24
rbd default format = 2

[osd]
osd journal size = 10240
osd recovery max active = 1
osd max backfills = 1
filestore max sync interval = 30 # just for testing
filestore min sync interval = 29 # no impact detectable

This is the osd tree:

ID WEIGHT   TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY 
-1 14.23999 root default                                     
-6  3.56000     host host1                                     
 8  0.89000         osd.8       up  1.00000          1.00000 
 9  0.89000         osd.9       up  1.00000          1.00000 
10  0.89000         osd.10      up  1.00000          1.00000 
11  0.89000         osd.11      up  1.00000          1.00000 
-2  3.56000     host host2                                     
 2  0.89000         osd.2       up  1.00000          1.00000 
 5  0.89000         osd.5       up  1.00000          1.00000 
 7  0.89000         osd.7       up  1.00000          1.00000 
 0  0.89000         osd.0       up  0.79143          1.00000 
-4  3.56000     host host3                                     
12  0.89000         osd.12      up  1.00000          1.00000 
13  0.89000         osd.13      up  1.00000          1.00000 
14  0.89000         osd.14      up  1.00000          1.00000 
15  0.89000         osd.15      up  1.00000          1.00000 
-3  3.56000     host host4                                     
 1  0.89000         osd.1       up  1.00000          1.00000 
 3  0.89000         osd.3       up  1.00000          1.00000 
 4  0.89000         osd.4       up  1.00000          1.00000 
 6  0.89000         osd.6       up  0.86749          1.00000

This is the osd df:

ID WEIGHT  REWEIGHT SIZE   USE   AVAIL %USE  VAR  
 8 0.89000  1.00000   916G  556G  359G 60.75 1.03 
 9 0.89000  1.00000   916G  564G  351G 61.61 1.05 
10 0.89000  1.00000   916G  514G  402G 56.12 0.95 
11 0.89000  1.00000   916G  510G  406G 55.68 0.95 
 2 0.89000  1.00000   916G  586G  329G 64.06 1.09 
 5 0.89000  1.00000   916G  456G  459G 49.85 0.85 
 7 0.89000  1.00000   915G  546G  368G 59.71 1.02 
 0 0.89000  0.79143   916G  615G  300G 67.16 1.14 
12 0.89000  1.00000   916G  472G  443G 51.61 0.88 
13 0.89000  1.00000   916G  628G  287G 68.60 1.17 
14 0.89000  1.00000   916G  540G  375G 59.01 1.00 
15 0.89000  1.00000   916G  596G  319G 65.15 1.11 
 1 0.89000  1.00000   916G  553G  362G 60.39 1.03 
 3 0.89000  1.00000   916G  462G  453G 50.53 0.86 
 4 0.89000  1.00000   916G  472G  443G 51.58 0.88 
 6 0.89000  0.86749   916G  540G  375G 58.99 1.00 
              TOTAL 14657G 8618G 6039G 58.80      
MIN/MAX VAR: 0.85/1.17  STDDEV: 5.67

Here is an example of a QEMU KVM guest:

<domain type='kvm'>
  <name>testvm</name>
  <uuid>xxx</uuid>
  <memory unit='KiB'>12582912</memory>
  <currentMemory unit='KiB'>12582912</currentMemory>
  <vcpu placement='static'>4</vcpu>
  <os>
    <type arch='x86_64' machine='pc-i440fx-trusty'>hvm</type>
    <bootmenu enable='yes'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <cpu mode='custom' match='exact'>
    <model fallback='allow'>SandyBridge</model>
    <vendor>Intel</vendor>
    <feature policy='require' name='pbe'/>
    <feature policy='require' name='tm2'/>
    <feature policy='require' name='est'/>
    <feature policy='require' name='vmx'/>
    <feature policy='require' name='osxsave'/>
    <feature policy='require' name='smx'/>
    <feature policy='require' name='ss'/>
    <feature policy='require' name='ds'/>
    <feature policy='require' name='vme'/>
    <feature policy='require' name='dtes64'/>
    <feature policy='require' name='ht'/>
    <feature policy='require' name='dca'/>
    <feature policy='require' name='pcid'/>
    <feature policy='require' name='tm'/>
    <feature policy='require' name='pdcm'/>
    <feature policy='require' name='pdpe1gb'/>
    <feature policy='require' name='ds_cpl'/>
    <feature policy='require' name='xtpr'/>
    <feature policy='require' name='acpi'/>
    <feature policy='require' name='monitor'/>
  </cpu>
  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
  </pm>
  <devices>
    <emulator>/usr/bin/kvm-spice</emulator>
    <disk type='network' device='disk'>
      <driver name='qemu' type='raw' cache='writeback' discard='unmap'/>
      <auth username='admin'>
        <secret type='ceph' uuid='xxx'/>
      </auth>
      <source protocol='rbd' name='vms/testvm'>
        <host name='mon1' port='6789'/>
        <host name='mon2' port='6789'/>
        <host name='mon3' port='6789'/>
      </source>
      <target dev='sda' bus='scsi'/>
      <boot order='1'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <controller type='usb' index='0' model='ich9-ehci1'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x7'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci1'>
      <master startport='0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0' multifunction='on'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci2'>
      <master startport='2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x1'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci3'>
      <master startport='4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'/>
    <controller type='scsi' index='0' model='virtio-scsi'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </controller>
    <interface type='bridge'>
      <mac address='xxx'/>
      <source bridge='br0'/>
      <model type='virtio'/>
      <boot order='2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target port='0'/>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <graphics type='vnc' port='-1' autoport='yes'/>
    <video>
      <model type='cirrus' vram='9216' heads='1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </memballoon>
  </devices>
</domain>
    
by devnull, 10.11.2015 / 10:25

1 answer


This question is quite old, but if anyone else is wondering about performance problems, here are some points to look at:

  • 1 GBit networking is not recommended. We started with that and had a lot of slow requests. Upgrading to a 10 GBit network resolved some of the performance problems.
  • Use SSDs for your OSD journals.
  • Use BlueStore.
  • Try a cache tier for an RBD pool when working with VMs; we gained a lot from that.
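
To illustrate the network point, here is a rough back-of-the-envelope sketch. It assumes a replicated pool where the primary OSD forwards each write to the other `size - 1` replicas over the same link, which is the case when the public and cluster networks share one subnet as in the ceph.conf above. It ignores protocol overhead, journal writes, read traffic, and the fact that a bond often does not give a single flow more than one link's bandwidth. The function name is made up for illustration.

```python
def max_client_write_mb_s(link_gbit: float, pool_size: int) -> float:
    """Theoretical ceiling on client write throughput through one primary host, in MB/s.

    Assumption: the primary sends (pool_size - 1) replica copies out over the
    same link the client data arrives on, so egress limits the client rate.
    """
    if link_gbit <= 0 or pool_size < 1:
        raise ValueError("link_gbit must be > 0 and pool_size >= 1")
    link_mb_s = link_gbit * 1000 / 8  # 1 Gbit/s is 125 MB/s (decimal units)
    replica_copies = max(pool_size - 1, 1)  # with size 1, ingress is the limit
    return link_mb_s / replica_copies


if __name__ == "__main__":
    # Pool size 3 is taken from "osd pool default size = 3" in the question.
    for label, gbit in [("single 1 GBit link", 1), ("ideal 3x1 GBit bond", 3), ("10 GBit link", 10)]:
        print(f"{label}: {max_client_write_mb_s(gbit, 3):.1f} MB/s ceiling")
```

These are upper bounds only; real throughput will be lower, but the ratio between the 1 GBit and 10 GBit cases shows why the upgrade helped us.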
answered 28.09.2018 / 10:37
