libvirt error when enabling hugepages for the guest

I am trying to set up a VM using libvirt and KVM through virt-manager (and the virsh command line) with hugepages support, and I get an error when I enable the option in the domain XML. I don't know where the problem is.

I am using Ubuntu 14.04 upgraded to 14.10, with the following package versions:

  • libvirt-bin 1.2.8-0ubuntu11
  • qemu-kvm 2.1+dfsg-4ubuntu6

Details

I set up hugepages following this guide. Here is some information about the current configuration:

$ hugeadm --explain
Total System Memory: 15808 MB

Mount Point          Options
/dev/hugepages       rw,relatime,mode=1770,gid=126

Huge page pools:
      Size  Minimum  Current  Maximum  Default
   2097152     2176     2176     2176        *

Huge page sizes with configured pools:
2097152

$ getent group kvm
kvm:x:126:chaicko

$ cat /proc/meminfo | grep Huge
AnonHugePages:    591872 kB
HugePages_Total:    2176
HugePages_Free:     2176
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
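
As a sanity check, the pool above should be large enough for the 4 GiB guest defined in the domain XML below. A quick sketch with these numbers:

```shell
GUEST_KIB=4194304    # <memory unit='KiB'> from the domain XML
HUGEPAGE_KIB=2048    # Hugepagesize from /proc/meminfo
FREE_PAGES=2176      # HugePages_Free

NEEDED=$(( GUEST_KIB / HUGEPAGE_KIB ))
echo "pages needed: $NEEDED, pages free: $FREE_PAGES"
```

With 2176 pages free and 2048 needed, the size of the pool itself is not the problem here.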

This is the domain XML:

<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
  <name>win8</name>
  <uuid>b85bbb9a-745f-4293-a990-1e1726240ef0</uuid>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>4194304</currentMemory>
  <memoryBacking>
    <hugepages/>
  </memoryBacking>
  <vcpu placement='static'>4</vcpu>
  <os>
    <type arch='x86_64' machine='pc-i440fx-utopic'>hvm</type>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <cpu mode='custom' match='exact'>
    <model fallback='allow'>Haswell</model>
  </cpu>
  <clock offset='localtime'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
  </pm>
  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none' io='native'/>
      <source dev='/dev/vmvg/win8'/>
      <target dev='vda' bus='virtio'/>
      <boot order='2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/home/chaicko/Downloads/virtio-win-0.1-81.iso'/>
      <target dev='hda' bus='ide'/>
      <readonly/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/home/chaicko/Downloads/WINDOWS_8.1_Pro_X64/Windows_8.1_Pro_X64.iso'/>
      <target dev='hdb' bus='ide'/>
      <readonly/>
      <boot order='1'/>
      <address type='drive' controller='0' bus='0' target='0' unit='1'/>
    </disk>
    <controller type='usb' index='0' model='ich9-ehci1'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x7'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci1'>
      <master startport='0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0' multifunction='on'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci2'>
      <master startport='2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x1'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci3'>
      <master startport='4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'/>
    <controller type='ide' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </controller>
    <controller type='scsi' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </controller>
    <interface type='network'>
      <mac address='52:54:00:48:ca:09'/>
      <source network='default'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target port='0'/>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <channel type='spicevmc'>
      <target type='virtio' name='com.redhat.spice.0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='tablet' bus='usb'/>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <graphics type='spice' autoport='yes'/>
    <sound model='ich6'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </sound>
    <video>
      <model type='qxl' ram='65536' vram='65536' heads='1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x01' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/>
    </hostdev>
    <redirdev bus='usb' type='spicevmc'>
    </redirdev>
    <redirdev bus='usb' type='spicevmc'>
    </redirdev>
    <redirdev bus='usb' type='spicevmc'>
    </redirdev>
    <redirdev bus='usb' type='spicevmc'>
    </redirdev>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </memballoon>
  </devices>
  <qemu:commandline>
    <qemu:arg value='-drive'/>
    <qemu:arg value='if=pflash,readonly,format=raw,file=/usr/share/qemu/OVMF.fd'/>
  </qemu:commandline>
</domain>

Question

If I remove the <memoryBacking> option the guest starts fine, but if I leave it in, it fails with the following error:

error: internal error: process exited while connecting to monitor: 

I also uncommented the following line in /etc/libvirt/qemu.conf:

hugetlbfs_mount = "/dev/hugepages"
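
A quick way to sanity-check that mount point and its permissions (a diagnostic sketch; the gid 126 / kvm group mapping comes from the getent output above):

```shell
# Verify the hugetlbfs mount libvirt is pointed at, and its permissions.
mount | grep hugetlbfs
# mode=1770,gid=126 means only root and the kvm group can write here;
# the qemu process must run with that group to allocate pages.
stat -c '%a %G' /dev/hugepages
```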

Running qemu from the shell and instructing it to use hugepages (-mem-path /dev/hugepages) does work.
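
For reference, a minimal manual invocation of that kind might look like the following (memory size and disk path are taken from the domain XML above; the exact flags are illustrative, not the ones libvirt generates):

```shell
qemu-system-x86_64 -enable-kvm -m 4096 \
    -mem-prealloc -mem-path /dev/hugepages \
    -drive file=/dev/vmvg/win8,if=virtio,cache=none,format=raw
```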

What am I doing wrong? Any help is appreciated.

    
by chaicko 22.10.2014 / 02:53

2 answers

Answering my own question: to use hugepages with libvirt on Ubuntu, you only need to set KVM_HUGEPAGES=1 in the file /etc/default/qemu-kvm and reboot.
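
In other words, the fix amounts to (service name as shipped in the Ubuntu 14.x packaging):

```shell
# In /etc/default/qemu-kvm, set:
KVM_HUGEPAGES=1
# then restart the service:
#   sudo service qemu-kvm restart
# and reboot the host if the hugepage pool needs re-reserving.
```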

This is related to this bugfix.

    
by 22.10.2014 / 06:22

According to the libvirt documentation, in the section titled Memory Backing:

The optional memoryBacking element may contain several elements that influence how virtual memory pages are backed by host pages.

hugepages

This tells the hypervisor that the guest should have its memory allocated using hugepages instead of the normal native page size. Since 1.2.5 it's possible to set hugepages more specifically per numa node. The page element is introduced. It has one compulsory attribute size which specifies which hugepages should be used (especially useful on systems supporting hugepages of different sizes). The default unit for the size attribute is kilobytes (multiplier of 1024). If you want to use different unit, use optional unit attribute. For systems with NUMA, the optional nodeset attribute may come handy as it ties given guest's NUMA nodes to certain hugepage sizes. From the example snippet, one gigabyte hugepages are used for every NUMA node except node number four. For the correct syntax see this.

nosharepages

Instructs hypervisor to disable shared pages (memory merge, KSM) for this domain. Since 1.0.6

locked

When set and supported by the hypervisor, memory pages belonging to the domain will be locked in host's memory and the host will not be allowed to swap them out. For QEMU/KVM this requires hard_limit memory tuning element to be used and set to the maximum memory configured for the domain plus any memory consumed by the QEMU process itself. Since 1.0.6

Example

<domain>
  ...
  <memoryBacking>
    <hugepages>
      <page size="1" unit="G" nodeset="0-3,5"/>
      <page size="2" unit="M" nodeset="4"/>
    </hugepages>
    <nosharepages/>
    <locked/>
  </memoryBacking>
  ...
</domain> 

Since you don't say, I am assuming you want to allocate all of that memory to this particular guest. If so, you could probably try omitting this section entirely.

Alternative approaches

I found these RHEL 5 & 6 steps in this article, titled: How do I configure my KVM guests to use HugePages?, which shows how to set it up as follows:

excerpt

Mount the HugeTLB filesystem on the host

You may use any mountpoint desired, here we have created /hugepages

  mkdir -p /hugepages
  mount -t hugetlbfs hugetlbfs /hugepages

This is also possible via an entry in /etc/fstab, for example

  hugetlbfs    /hugepages    hugetlbfs    defaults    0 0

Increase the memory lock limit on the host

Alter the following values in /etc/security/limits.conf depending on your required memory usage

  # Lock max 8Gb
  soft memlock 8388608
  hard memlock 8388608

Reserve HugePages and give the KVM group access to them

Alter to following lines in /etc/sysctl.conf depending on your required memory usage

  vm.nr_hugepages = 4096
  vm.hugetlb_shm_group = 36
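
As a sanity check, the numbers in these two files should agree: 4096 reserved 2 MiB pages is exactly the 8 GB memlock limit set above. A sketch using the article's values:

```shell
NR_HUGEPAGES=4096     # vm.nr_hugepages from /etc/sysctl.conf
HUGEPAGE_KIB=2048     # default hugepage size on x86_64
MEMLOCK_KIB=8388608   # memlock limit from /etc/security/limits.conf

RESERVED_KIB=$(( NR_HUGEPAGES * HUGEPAGE_KIB ))
echo "reserved: ${RESERVED_KIB} KiB, memlock limit: ${MEMLOCK_KIB} KiB"
# The memlock limit must be at least as large as the memory the guests
# will lock, or qemu will fail to allocate its backing pages.
```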

Add HugePage backing to the KVM guest definition

Add the following to the guest config of an existing KVM guest. This can be done with virsh edit <guestname> or virsh define <guest.xml>

  <memoryBacking>
      <hugepages/>
  </memoryBacking>

Restart the host

This is required to re-allocate contiguous memory to HugePages

Start a guest

Confirm the guest has HugePage backing

Check the qemu-kvm process associated with that guest for the presence of -mem-path in the run command

  ps -ef | grep qemu

  root      4182     1  1 17:35 ?        00:00:42 /usr/libexec/qemu-kvm -S -M rhel5.4.0 -m 1024 -mem-prealloc
-mem-path /hugepages/libvirt/qemu -smp 1 -name vm1 -uuid 3f1f3a98-89f8-19ac-b5b5-bf496e2ed9be -no-kvm-pit-reinjection
-monitor pty -pidfile /var/run/libvirt/qemu//vm1.pid -boot c -drive file=/vmimages/vm1,if=ide,index=0,boot=on,cache=none
-drive file=,if=ide,media=cdrom,index=2 -net nic,macaddr=54:52:00:00:00:01,vlan=0 -net tap,fd=15,script=,vlan=0,ifname=vnet0
-serial pty -parallel none -usb -vnc 127.0.0.1:0 -k en-us

Confirm HugePage use on the system

Here we can see HugePages are being allocated at startup, as well as used/reserved for the guests

  cat /proc/meminfo | grep Huge

  HugePages_Total:    4096
  HugePages_Free:      873
  HugePages_Rsvd:      761
  Hugepagesize:       2048 kB
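
From that output you can work out how much of the pool the guests actually hold (a small sketch with the excerpt's numbers):

```shell
TOTAL=4096   # HugePages_Total
FREE=873     # HugePages_Free
RSVD=761     # HugePages_Rsvd (reserved but not yet faulted in)
PAGE_MB=2    # Hugepagesize = 2048 kB

IN_USE=$(( TOTAL - FREE ))
echo "pages in use: $IN_USE ($(( IN_USE * PAGE_MB )) MB), reserved: $RSVD"
```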

Root Cause

The default method of allocating memory for KVM guests is to use regular 4k pages. This can result in

  • large page tables which occupy unnecessary and inefficient amounts of memory
  • increased memory fragmentation which can slow down some kernel-based actions which require contiguous memory (eg: disk writes, network access)
  • increasing page faults which can slow down all applications
  • risking swapping components of virtual guests out to disk which would cause a large performance hit

Using HugePages, page table sizes are dramatically reduced, contiguous areas of memory are mapped, and HugePages cannot be swapped by design.

Note: These steps are not necessary with KVM on RHEL6, which uses Transparent HugePages to dynamically map contiguous 2 MB areas of memory but also allows that memory to be broken up into 4k pages to be merged with KSM or swapped when the system is under memory pressure.

The above steps can be applied to RHEL6 if HugePages are desired over Transparent HugePages.

    
by 22.10.2014 / 04:19