Linux kernel 4.20 added PSI, which stands for "Pressure Stall Information". It gives more insight into why a machine is overloaded and which resource is the bottleneck.
There are three new files in /proc/pressure:
- /proc/pressure/cpu
- /proc/pressure/memory
- /proc/pressure/io
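As a quick way to check whether a given machine exposes these files, here is a minimal Python sketch. The function name available_psi_files and the base parameter are hypothetical names chosen for illustration; only the three paths come from the list above. It assumes that a missing directory simply means the kernel is older than 4.20 or was built without PSI.

```python
from pathlib import Path

# Resource names taken from the three files listed above.
PSI_RESOURCES = ("cpu", "memory", "io")


def available_psi_files(base="/proc/pressure"):
    """Return {resource: Path} for each PSI file that exists under base.

    `base` is a parameter only so the function can be pointed at a
    test directory; on a real system the default is what you want.
    """
    found = {}
    for name in PSI_RESOURCES:
        path = Path(base) / name
        if path.is_file():
            found[name] = path
    return found


if __name__ == "__main__":
    files = available_psi_files()
    if not files:
        print("No PSI files found (kernel older than 4.20 or PSI not built in?)")
    for name, path in files.items():
        print(f"== {name} ==")
        try:
            print(path.read_text(), end="")
        except OSError as exc:
            # Reading can fail even when the file exists, for example
            # if PSI accounting is turned off on this system.
            print(f"cannot read {path}: {exc}")
```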
To quote "Tracking pressure-stall information" on /proc/pressure/memory:
Its output looks like:
some avg10=70.24 avg60=68.52 avg300=69.91 total=3559632828
full avg10=57.59 avg60=58.06 avg300=60.38 total=3300487258
The some line is similar to the CPU information: it tracks the percentage of the time that at least one process could be running if it weren't waiting for memory resources. In particular, the time spent for swapping in, refaulting pages from the page cache, and performing direct reclaim is tracked in this way. It is, thus, a good indicator of when the system is thrashing due to a lack of memory.
The full line is a little different: it tracks the time that nobody is able to use the CPU for actual work due to memory pressure. If all processes are waiting for paging I/O, the CPU may look idle, but that's not because of a lack of work to do. If those processes are performing memory reclaim, the end result is nearly the same; the CPU is busy, but it's not doing the work that the computer is there to do. If the full numbers are much above zero, it's clear that the system lacks the memory it needs to support the current workload.
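To make the format described in the quote concrete, here is a small Python sketch that parses such output into a dictionary. The function name parse_psi is a hypothetical helper, not part of any library. It assumes every field after the first word has the key=value form shown in the sample, that the avg values are percentages, and that total is an integer number of microseconds.

```python
def parse_psi(text):
    """Parse the content of a PSI file into {"some": {...}, "full": {...}}."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        kind, pairs = fields[0], fields[1:]  # kind is "some" or "full"
        values = {}
        for pair in pairs:
            key, _, raw = pair.partition("=")
            # total is a cumulative stall time in microseconds (integer);
            # avg10/avg60/avg300 are percentages over 10 s, 60 s and 300 s.
            values[key] = int(raw) if key == "total" else float(raw)
        result[kind] = values
    return result


# Sample output copied from the quote above.
sample = """\
some avg10=70.24 avg60=68.52 avg300=69.91 total=3559632828
full avg10=57.59 avg60=58.06 avg300=60.38 total=3300487258
"""

psi = parse_psi(sample)
print(psi["full"]["avg10"])   # 57.59
print(psi["some"]["total"])   # 3559632828
```

On a real system the same function can be fed the content of /proc/pressure/memory, for example with open("/proc/pressure/memory").read().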
I don't have access to a production server running Linux 4.20 yet, but here is a small experiment on my desktop (which has no swap configured). Initially, there is no memory pressure (all counters are 0):
$ cat /proc/pressure/memory
some avg10=0.00 avg60=0.00 avg300=0.00 total=0
full avg10=0.00 avg60=0.00 avg300=0.00 total=0
Then I increased memory usage until the machine ran out of memory, which froze it until the OOM killer killed some processes. Just before the freeze, memory pressure increased:
some avg10=0.00 avg60=0.00 avg300=0.00 total=0
full avg10=0.00 avg60=0.00 avg300=0.00 total=0
some avg10=0.00 avg60=0.00 avg300=0.00 total=47047
full avg10=0.00 avg60=0.00 avg300=0.00 total=32839
some avg10=0.00 avg60=0.00 avg300=0.00 total=116425
full avg10=0.00 avg60=0.00 avg300=0.00 total=81497
some avg10=1.26 avg60=0.22 avg300=0.04 total=183863
full avg10=0.72 avg60=0.13 avg300=0.02 total=127684
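In the samples above, the total counters start moving before the averages do. One way to exploit that is to turn the difference between two total readings into a stall percentage, as in this Python sketch. The names read_total, stall_percent and watch are hypothetical helpers, and the 1-second sampling interval is an assumption: the interval used for the samples above is not stated. The sketch relies on total being expressed in microseconds.

```python
import time


def read_total(path="/proc/pressure/memory", kind="some"):
    """Return the cumulative stall time (microseconds) from a PSI file."""
    with open(path) as f:
        for line in f:
            fields = line.split()
            if fields and fields[0] == kind:
                for pair in fields[1:]:
                    key, _, raw = pair.partition("=")
                    if key == "total":
                        return int(raw)
    raise ValueError(f"no {kind!r} total found in {path}")


def stall_percent(prev_total_us, cur_total_us, interval_s):
    """Share of the interval spent stalled, as a percentage."""
    return (cur_total_us - prev_total_us) / (interval_s * 1_000_000) * 100


def watch(interval_s=1.0, samples=5, path="/proc/pressure/memory"):
    """Print the stall percentage between consecutive readings.

    time.sleep() is only approximately accurate, so the printed values
    are estimates.
    """
    prev = read_total(path)
    for _ in range(samples):
        time.sleep(interval_s)
        cur = read_total(path)
        print(f"{stall_percent(prev, cur, interval_s):.2f}% stalled on memory")
        prev = cur


# Two consecutive "some" totals from the samples above,
# ASSUMING they were taken 1 second apart.
print(f"{stall_percent(116425, 183863, 1.0):.2f}%")
```

Calling watch() on a kernel with PSI enabled prints one estimate per interval.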
Now that the system has recovered, memory pressure is back to 0 and the total counters no longer increase:
$ cat /proc/pressure/memory
some avg10=0.00 avg60=0.00 avg300=0.07 total=53910568
full avg10=0.00 avg60=0.00 avg300=0.02 total=27766222
...
$ cat /proc/pressure/memory
some avg10=0.00 avg60=0.00 avg300=0.05 total=53910568
full avg10=0.00 avg60=0.00 avg300=0.00 total=27766222