I am running a dedicated Apache web server on CentOS 6 with:
12 GB of memory
4 CPUs
My httpd configuration, in /etc/httpd/conf/httpd.conf, is as follows:
Timeout 60
KeepAlive Off
MaxKeepAliveRequests 100
KeepAliveTimeout 15
The server is using the Prefork MPM with the following settings:
<IfModule prefork.c>
StartServers 8
MinSpareServers 5
MaxSpareServers 20
ServerLimit 50
MaxClients 50
MaxRequestsPerChild 300
</IfModule>
This was adjusted from a previous configuration of:
<IfModule prefork.c>
StartServers 8
MinSpareServers 5
MaxSpareServers 20
ServerLimit 256
MaxClients 256
MaxRequestsPerChild 4000
</IfModule>
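As a rough sanity check on these limits, the worst case is every allowed child being busy at once. The sketch below multiplies MaxClients by a per-child resident size. The 200 MB figure is an assumption read off the busier httpd rows in the top output further down, not a measured average, and `worst_case_mb` is just an illustrative helper name.

```shell
# Worst-case resident memory (in MB) if every allowed child is busy at once.
# Arguments: MaxClients, assumed resident size per child in MB.
worst_case_mb() {
    max_clients=$1
    per_child_mb=$2
    echo $(( max_clients * per_child_mb ))
}

worst_case_mb 50 200    # current MaxClients
worst_case_mb 256 200   # previous MaxClients
```

Under that assumption the current setting could reach about 10000 MB of the roughly 11891 MB available, and the previous one about 51200 MB, so even well-behaved children leave little headroom for MySQL and the OS.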
THE PROBLEM
After changing these values a few times and restarting httpd, the same thing always happens: one of the httpd PIDs grows to as much as 11 GB of memory, so I kill it manually and the memory is freed.
Here is my free output (values in MB) after killing a rogue PID and restarting:
             total       used       free     shared    buffers     cached
Mem:         11891       1132      10759          0         34        417
-/+ buffers/cache:        679      11212
Swap:         3827        227       3600
It looks like my httpd PIDs are taking up a lot of memory.
top - 11:26:58 up 1:20, 1 user, load average: 2.87, 2.43, 4.25
Tasks: 174 total, 12 running, 162 sleeping, 0 stopped, 0 zombie
Cpu(s): 83.3%us, 4.4%sy, 0.0%ni, 10.4%id, 1.7%wa, 0.0%hi, 0.2%si, 0.0%st
Mem: 12177272k total, 4746532k used, 7430740k free, 50012k buffers
Swap: 3919840k total, 227860k used, 3691980k free, 776180k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3848 apache 20 0 653m 216m 8908 R 31.3 1.8 0:11.77 httpd
3862 apache 20 0 627m 194m 6824 R 30.3 1.6 0:09.62 httpd
3846 apache 20 0 562m 126m 8728 R 28.6 1.1 0:10.20 httpd
3844 apache 20 0 638m 204m 6704 R 27.9 1.7 0:08.40 httpd
3911 apache 20 0 638m 203m 5880 R 27.6 1.7 0:04.92 httpd
3880 apache 20 0 639m 205m 6724 R 27.3 1.7 0:06.76 httpd
3918 apache 20 0 680m 246m 5820 S 25.3 2.1 0:05.38 httpd
3921 apache 20 0 630m 197m 4440 R 24.0 1.7 0:01.63 httpd
3843 apache 20 0 624m 191m 6684 R 23.3 1.6 0:07.50 httpd
2317 mysql 20 0 2463m 116m 3584 S 23.0 1.0 20:07.12 mysqld
3907 apache 20 0 620m 187m 5856 R 23.0 1.6 0:04.35 httpd
3927 apache 20 0 623m 188m 3624 R 22.6 1.6 0:00.68 httpd
3906 apache 20 0 628m 195m 6080 R 20.6 1.6 0:04.27 httpd
3908 apache 20 0 556m 123m 5880 S 8.0 1.0 0:02.79 httpd
3917 apache 20 0 536m 104m 6064 S 2.7 0.9 0:00.39 httpd
3909 apache 20 0 554m 121m 6120 S 2.3 1.0 0:01.19 httpd
3915 apache 20 0 614m 182m 6372 S 2.3 1.5 0:02.51 httpd
3849 apache 20 0 576m 144m 6700 S 1.7 1.2 0:06.54 httpd
3838 root 20 0 508m 84m 12m S 0.3 0.7 0:00.27 httpd
3931 apache 20 0 509m 76m 4216 S 0.3 0.6 0:00.01 httpd
And then one of them always breaks away and uses ALL the memory:
4076 apache 20 0 7834m 7.2g 6172 S 11.6 62.2 0:34.86 httpd
MYSQL memory (RSS, in KB)
[root@xxx ~]# ps aux | grep 'mysql' | awk '{print $6}'
4
148316
884
APACHE process size (RSS, in KB)
[root@xxx ~]# ps aux | grep 'httpd' | awk '{print $6}'
93640
73196
133840
204352
170620
202056
120312
123600
123492
119048
131316
119744
200304
203160
118468
189300
203196
200024
124184
880
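To turn a list like the one above into a single figure, the RSS values can be summed. This is a minimal sketch: `sum_rss_mb` is a hypothetical helper name, and it assumes one RSS value in KB per line on standard input.

```shell
# Sum RSS values (KB, one per line on stdin) and print the total in whole MB.
sum_rss_mb() {
    awk '{ total += $1 } END { printf "%d\n", total / 1024 }'
}

# Usage on the live system. ps -C selects processes by command name,
# so the grep process itself (the 880 line above) is not counted:
#   ps -C httpd -o rss= | sum_rss_mb
```

Comparing that total against the MaxClients setting shows how close the pool already is to physical memory before any single child runs away.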
NETSTAT at peak (when the rogue PID had consumed all the memory)
[root@xxx ~]# netstat -plan | grep :80
tcp 0 0 :::80 :::* LISTEN 4235/httpd
tcp 0 0 ::ffff:74.208.104.107:80 ::ffff:208.167.230.35:47089 TIME_WAIT -
tcp 0 0 ::ffff:74.208.104.107:80 ::ffff:208.167.230.35:59089 TIME_WAIT -
tcp 0 0 ::ffff:74.208.104.107:80 ::ffff:208.167.230.35:35831 TIME_WAIT -
tcp 0 0 ::ffff:74.208.104.107:80 ::ffff:208.167.230.35:42075 TIME_WAIT -
tcp 0 0 ::ffff:74.208.104.107:80 ::ffff:208.167.230.35:49612 TIME_WAIT -
tcp 0 0 ::ffff:74.208.104.107:80 ::ffff:208.167.230.35:43970 TIME_WAIT -
tcp 0 0 ::ffff:74.208.104.107:80 ::ffff:173.199.114.19:44220 TIME_WAIT -
tcp 0 0 ::ffff:74.208.104.107:80 ::ffff:208.167.230.35:34405 TIME_WAIT -
tcp 0 0 ::ffff:74.208.104.107:80 ::ffff:208.167.230.35:33963 TIME_WAIT -
tcp 0 0 ::ffff:74.208.104.107:80 ::ffff:66.249.75.222:42306 ESTABLISHED 4241/httpd
tcp 0 0 ::ffff:74.208.104.107:80 ::ffff:173.199.115.67:53145 TIME_WAIT -
tcp 0 0 ::ffff:74.208.104.107:80 ::ffff:91.232.96.34:64675 TIME_WAIT -
tcp 0 0 ::ffff:74.208.104.107:80 ::ffff:107.20.53.252:40552 ESTABLISHED 4247/httpd
tcp 0 0 ::ffff:74.208.104.107:80 ::ffff:173.199.114.19:46658 TIME_WAIT -
tcp 0 0 ::ffff:74.208.104.107:80 ::ffff:208.167.230.35:46954 TIME_WAIT -
tcp 0 0 ::ffff:74.208.104.107:80 ::ffff:74.208.104.107:33988 TIME_WAIT -
tcp 0 0 ::ffff:74.208.104.107:80 ::ffff:208.167.230.35:51501 TIME_WAIT -
tcp 0 0 ::ffff:74.208.104.107:80 ::ffff:208.167.230.35:52628 TIME_WAIT -
tcp 0 0 ::ffff:74.208.104.107:80 ::ffff:74.208.104.107:33989 TIME_WAIT -
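A listing like this is easier to read once it is tallied by connection state and by remote address. The sketch below assumes the column layout shown above (state in field 6, remote address:port in field 5). `count_states` and `count_peers` are illustrative names, not existing tools.

```shell
# Count connections per TCP state (field 6 of the netstat lines on stdin).
count_states() {
    awk '{ n[$6]++ } END { for (s in n) print n[s], s }' | sort -rn
}

# Count connections per remote address, ignoring the listening socket
# and stripping the trailing :port from field 5.
count_peers() {
    awk '$6 != "LISTEN" {
             peer = $5
             sub(/:[0-9]+$/, "", peer)
             n[peer]++
         }
         END { for (p in n) print n[p], p }' | sort -rn
}

# Usage on the live system:
#   netstat -plan | grep :80 | count_states
#   netstat -plan | grep :80 | count_peers
```

A single remote address dominating the peer count at the moment a child balloons would be a hint about which client, and therefore which request, to look for in the access log.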
The server serves a few different purposes:
It runs a bunch of WordPress sites.
It runs cron jobs for various automated tasks (I have killed most of them and the problem still happens).
It has mobile apps connecting to it via POST.
QUESTIONS:
1) What ELSE can I do to troubleshoot this problem?
2) How do I find WHICH PHP script might be running to take up all the memory? Perhaps this might be some runaway script that I haven't isolated.
3) Am I editing the right file for my httpd config, /etc/httpd/conf/httpd.conf?
4) What are the optimal settings for my server?
5) Could there be some kind of attack or rootkit thing happening?
EDIT
I have imposed 2 permanent ulimit limits using /etc/security/limits.conf, for maxlock and as (virtual memory), both set to 1048576 KB (1 GB). This seems to have stopped the behavior, but I believe it is only a stopgap measure. I still need some answers.
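One way to confirm whether those limits actually reach the httpd children is to read the limits the kernel reports for a running process. limits.conf is applied through PAM sessions, so a daemon started at boot may not pick it up. The sketch below assumes a Linux /proc filesystem, and `show_mem_limits` is a hypothetical helper name.

```shell
# Print the address-space and locked-memory limits of a given PID,
# as reported by the kernel in /proc/<pid>/limits.
show_mem_limits() {
    grep -E '^Max (address space|locked memory)' "/proc/$1/limits"
}

# Demonstrate on the current shell.
show_mem_limits "$$"

# Usage for the httpd processes:
#   for pid in $(pgrep httpd); do show_mem_limits "$pid"; done
```

If the httpd rows still show "unlimited" for the address space, the limit is not what stopped the runaway process, and the cause is worth chasing elsewhere.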
Rick