Eu tenho o Apache agindo como um balanceador de carga na frente de 3 servidores Tomcat. Ocasionalmente, o Apache retorna 503 respostas, o que eu gostaria de remover completamente. Todos os 4 servidores não estão sob carga significativa em termos de CPU, memória ou disco, então estou um pouco inseguro sobre o que está atingindo seus limites ou por quê. 503s são retornados quando todos os trabalhadores estão em estado de erro - o que quer que isso signifique. Aqui estão os detalhes:
Configuração do Apache:
<IfModule mpm_prefork_module>
StartServers 30
MinSpareServers 30
MaxSpareServers 60
MaxClients 200
MaxRequestsPerChild 1000
</IfModule>
...
<Proxy *>
AddDefaultCharset Off
Order deny,allow
Allow from all
</Proxy>
# Tomcat HA cluster
<Proxy balancer://mycluster>
BalancerMember ajp://10.176.201.9:8009 keepalive=On retry=1 timeout=1 ping=1
BalancerMember ajp://10.176.201.10:8009 keepalive=On retry=1 timeout=1 ping=1
BalancerMember ajp://10.176.219.168:8009 keepalive=On retry=1 timeout=1 ping=1
</Proxy>
# Passes thru track. or api.
ProxyPreserveHost On
ProxyStatus On
# Original tracker
ProxyPass /m balancer://mycluster/m
ProxyPassReverse /m balancer://mycluster/m
Configuração do Tomcat:
<Server port="8005" shutdown="SHUTDOWN">
<Listener className="org.apache.catalina.core.AprLifecycleListener" SSLEngine="on" />
<Listener className="org.apache.catalina.core.JasperListener" />
<Listener className="org.apache.catalina.mbeans.ServerLifecycleListener" />
<Listener className="org.apache.catalina.mbeans.GlobalResourcesLifecycleListener" />
<Service name="Catalina">
<Connector port="8080" protocol="HTTP/1.1"
connectionTimeout="20000"
redirectPort="8443" />
<Connector port="8009" protocol="AJP/1.3" redirectPort="8443" />
<Engine name="Catalina" defaultHost="localhost">
<Host name="localhost" appBase="webapps"
unpackWARs="true" autoDeploy="true"
xmlValidation="false" xmlNamespaceAware="false">
</Engine>
</Service>
</Server>
Log de erros do Apache:
[Mon Mar 22 18:39:47 2010] [error] (70007)The timeout specified has expired: proxy: AJP: attempt to connect to 10.176.201.10:8009 (10.176.201.10) failed
[Mon Mar 22 18:39:47 2010] [error] ap_proxy_connect_backend disabling worker for (10.176.201.10)
[Mon Mar 22 18:39:47 2010] [error] proxy: AJP: failed to make connection to backend: 10.176.201.10
[Mon Mar 22 18:39:47 2010] [error] (70007)The timeout specified has expired: proxy: AJP: attempt to connect to 10.176.201.9:8009 (10.176.201.9) failed
[Mon Mar 22 18:39:47 2010] [error] ap_proxy_connect_backend disabling worker for (10.176.201.9)
[Mon Mar 22 18:39:47 2010] [error] proxy: AJP: failed to make connection to backend: 10.176.201.9
[Mon Mar 22 18:39:47 2010] [error] (70007)The timeout specified has expired: proxy: AJP: attempt to connect to 10.176.219.168:8009 (10.176.219.168) failed
[Mon Mar 22 18:39:47 2010] [error] ap_proxy_connect_backend disabling worker for (10.176.219.168)
[Mon Mar 22 18:39:47 2010] [error] proxy: AJP: failed to make connection to backend: 10.176.219.168
[Mon Mar 22 18:39:47 2010] [error] proxy: BALANCER: (balancer://mycluster). All workers are in error state
[Mon Mar 22 18:39:47 2010] [error] proxy: BALANCER: (balancer://mycluster). All workers are in error state
[Mon Mar 22 18:39:47 2010] [error] proxy: BALANCER: (balancer://mycluster). All workers are in error state
[Mon Mar 22 18:39:47 2010] [error] proxy: BALANCER: (balancer://mycluster). All workers are in error state
[Mon Mar 22 18:39:47 2010] [error] proxy: BALANCER: (balancer://mycluster). All workers are in error state
[Mon Mar 22 18:39:47 2010] [error] proxy: BALANCER: (balancer://mycluster). All workers are in error state
Balanceador de carga top
info:
top - 23:44:11 up 210 days, 4:32, 1 user, load average: 0.10, 0.11, 0.09
Tasks: 135 total, 2 running, 133 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.1%us, 0.2%sy, 0.0%ni, 99.2%id, 0.1%wa, 0.0%hi, 0.1%si, 0.3%st
Mem: 524508k total, 517132k used, 7376k free, 9124k buffers
Swap: 1048568k total, 352k used, 1048216k free, 334720k cached
Tomcat top
info:
top - 23:47:12 up 210 days, 3:07, 1 user, load average: 0.02, 0.04, 0.00
Tasks: 63 total, 1 running, 62 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.2%us, 0.0%sy, 0.0%ni, 99.8%id, 0.1%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 2097372k total, 2080888k used, 16484k free, 21464k buffers
Swap: 4194296k total, 380k used, 4193916k free, 1520912k cached
Catalina.out não possui mensagens de erro.
De acordo com o status do servidor do Apache, parece estar chegando ao limite de 143 solicitações / seg. Acredito que os servidores suportem substancialmente mais carga do que eles, portanto, quaisquer sugestões sobre limites padrão baixos ou outras razões pelas quais essa configuração seria maximizada seriam muito apreciadas.