We have an Apache server load-balancing download requests across two different servers. The mod_jk error log is filling up with errors like the following:
[Mon Feb 13 16:59:02.948 2012] [19453:139726932305664] [error] ajp_process_callback::jk_ajp_common.c (1800): ajp_unmarshal_response failed
[Mon Feb 13 16:59:02.948 2012] [19453:139726932305664] [info] ajp_service::jk_ajp_common.c (2540): (mrxdf3) sending request to tomcat failed (recoverable), because of server error (attempt=1)
[Mon Feb 13 16:59:03.048 2012] [19453:139726932305664] [info] ajp_send_request::jk_ajp_common.c (1490): (mrxdf3) did not receive END_RESPONSE, closing socket -1
[Mon Feb 13 16:59:03.054 2012] [19453:139726932305664] [error] ajp_unmarshal_response::jk_ajp_common.c (646): NULL status
[Mon Feb 13 16:59:03.054 2012] [19453:139726932305664] [error] ajp_process_callback::jk_ajp_common.c (1800): ajp_unmarshal_response failed
[Mon Feb 13 16:59:03.054 2012] [19453:139726932305664] [info] ajp_service::jk_ajp_common.c (2540): (mrxdf3) sending request to tomcat failed (recoverable), because of server error (attempt=2)
[Mon Feb 13 16:59:03.054 2012] [19453:139726932305664] [error] ajp_service::jk_ajp_common.c (2559): (mrxdf3) connecting to tomcat failed.
[Mon Feb 13 16:59:03.054 2012] [19453:139726932305664] [info] service::jk_lb_worker.c (1388): service failed, worker mrxdf3 is in error state
[Mon Feb 13 16:59:03.159 2012] [19453:139726932305664] [error] ajp_unmarshal_response::jk_ajp_common.c (646): NULL status
[Mon Feb 13 16:59:03.159 2012] [19453:139726932305664] [error] ajp_process_callback::jk_ajp_common.c (1800): ajp_unmarshal_response failed
[Mon Feb 13 16:59:03.159 2012] [19453:139726932305664] [info] ajp_service::jk_ajp_common.c (2540): (mrxdf2) sending request to tomcat failed (recoverable), because of server error (attempt=1)
[Mon Feb 13 16:59:03.259 2012] [19453:139726932305664] [info] ajp_send_request::jk_ajp_common.c (1490): (mrxdf2) did not receive END_RESPONSE, closing socket -1
[Mon Feb 13 16:59:03.263 2012] [19453:139726932305664] [error] ajp_unmarshal_response::jk_ajp_common.c (646): NULL status
[Mon Feb 13 16:59:03.264 2012] [19453:139726932305664] [error] ajp_process_callback::jk_ajp_common.c (1800): ajp_unmarshal_response failed
[Mon Feb 13 16:59:03.264 2012] [19453:139726932305664] [info] ajp_service::jk_ajp_common.c (2540): (mrxdf2) sending request to tomcat failed (recoverable), because of server error (attempt=2)
[Mon Feb 13 16:59:03.264 2012] [19453:139726932305664] [error] ajp_service::jk_ajp_common.c (2559): (mrxdf2) connecting to tomcat failed.
[Mon Feb 13 16:59:03.264 2012] [19453:139726932305664] [info] service::jk_lb_worker.c (1388): service failed, worker mrxdf2 is in local error state
[Mon Feb 13 16:59:03.264 2012] [19453:139726932305664] [info] service::jk_lb_worker.c (1457): All tomcat instances failed, no more workers left (attempt=1, retry=1)
[Mon Feb 13 16:59:03.264 2012] [19453:139726932305664] [info] service::jk_lb_worker.c (1468): All tomcat instances are busy or in error state
[Mon Feb 13 16:59:03.264 2012] [19453:139726932305664] [error] service::jk_lb_worker.c (1473): All tomcat instances failed, no more workers left
[Mon Feb 13 16:59:03.264 2012] [19453:139726932305664] [info] jk_handler::mod_jk.c (2618): Service error=0 for worker=lb_df_ajp13
We monitor the status worker and are a bit worried to see the various servers flipping from OK to error to OK to error, over and over.
The problem is that this is the only indication of an error we have. There are no actual error reports: users are not reporting broken downloads or being unable to reach the servers.
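To put a number on how often each worker drops into the error state, a small script along these lines could be run over the mod_jk log. This is only a rough sketch: it assumes the log keeps the exact "worker NAME is in error state" / "is in local error state" wording shown in the excerpt above, and the function name `count_error_states` and the `SAMPLE` list are made up for illustration.

```python
import re
from collections import Counter

# Matches the lb worker messages seen in the excerpt, e.g.
#   "service failed, worker mrxdf3 is in error state"
#   "service failed, worker mrxdf2 is in local error state"
# Assumption: other mod_jk versions may word this message differently.
ERROR_STATE = re.compile(r"worker (\S+) is in (?:local )?error state")

# Two lines copied from the log excerpt, used as sample input.
SAMPLE = [
    "[Mon Feb 13 16:59:03.054 2012] [19453:139726932305664] [info] "
    "service::jk_lb_worker.c (1388): service failed, worker mrxdf3 is in error state",
    "[Mon Feb 13 16:59:03.264 2012] [19453:139726932305664] [info] "
    "service::jk_lb_worker.c (1388): service failed, worker mrxdf2 is in local error state",
]


def count_error_states(lines):
    """Count, per worker name, how many log lines report an error state."""
    counts = Counter()
    for line in lines:
        match = ERROR_STATE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    for worker, hits in count_error_states(SAMPLE).most_common():
        print(f"{worker}: {hits}")
```

To run it against the real log, the `SAMPLE` list can be replaced with an open file handle, since the function accepts any iterable of lines.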
This is the mod_jk configuration:
# Minimal jk configuration
JkWorkerProperty worker.list=ajp13,api_ajp13,app_ajp13,status_ajp13,lb_df_ajp13
# web server
JkWorkerProperty worker.ajp13.type=ajp13
JkWorkerProperty worker.ajp13.host=web0.live.mbuyu.nl
JkWorkerProperty worker.ajp13.port=8009
# app
JkWorkerProperty worker.app_ajp13.type=ajp13
JkWorkerProperty worker.app_ajp13.host=app0.live.mbuyu.nl
JkWorkerProperty worker.app_ajp13.port=8009
# api server
JkWorkerProperty worker.api_ajp13.type=ajp13
JkWorkerProperty worker.api_ajp13.host=api0.live.mbuyu.nl
JkWorkerProperty worker.api_ajp13.port=8009
# DF Node0
JkWorkerProperty worker.mrxdf2.type=ajp13
JkWorkerProperty worker.mrxdf2.host=df2.live.mbuyu.nl
JkWorkerProperty worker.mrxdf2.port=8009
JkWorkerProperty worker.mrxdf2.lbfactor=1
# DF Node1
JkWorkerProperty worker.mrxdf3.type=ajp13
JkWorkerProperty worker.mrxdf3.host=df3.live.mbuyu.nl
JkWorkerProperty worker.mrxdf3.port=8009
JkWorkerProperty worker.mrxdf3.lbfactor=1
# JK Status worker
JkWorkerProperty worker.status_ajp13.type=status
# Load-balancer
JkWorkerProperty worker.lb_df_ajp13.type=lb
JkWorkerProperty worker.lb_df_ajp13.balanced_workers=mrxdf2,mrxdf3
JkWorkerProperty worker.lb_df_ajp13.sticky_session=1
JkWorkerProperty worker.lb_df_ajp13.local_worker_only=1
Should we be worried? Can we get rid of these errors?
mod_jk version = 1.2.30
Apache version = 2.2.16
The download servers run JBoss 6.1.
By the way, I am actually a developer and not a system administrator, but I get asked to help on that side occasionally. In this case the sysadmin wants to know what is wrong with the application, of course. All I can say is: nothing that I can find, internally or externally. We are now in the very uncomfortable position of ignoring an error warning because we think nothing is actually going wrong.