I'm running a SmartMachine on Joyent. I believe these are some kind of virtual machine running Solaris. We have a web application on this machine running on Apache, PHP, and MySQL. It handles our moderate amount of traffic very well. However, every night since we went live, the site starts returning 403 Forbidden errors until Apache is restarted. A quick look at the Apache error log reveals the following:
[Tue Oct 26 23:13:00 2010] [error] server reached MaxClients setting, consider raising the MaxClients setting
[Wed Oct 27 13:09:40 2010] [error] (24)Too many open files: Cannot open SSLSessionCache DBM file '/var/run/ssl_scache' for reading (fetch)
[Wed Oct 27 13:09:40 2010] [error] (24)Too many open files: Cannot open SSLSessionCache DBM file '/var/run/ssl_scache' for writing (store)
[Wed Oct 27 13:09:40 2010] [error] [client 98.25.133.36] PHP Fatal error: Unknown: Failed opening required '/home/jill/web/content/index.php' (include_path='.:/opt/local/lib/php') in Unknown on line 0, referer: https://[redacted]/presentations/present#
[Wed Oct 27 13:09:42 2010] [error] (24)Too many open files: Cannot open SSLSessionCache DBM file '/var/run/ssl_scache' for reading (fetch)
[Wed Oct 27 13:09:42 2010] [error] (24)Too many open files: Cannot open SSLSessionCache DBM file '/var/run/ssl_scache' for writing (store)
[Wed Oct 27 13:09:42 2010] [crit] [client 68.193.4.75] (24)Too many open files: /home/jill/web/content/.htaccess pcfg_openfile: unable to check htaccess file, ensure it is readable, referer: https://[redacted]/presentations/present#
[Wed Oct 27 13:09:43 2010] [error] (24)Too many open files: Cannot open SSLSessionCache DBM file '/var/run/ssl_scache' for reading (fetch)
[Wed Oct 27 13:09:43 2010] [error] (24)Too many open files: Cannot open SSLSessionCache DBM file '/var/run/ssl_scache' for writing (store)
[Wed Oct 27 13:09:43 2010] [crit] [client 72.28.224.201] (24)Too many open files: /home/jill/web/content/.htaccess pcfg_openfile: unable to check htaccess file, ensure it is readable, referer: https://[redacted]/presentations/present#
[Wed Oct 27 13:09:44 2010] [error] (24)Too many open files: Cannot open SSLSessionCache DBM file '/var/run/ssl_scache' for reading (fetch)
[Wed Oct 27 13:09:44 2010] [error] (24)Too many open files: Cannot open SSLSessionCache DBM file '/var/run/ssl_scache' for writing (store)
[Wed Oct 27 13:09:44 2010] [crit] [client 72.28.224.201] (24)Too many open files: /home/jill/web/content/.htaccess pcfg_openfile: unable to check htaccess file, ensure it is readable, referer: https://[redacted]/presentations/present#
The last three lines are repeated for every request made to the server. I'm really at a loss as to how to prevent this. I tried raising the number of files that can be opened using prctl, but I must not be using it correctly, because prctl still reports 1.02K for the basic privilege after I tried to set it to 65.5K. I'm not even sure that would be a sound fix:
prctl -i process -n process.max-file-descriptor `pgrep httpd`
process: 18284: /opt/local/sbin/httpd -k start
NAME PRIVILEGE VALUE FLAG ACTION RECIPIENT
process.max-file-descriptor
basic 1.02K - deny 18284
privileged 65.5K - deny -
system 2.15G max deny -
process: 18285: /opt/local/sbin/httpd -k start
NAME PRIVILEGE VALUE FLAG ACTION RECIPIENT
process.max-file-descriptor
basic 1.02K - deny 18285
privileged 65.5K - deny -
system 2.15G max deny -
So, what is the best way to track down and fix a problem like this?
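As a starting point for tracking this down, here is a minimal sketch that tallies the "Too many open files" entries per hour, to show when the descriptors actually run out. The function name emfile_per_hour is made up for illustration, and it assumes the timestamp layout shown in the log excerpt above.

```shell
# Hypothetical helper: count errno 24 ("Too many open files") entries per hour.
# Reads an Apache error log on stdin. Assumes timestamps look like
# [Wed Oct 27 13:09:40 2010], as in the excerpt above.
emfile_per_hour() {
    grep 'Too many open files' |
        # $2 = month, $3 = day, $4 = HH:MM:SS; keep only the hour
        awk '{ split($4, t, ":"); print $2, $3, t[1] ":00" }' |
        sort | uniq -c
}

# Usage (path taken from the pfiles output in this post):
#   emfile_per_hour < /var/log/httpd/error.log
```

Each output line is a count followed by month, day, and hour. The lines are sorted as text, not chronologically.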
UPDATE
Here is the pfiles output for the root httpd process.
[root@fe5txrad ~]# pfiles 18269
18269: /opt/local/sbin/httpd -k start
Current rlimit: 1024 file descriptors
0: S_IFCHR mode:0666 dev:304,8 ino:3020727013 uid:0 gid:3 rdev:13,2
O_RDONLY
/dev/null
1: S_IFCHR mode:0666 dev:304,8 ino:3020727013 uid:0 gid:3 rdev:13,2
O_WRONLY|O_CREAT|O_TRUNC
/dev/null
2: S_IFREG mode:0640 dev:182,65550 ino:362926 uid:0 gid:0 size:20551848
O_WRONLY|O_APPEND|O_CREAT|O_LARGEFILE
/var/log/httpd/error.log
3: S_IFDOOR mode:0444 dev:313,0 ino:38 uid:0 gid:0 size:0
O_RDONLY|O_LARGEFILE FD_CLOEXEC door to nscd[18176]
/var/run/name_service_door
4: S_IFSOCK mode:0666 dev:311,0 ino:43693 uid:0 gid:0 size:0
O_RDWR|O_NONBLOCK FD_CLOEXEC
SOCK_STREAM
SO_REUSEADDR,SO_KEEPALIVE,SO_SNDBUF(49152),SO_RCVBUF(49152)
sockname: AF_INET 0.0.0.0 port: 80
5: S_IFSOCK mode:0666 dev:311,0 ino:42512 uid:0 gid:0 size:0
O_RDWR|O_NONBLOCK FD_CLOEXEC
SOCK_STREAM
SO_REUSEADDR,SO_KEEPALIVE,SO_SNDBUF(49152),SO_RCVBUF(49152)
sockname: AF_INET 0.0.0.0 port: 443
6: S_IFIFO mode:0000 dev:301,0 ino:8763127 uid:0 gid:0 size:0
O_RDWR|O_NONBLOCK FD_CLOEXEC
7: S_IFIFO mode:0000 dev:301,0 ino:8763127 uid:0 gid:0 size:0
O_RDWR FD_CLOEXEC
8: S_IFREG mode:0640 dev:182,65550 ino:362927 uid:0 gid:0 size:1450493
O_WRONLY|O_APPEND|O_CREAT|O_LARGEFILE FD_CLOEXEC
/var/log/httpd/access.log
9: S_IFREG mode:0644 dev:182,65550 ino:369102 uid:1000 gid:1000 size:528239971
O_WRONLY|O_APPEND|O_CREAT|O_LARGEFILE FD_CLOEXEC
/home/jill/logs/access_log
10: S_IFREG mode:0644 dev:182,65550 ino:369102 uid:1000 gid:1000 size:528239971
O_WRONLY|O_APPEND|O_CREAT|O_LARGEFILE FD_CLOEXEC
/home/jill/logs/access_log
11: S_IFREG mode:0644 dev:308,39 ino:3386326219 uid:0 gid:0 size:0
O_WRONLY|O_CREAT|O_EXCL|O_LARGEFILE FD_CLOEXEC
12: S_IFREG mode:0644 dev:308,39 ino:3088492558 uid:0 gid:0 size:0
O_WRONLY|O_CREAT|O_EXCL|O_LARGEFILE FD_CLOEXEC
advisory write lock set by process 7350
13: S_IFSOCK mode:0666 dev:311,0 ino:6452 uid:0 gid:0 size:0
O_RDWR FD_CLOEXEC
SOCK_STREAM
SO_SNDBUF(16384),SO_RCVBUF(5120)
sockname: AF_UNIX
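To compare this against the 1024 rlimit, a rough sketch that counts the descriptor entries in pfiles output might help. The function name count_fds is made up for illustration, and it assumes every descriptor line has the "N: S_IF..." shape shown above.

```shell
# Hypothetical helper: count open file descriptors in pfiles output read from
# stdin. Matches only lines of the form "<fd>: S_IF...", so the process header
# line ("18269: /opt/local/sbin/httpd ...") is not counted.
count_fds() {
    grep -c -E '^[[:space:]]*[0-9]+: S_IF'
}

# Usage on the machine itself, one line per httpd process:
#   for pid in $(pgrep httpd); do
#       printf '%s %s\n' "$pid" "$(pfiles "$pid" | count_fds)"
#   done
```

A child whose count keeps climbing toward 1024 between restarts would be the one to inspect more closely.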