As far as I can tell, this problem is isolated to PowerDNS. The servers are running two packages, pdns-static-3.0.1-1.i386.rpm and pdns-recursor-3.3-1.i386.rpm, on the latest version of Amazon Linux.
Amazon EC2 load balancers are given a CNAME that resolves to several hosts. Below is an example of the actual behavior. Note that the hosts are always returned in the same order.
[root@localhost ~]# host cache.domain.com
cache.domain.com is an alias for xxxxx.us-east-1.elb.amazonaws.com.
xxxxx.us-east-1.elb.amazonaws.com has address aaa.aaa.aaa.aaa
xxxxx.us-east-1.elb.amazonaws.com has address bbb.bbb.bbb.bbb
[root@localhost ~]# host cache.domain.com
cache.domain.com is an alias for xxxxx.us-east-1.elb.amazonaws.com.
xxxxx.us-east-1.elb.amazonaws.com has address aaa.aaa.aaa.aaa
xxxxx.us-east-1.elb.amazonaws.com has address bbb.bbb.bbb.bbb
[root@localhost ~]# host cache.domain.com
cache.domain.com is an alias for xxxxx.us-east-1.elb.amazonaws.com.
xxxxx.us-east-1.elb.amazonaws.com has address aaa.aaa.aaa.aaa
xxxxx.us-east-1.elb.amazonaws.com has address bbb.bbb.bbb.bbb
The expected behavior is round robin across the hosts:
[root@localhost ~]# host cache.domain.com
cache.domain.com is an alias for xxxxx.us-east-1.elb.amazonaws.com.
xxxxx.us-east-1.elb.amazonaws.com has address aaa.aaa.aaa.aaa
xxxxx.us-east-1.elb.amazonaws.com has address bbb.bbb.bbb.bbb
[root@localhost ~]# host cache.domain.com
cache.domain.com is an alias for xxxxx.us-east-1.elb.amazonaws.com.
xxxxx.us-east-1.elb.amazonaws.com has address bbb.bbb.bbb.bbb
xxxxx.us-east-1.elb.amazonaws.com has address aaa.aaa.aaa.aaa
[root@localhost ~]# host cache.domain.com
cache.domain.com is an alias for xxxxx.us-east-1.elb.amazonaws.com.
xxxxx.us-east-1.elb.amazonaws.com has address aaa.aaa.aaa.aaa
xxxxx.us-east-1.elb.amazonaws.com has address bbb.bbb.bbb.bbb
The addresses do eventually swap, but they appear to be on a 30-minute cache timer, and changing the record's TTL does not seem to affect anything. It looks like the resolver is caching the response. This hurts my application because all of the load is sent to only one of the load balancers (availability zones), so if I have servers in two zones, only one zone is under load at any given time.
Does anyone know how I can fix this so that the order of the addresses rotates every time the host is resolved?
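One way to quantify the problem rather than eyeballing repeated `host` calls is to resolve the name many times and count which address comes back first. The sketch below is only an assumption about how one might measure it: `first_address_counts` and its `resolve` parameter are hypothetical names, and the default resolver uses Python's standard `socket.gethostbyname_ex`, which goes through the system resolver and may therefore be influenced by local caching as well.

```python
import socket
from collections import Counter


def first_address_counts(hostname, attempts=20, resolve=None):
    """Resolve hostname repeatedly and count which address is listed first."""
    if resolve is None:
        # gethostbyname_ex returns (canonical_name, aliases, ipv4_addresses).
        def resolve(name):
            return socket.gethostbyname_ex(name)[2]

    counts = Counter()
    for _ in range(attempts):
        addresses = resolve(hostname)
        if addresses:
            counts[addresses[0]] += 1
    return counts
```

Calling `first_address_counts("cache.domain.com")` on an affected server would be expected to show a single address taking every first slot; with working rotation the counts should be roughly even.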
DIG OUTPUT
; <<>> DiG 9.7.6-P1-RedHat-9.7.6-1.P1.18.amzn1 <<>> cache.domain.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 54610
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
cache.domain.com. IN A
;; ANSWER SECTION:
cache.domain.com. 100 IN CNAME xxxxx.us-east-1.elb.amazonaws.com.
xxxxx.us-east-1.elb.amazonaws.com. 3 IN A aaa.aaa.aaa.aaa
xxxxx.us-east-1.elb.amazonaws.com. 3 IN A bbb.bbb.bbb.bbb
;; Query time: 0 msec
;; SERVER: ccc.ccc.ccc.ccc#53(ccc.ccc.ccc.ccc)
;; WHEN: Mon Jul 2 15:09:27 2012
;; MSG SIZE rcvd: 130
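The TTLs in the answer section are worth a second look: the CNAME shows 100 seconds and the A records only 3 seconds, so a 30-minute window is not explained by the record TTLs themselves. Here is a rough Python sketch for pulling those values out of saved `dig` output. The helper name `parse_answer_records` is hypothetical, and it assumes the usual `name ttl class type value` layout seen above.

```python
SAMPLE = """;; ANSWER SECTION:
cache.domain.com. 100 IN CNAME xxxxx.us-east-1.elb.amazonaws.com.
xxxxx.us-east-1.elb.amazonaws.com. 3 IN A aaa.aaa.aaa.aaa
xxxxx.us-east-1.elb.amazonaws.com. 3 IN A bbb.bbb.bbb.bbb
;; Query time: 0 msec"""


def parse_answer_records(dig_text):
    """Extract (name, ttl, rtype, value) tuples from dig's ANSWER SECTION."""
    records = []
    in_answer = False
    for raw in dig_text.splitlines():
        line = raw.strip()
        if line.startswith(";; ANSWER SECTION"):
            in_answer = True
            continue
        if in_answer:
            # A blank line or the next ';' line ends the answer section.
            if not line or line.startswith(";"):
                break
            fields = line.split()
            # Assumed layout: name ttl class type value
            if len(fields) >= 5:
                name, ttl, _cls, rtype = fields[:4]
                records.append((name, int(ttl), rtype, " ".join(fields[4:])))
    return records
```

Running `parse_answer_records(SAMPLE)` returns one CNAME record and two A records, which makes it easy to compare the TTLs across repeated queries.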
Recursor configuration
allow-from=0.0.0.0/0
dont-query=
local-address=127.0.0.1
local-port=530 # Port should be changed to 530 because it is not good to run on the same port as the DNS server
quiet=yes
setgid=pdns
setuid=pdns
disable-packetcache=
packetcache-ttl=0
forward-zones=domain.local=LOCALIP,domain.cloud=LOCALIP # Forward the two zones we care about back to the local dns server
forward-zones-recurse=amazonaws.com=172.16.0.23,compute-1.internal=172.16.0.23 # Forward queries for Amazon's domains to Amazon's resolver
SOLUTION
Add the following lines to recursor.conf:
disable-packetcache=
packetcache-ttl=0
Add the following line to pdns.conf:
recursive-cache-ttl=0
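If the change has to be rolled out to several servers, a small check can confirm that each config file actually carries these settings. The Python sketch below is an illustration only: `read_settings` and `missing_settings` are hypothetical helpers, and the parsing (plain `key=value` lines, with `#` starting a comment) is a simplifying assumption rather than PowerDNS's own parser.

```python
def read_settings(conf_text):
    """Parse key=value lines, ignoring blank lines and '#' comments (assumed syntax)."""
    settings = {}
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def missing_settings(conf_text, required):
    """Return the required keys that are absent or set to a different value."""
    found = read_settings(conf_text)
    return [key for key, value in required.items() if found.get(key) != value]


# Values taken from the solution above.
RECURSOR_REQUIRED = {"disable-packetcache": "", "packetcache-ttl": "0"}
PDNS_REQUIRED = {"recursive-cache-ttl": "0"}
```

For example, `missing_settings(open("recursor.conf").read(), RECURSOR_REQUIRED)` should return an empty list once both lines are in place; the file path here is a placeholder for wherever the config lives on your system.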