mongo clients fail 25% of the time with "message len … is too large"

I am trying to use haproxy in tcp mode in front of a mongo server. On the haproxy machine I have a mongo client to test with.

Connecting from the haproxy machine directly to the mongo server works 100% of the time.

When I connect from the haproxy machine to the mongo server through haproxy, it fails to negotiate a good mongo connection roughly 25% of the time. The mongo client says: recv(): message len 1347703880 is too large. Max is 48000000

This does not appear to be a problem with the mongo client or server, since connecting directly works 100% of the time.
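
For what it's worth, 1347703880 read back as a little-endian 32-bit integer is the byte sequence "HTTP"; the mongo wire protocol takes the first four bytes of a reply as a little-endian int32 message length, so this is just the arithmetic behind the reported value (python3 is used here only for the conversion, any tool that converts the bytes would do):

    $ python3 -c "print((1347703880).to_bytes(4, 'little'))"
    b'HTTP'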

Servers in the scenario:

     10.5.198.10     haproxy and mongo client for testing
     10.5.20.20       mongo server running port 17010

Version information / haproxy machine & mongo client

    OS: Debian Jessie
    SMP Debian 3.16.7-ckt20-1+deb8u3 (2016-01-17) x86_64 GNU/Linux

    bluebrick@ip-10-5-198-10:~$ mongo --version
    MongoDB shell version: 2.4.10

    root@ip-10-5-198-10:~/tests/pmongo# haproxy -vv
    HA-Proxy version 1.6.3 2015/12/25
    Copyright 2000-2015 Willy Tarreau <[email protected]>
    Build options :
      TARGET  = linux2628
      CPU     = generic
      CC      = gcc
      CFLAGS  = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement
      OPTIONS = 
    Default settings :
      maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200
    Encrypted password support via crypt(3): yes
    Built without compression support (neither USE_ZLIB nor USE_SLZ are set)
    Compression algorithms supported : identity("identity")
    Built without OpenSSL support (USE_OPENSSL not set)
    Built without PCRE support (using libc's regex instead)
    Built without Lua support
    Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
    Available polling systems :
          epoll : pref=300,  test result OK
           poll : pref=200,  test result OK
         select : pref=150,  test result OK
    Total: 3 (3 usable), will use epoll.

Version information / mongo server

    Server OS: Ubuntu trusty
    14.04.1-Ubuntu SMP Tue Sep 1 09:32:55 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

    [email protected]:/home/bluebrick# mongod --version
    db version v3.2.1
    git version: a14d55980c2cdc565d4704a7e3ad37e4e535c1b2
    OpenSSL version: OpenSSL 1.0.1f 6 Jan 2014
    allocator: tcmalloc
    modules: none
    build environment:
        distmod: ubuntu1404
        distarch: x86_64
        target_arch: x86_64

haproxy conf file

    root@ip-10-5-198-10:~/tests/pmongo# cat conf.10
    ######################################################################
    global
    ######################################################################
            maxconn 2048
            log /dev/log    local0 
            log /dev/log  local1 debug
            chroot /var/lib/haproxy
            user haproxy
            group haproxy
            debug
    ######################################################################
    defaults
    ######################################################################
            log     global
            mode    tcp
            option tcplog
            timeout connect 5000
            timeout client  50000
            timeout server  50000
    ######################################################################
    # frontend
    ######################################################################
            frontend   fe_20_20_mongo_27010_tcp
            bind 10.5.198.10:27010
            mode tcp
            option tcplog
            use_backend    be_20_20_mongo_27010_tcp
    ######################################################################
    # backend
    ######################################################################
            backend   be_20_20_mongo_27010_tcp
            mode tcp
            option tcplog
            option             tcpka
            server node1 10.5.20.20:27010 
    ##################################################
    ##################################################
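
One quick sanity check (not shown above) is haproxy's built-in configuration test; this is just the stock -c flag, nothing specific to this setup:

    root@ip-10-5-198-10:~/tests/pmongo# haproxy -c -f conf.10

If all the sections parse cleanly it should report the configuration file as valid.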

When I connect to mongo bypassing haproxy, it looks like this:

    bluebrick@ip-10-5-198-10:~$ mongo 10.5.20.20:27010 -verbose
    MongoDB shell version: 2.4.10
    Sat Feb 27 13:12:46.776 versionArrayTest passed
    connecting to: 10.5.20.20:27010/test
    Sat Feb 27 13:12:46.798 creating new connection to:10.5.20.20:27010
    Sat Feb 27 13:12:46.799 BackgroundJob starting: ConnectBG
    Sat Feb 27 13:12:46.803 connected connection!
    Server has startup warnings: 
    2016-02-27T12:48:57.313-0500 I CONTROL  [initandlisten] ** WARNING: You are running this process as the root user, which is not recommended.
    2016-02-27T12:48:57.313-0500 I CONTROL  [initandlisten] 
    2016-02-27T12:48:57.313-0500 I CONTROL  [initandlisten] 
    2016-02-27T12:48:57.313-0500 I CONTROL  [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/enabled is 'always'.
    2016-02-27T12:48:57.313-0500 I CONTROL  [initandlisten] **        We suggest setting it to 'never'
    2016-02-27T12:48:57.313-0500 I CONTROL  [initandlisten] 
    2016-02-27T12:48:57.313-0500 I CONTROL  [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/defrag is 'always'.
    2016-02-27T12:48:57.313-0500 I CONTROL  [initandlisten] **        We suggest setting it to 'never'
    2016-02-27T12:48:57.313-0500 I CONTROL  [initandlisten] 
    rs0:SECONDARY> quit()
    bluebrick@ip-10-5-198-10:~$ 

When I run the haproxy server, it looks like this:

    root@ip-10-5-198-10:~/tests/pmongo# haproxy -d -f conf.10
    Available polling systems :
          epoll : pref=300,  test result OK
           poll : pref=200,  test result OK
         select : pref=150,  test result FAILED
    Total: 3 (2 usable), will use epoll.
    Using epoll() as the polling mechanism.
    00000000:fe_20_20_mongo_27010_tcp.accept(0004)=0006 from [10.5.198.10:43177]
    00000000:be_20_20_mongo_27010_tcp.srvcls[0006:0007]
    00000000:be_20_20_mongo_27010_tcp.clicls[0006:0007]
    00000000:be_20_20_mongo_27010_tcp.closed[0006:0007]
    00000001:fe_20_20_mongo_27010_tcp.accept(0004)=0006 from [10.5.198.10:43206]
    00000001:be_20_20_mongo_27010_tcp.srvcls[0006:0007]
    00000001:be_20_20_mongo_27010_tcp.clicls[0006:0007]
    00000001:be_20_20_mongo_27010_tcp.closed[0006:0007]

When I connect to mongo USING haproxy AND WHEN IT WORKS, it looks like this:

    bluebrick@ip-10-5-198-10:~$ mongo 10.5.198.10:27010 -verbose
    MongoDB shell version: 2.4.10
    Sat Feb 27 13:04:00.655 versionArrayTest passed
    connecting to: 10.5.198.10:27010/test
    Sat Feb 27 13:04:00.678 creating new connection to:10.5.198.10:27010
    Sat Feb 27 13:04:00.678 BackgroundJob starting: ConnectBG
    Sat Feb 27 13:04:00.678 connected connection!
    Server has startup warnings: 
    2016-02-27T12:48:57.313-0500 I CONTROL  [initandlisten] ** WARNING: You are running this process as the root user, which is not recommended.
    2016-02-27T12:48:57.313-0500 I CONTROL  [initandlisten] 
    2016-02-27T12:48:57.313-0500 I CONTROL  [initandlisten] 
    2016-02-27T12:48:57.313-0500 I CONTROL  [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/enabled is 'always'.
    2016-02-27T12:48:57.313-0500 I CONTROL  [initandlisten] **        We suggest setting it to 'never'
    2016-02-27T12:48:57.313-0500 I CONTROL  [initandlisten] 
    2016-02-27T12:48:57.313-0500 I CONTROL  [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/defrag is 'always'.
    2016-02-27T12:48:57.313-0500 I CONTROL  [initandlisten] **        We suggest setting it to 'never'
    2016-02-27T12:48:57.313-0500 I CONTROL  [initandlisten] 
    rs0:SECONDARY> quit()

When I connect to mongo USING haproxy AND WHEN IT FAILS, it looks like this:

    bluebrick@ip-10-5-198-10:~$ mongo 10.5.198.10:27010 -verbose
    MongoDB shell version: 2.4.10
    Sat Feb 27 13:04:03.900 versionArrayTest passed
    connecting to: 10.5.198.10:27010/test
    Sat Feb 27 13:04:03.922 creating new connection to:10.5.198.10:27010
    Sat Feb 27 13:04:03.922 BackgroundJob starting: ConnectBG
    Sat Feb 27 13:04:03.922 connected connection!
    Sat Feb 27 13:04:03.923 recv(): message len 1347703880 is too large. Max is 48000000
    Sat Feb 27 13:04:03.923 DBClientCursor::init call() failed
    Sat Feb 27 13:04:03.923 User Assertion: 10276:DBClientBase::findN: transport error: 10.5.198.10:27010 ns: admin.$cmd query: { whatsmyuri: 1 }
    Sat Feb 27 13:04:03.923 Error: DBClientBase::findN: transport error: 10.5.198.10:27010 ns: admin.$cmd query: { whatsmyuri: 1 } at src/mongo/shell/mongo.js:147
    Sat Feb 27 13:04:03.923 User Assertion: 12513:connect failed
    Sat Feb 27 13:04:03.923 freeing 1 uncollected N5mongo20DBClientWithCommandsE objects
    exception: connect failed
    bluebrick@ip-10-5-198-10:~$ 
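
To repeat the test without retyping, a loop along these lines can be used (the iteration count and the ping command are arbitrary choices, just enough to see the failure rate):

    bluebrick@ip-10-5-198-10:~$ for i in $(seq 1 20); do mongo --quiet 10.5.198.10:27010 --eval 'db.runCommand({ping: 1}).ok' || echo "attempt $i failed"; done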

Looking at the mongo server logs, a good connection looks like this:

    2016-02-27T12:53:14.944-0500 D STORAGE  [WTJournalFlusher] flushed journal
    2016-02-27T12:53:14.966-0500 I NETWORK  [initandlisten] connection accepted from 10.5.198.10:36447 #30 (9 connections now open)
    2016-02-27T12:53:14.966-0500 D COMMAND  [conn30] run command admin.$cmd { whatsmyuri: 1 }
    2016-02-27T12:53:14.966-0500 I COMMAND  [conn30] command admin.$cmd command: whatsmyuri { whatsmyuri: 1 } keyUpdates:0 writeConflicts:0 numYields:0 reslen:66 locks:{} protocol:op_query 0ms
    2016-02-27T12:53:14.968-0500 D COMMAND  [conn30] run command admin.$cmd { getLog: "startupWarnings" }
    2016-02-27T12:53:14.968-0500 D COMMAND  [conn30] command: getLog
    2016-02-27T12:53:14.968-0500 I COMMAND  [conn30] command admin.$cmd command: getLog { getLog: "startupWarnings" } keyUpdates:0 writeConflicts:0 numYields:0 reslen:949 locks:{} protocol:op_query 0ms
    2016-02-27T12:53:14.981-0500 D COMMAND  [conn30] run command admin.$cmd { replSetGetStatus: 1.0, forShell: 1.0 }
    2016-02-27T12:53:14.981-0500 D COMMAND  [conn30] command: replSetGetStatus
    2016-02-27T12:53:14.981-0500 I COMMAND  [conn30] command admin.$cmd command: replSetGetStatus { replSetGetStatus: 1.0, forShell: 1.0 } keyUpdates:0 writeConflicts:0 numYields:0 reslen:964 locks:{} protocol:op_query 0ms

Looking at the mongo server logs, a failed connection looks like this:

    Nothing at all is written to the mongo server logs in this scenario.
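
Since nothing shows up server-side for the failing attempts, one thing worth checking on the mongo server is whether a TCP connection even arrives for those attempts; listing mongod's TCP sockets while the test loop runs is enough for that (generic ss usage, run as root on the server):

    # ss -tnp | grep mongod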
    
by Jeanette Brown 27.02.2016 / 20:02

0 answers