Kernel modules won't load for NFS over RoCE on Ubuntu 16.04 with the latest drivers/software


I'm having trouble with NFS over RoCE on Ubuntu 16.04 using the latest OFED package provided by Mellanox (MLNX_OFED_LINUX-3.3-1.0.4.0-ubuntu16.04-x86_64.tgz). My cards are Mellanox 10GbE and are enabled for RoCE v1.

Works with the inbox drivers/software, but not with the latest OFED

I was able to get NFS working over RoCE by following the documents on this site, using the inbox drivers/software (the ones included with Ubuntu 16.04). I was having a few minor problems, and I know the Ubuntu-supplied stack is out of date, so I wanted to install the latest OFED/mlx4 drivers, etc., as recommended on mellanox.com. So I did that. Everything went as planned. IP functionality is all there and all of the RDMA tools/tests work. Everything seems to work great. Except one thing.

The svcrdma and xprtrdma modules won't load, so I have no RDMA support for NFS. I get the errors shown below. I also get the same errors if I install only the latest mlx4 driver from Mellanox's site and leave the rest of the packages alone.
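One possible first check, sketched here under assumptions: `modinfo -n` prints the file a module would be loaded from, which can show whether rpcrdma and the InfiniBand core (ib_core) come from the same source. The helper name `module_origin` is hypothetical, and the directory names it matches (`kernel/` for modules shipped with the kernel package, `updates/` or `extra/` for separately installed ones) are assumptions about a typical layout under /lib/modules, not something confirmed for this system.

```shell
# module_origin PATH
# Hypothetical helper: guess from the install path whether a .ko file
# shipped with the kernel or was installed separately. The directory
# names are assumptions about a common /lib/modules layout.
module_origin() {
    case "$1" in
        */updates/*|*/extra/*) echo "out-of-tree" ;;
        */kernel/*)            echo "in-tree" ;;
        *)                     echo "unknown" ;;
    esac
}

# Example use on the affected machine (results depend on the system):
#   module_origin "$(modinfo -n rpcrdma)"
#   module_origin "$(modinfo -n ib_core)"
```

If the two modules report different origins, that would be consistent with rpcrdma having been built against a different InfiniBand core than the one currently installed.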

I have a feeling this can be fixed somehow, such as by recompiling kernel modules, but that is over my head at the moment. Or maybe I just messed something up (fingers crossed)? Can anyone help?

Someone commented on this Mellanox community article that they had the same problem with Ubuntu 14.04: link. According to the same document, it should work fine with CentOS 7. What's the difference?

The end result I want is to have the latest driver and (preferably) software working on Ubuntu 16.04 with NFS over RoCE. If not the latest OFED package, then at least the latest mlx4 driver. I read somewhere that newer kernel versions will have updated drivers and RDMA code (I forget most of what I read). If all else fails, my answer may be to wait for a newer version of Ubuntu.

Thanks

Error messages when loading modules

NFS server:

# modprobe svcrdma 
modprobe: ERROR: could not insert 'rpcrdma': Invalid argument 

dmesg errors:

[105699.696980] rpcrdma: Unknown symbol rdma_event_msg (err 0) 
[105699.697056] rpcrdma: disagrees about version of symbol ib_create_cq 
[105699.697059] rpcrdma: Unknown symbol ib_create_cq (err -22) 
[105699.697069] rpcrdma: disagrees about version of symbol rdma_resolve_addr 
[105699.697071] rpcrdma: Unknown symbol rdma_resolve_addr (err -22) 
[105699.697183] rpcrdma: Unknown symbol ib_event_msg (err 0) 
[105699.697213] rpcrdma: disagrees about version of symbol ib_dereg_mr 
[105699.697215] rpcrdma: Unknown symbol ib_dereg_mr (err -22) 
[105699.697224] rpcrdma: disagrees about version of symbol ib_query_qp 
[105699.697226] rpcrdma: Unknown symbol ib_query_qp (err -22) 
[105699.697236] rpcrdma: disagrees about version of symbol rdma_disconnect 
[105699.697238] rpcrdma: Unknown symbol rdma_disconnect (err -22) 
[105699.697245] rpcrdma: disagrees about version of symbol ib_alloc_fmr 
[105699.697247] rpcrdma: Unknown symbol ib_alloc_fmr (err -22) 
[105699.697294] rpcrdma: disagrees about version of symbol ib_dealloc_fmr 
[105699.697295] rpcrdma: Unknown symbol ib_dealloc_fmr (err -22) 
[105699.697301] rpcrdma: disagrees about version of symbol rdma_resolve_route 
[105699.697303] rpcrdma: Unknown symbol rdma_resolve_route (err -22) 
[105699.697398] rpcrdma: disagrees about version of symbol rdma_bind_addr 
[105699.697400] rpcrdma: Unknown symbol rdma_bind_addr (err -22) 
[105699.697441] rpcrdma: disagrees about version of symbol rdma_create_qp 
[105699.697443] rpcrdma: Unknown symbol rdma_create_qp (err -22) 
[105699.697479] rpcrdma: Unknown symbol ib_map_mr_sg (err 0) 
[105699.697487] rpcrdma: disagrees about version of symbol ib_destroy_cq 
[105699.697489] rpcrdma: Unknown symbol ib_destroy_cq (err -22) 
[105699.697494] rpcrdma: disagrees about version of symbol rdma_create_id 
[105699.697496] rpcrdma: Unknown symbol rdma_create_id (err -22) 
[105699.697582] rpcrdma: disagrees about version of symbol rdma_listen 
[105699.697584] rpcrdma: Unknown symbol rdma_listen (err -22) 
[105699.697587] rpcrdma: disagrees about version of symbol rdma_destroy_qp 
[105699.697589] rpcrdma: Unknown symbol rdma_destroy_qp (err -22) 
[105699.697597] rpcrdma: disagrees about version of symbol ib_query_device 
[105699.697599] rpcrdma: Unknown symbol ib_query_device (err -22) 
[105699.697606] rpcrdma: disagrees about version of symbol ib_get_dma_mr 
[105699.697607] rpcrdma: Unknown symbol ib_get_dma_mr (err -22) 
[105699.697617] rpcrdma: disagrees about version of symbol ib_alloc_pd 
[105699.697618] rpcrdma: Unknown symbol ib_alloc_pd (err -22) 
[105699.697673] rpcrdma: Unknown symbol ib_alloc_mr (err 0) 
[105699.697734] rpcrdma: disagrees about version of symbol rdma_connect 
[105699.697736] rpcrdma: Unknown symbol rdma_connect (err -22) 
[105699.697769] rpcrdma: Unknown symbol ib_wc_status_msg (err 0) 
[105699.697842] rpcrdma: disagrees about version of symbol rdma_destroy_id 
[105699.697844] rpcrdma: Unknown symbol rdma_destroy_id (err -22) 
[105699.697872] rpcrdma: disagrees about version of symbol rdma_accept 
[105699.697874] rpcrdma: Unknown symbol rdma_accept (err -22) 
[105699.697882] rpcrdma: disagrees about version of symbol ib_destroy_qp 
[105699.697883] rpcrdma: Unknown symbol ib_destroy_qp (err -22) 
[105699.697964] rpcrdma: disagrees about version of symbol ib_dealloc_pd 
[105699.697965] rpcrdma: Unknown symbol ib_dealloc_pd (err -22)

NFS client:

# modprobe xprtrdma          
modprobe: ERROR: could not insert 'rpcrdma': Invalid argument

dmesg errors:

[106055.692454] rpcrdma: Unknown symbol rdma_event_msg (err 0) 
[106055.692480] rpcrdma: disagrees about version of symbol ib_create_cq 
[106055.692481] rpcrdma: Unknown symbol ib_create_cq (err -22) 
[106055.692484] rpcrdma: disagrees about version of symbol rdma_resolve_addr 
[106055.692485] rpcrdma: Unknown symbol rdma_resolve_addr (err -22) 
[106055.692520] rpcrdma: Unknown symbol ib_event_msg (err 0) 
[106055.692529] rpcrdma: disagrees about version of symbol ib_dereg_mr 
[106055.692530] rpcrdma: Unknown symbol ib_dereg_mr (err -22) 
[106055.692532] rpcrdma: disagrees about version of symbol ib_query_qp 
[106055.692533] rpcrdma: Unknown symbol ib_query_qp (err -22) 
[106055.692536] rpcrdma: disagrees about version of symbol rdma_disconnect 
[106055.692536] rpcrdma: Unknown symbol rdma_disconnect (err -22) 
[106055.692538] rpcrdma: disagrees about version of symbol ib_alloc_fmr 
[106055.692539] rpcrdma: Unknown symbol ib_alloc_fmr (err -22) 
[106055.692552] rpcrdma: disagrees about version of symbol ib_dealloc_fmr 
[106055.692553] rpcrdma: Unknown symbol ib_dealloc_fmr (err -22) 
[106055.692554] rpcrdma: disagrees about version of symbol rdma_resolve_route 
[106055.692555] rpcrdma: Unknown symbol rdma_resolve_route (err -22) 
[106055.692565] rpcrdma: disagrees about version of symbol rdma_bind_addr 
[106055.692565] rpcrdma: Unknown symbol rdma_bind_addr (err -22) 
[106055.692573] rpcrdma: disagrees about version of symbol rdma_create_qp 
[106055.692574] rpcrdma: Unknown symbol rdma_create_qp (err -22) 
[106055.692583] rpcrdma: Unknown symbol ib_map_mr_sg (err 0) 
[106055.692585] rpcrdma: disagrees about version of symbol ib_destroy_cq 
[106055.692585] rpcrdma: Unknown symbol ib_destroy_cq (err -22) 
[106055.692587] rpcrdma: disagrees about version of symbol rdma_create_id 
[106055.692587] rpcrdma: Unknown symbol rdma_create_id (err -22) 
[106055.692613] rpcrdma: disagrees about version of symbol rdma_listen 
[106055.692614] rpcrdma: Unknown symbol rdma_listen (err -22) 
[106055.692615] rpcrdma: disagrees about version of symbol rdma_destroy_qp 
[106055.692615] rpcrdma: Unknown symbol rdma_destroy_qp (err -22) 
[106055.692617] rpcrdma: disagrees about version of symbol ib_query_device 
[106055.692618] rpcrdma: Unknown symbol ib_query_device (err -22) 
[106055.692619] rpcrdma: disagrees about version of symbol ib_get_dma_mr 
[106055.692620] rpcrdma: Unknown symbol ib_get_dma_mr (err -22) 
[106055.692622] rpcrdma: disagrees about version of symbol ib_alloc_pd 
[106055.692623] rpcrdma: Unknown symbol ib_alloc_pd (err -22) 
[106055.692638] rpcrdma: Unknown symbol ib_alloc_mr (err 0) 
[106055.692657] rpcrdma: disagrees about version of symbol rdma_connect 
[106055.692658] rpcrdma: Unknown symbol rdma_connect (err -22) 
[106055.692668] rpcrdma: Unknown symbol ib_wc_status_msg (err 0) 
[106055.692690] rpcrdma: disagrees about version of symbol rdma_destroy_id 
[106055.692690] rpcrdma: Unknown symbol rdma_destroy_id (err -22) 
[106055.692698] rpcrdma: disagrees about version of symbol rdma_accept 
[106055.692699] rpcrdma: Unknown symbol rdma_accept (err -22) 
[106055.692701] rpcrdma: disagrees about version of symbol ib_destroy_qp 
[106055.692701] rpcrdma: Unknown symbol ib_destroy_qp (err -22) 
[106055.692724] rpcrdma: disagrees about version of symbol ib_dealloc_pd 
[106055.692725] rpcrdma: Unknown symbol ib_dealloc_pd (err -22)
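The two listings above mix two kinds of failure, and separating them may make the log easier to read. The sketch below is only an aid for summarizing the output, not a fix. It rests on an interpretation that is not confirmed in this thread: a "disagrees about version of symbol" line (followed by "err -22") is taken to mean the symbol exists but with a different version checksum, while "Unknown symbol ... (err 0)" is taken to mean no loaded module exports that symbol at all. The function name `classify_rpcrdma_errors` is hypothetical.

```shell
# classify_rpcrdma_errors
# Hypothetical helper: reads dmesg-style lines on stdin and prints one
# line per symbol, either "mismatch <symbol>" or "missing <symbol>".
classify_rpcrdma_errors() {
    awk '
        # Symbol is present, but its version does not match.
        /disagrees about version of symbol/ { mismatch[$NF] = 1; next }

        # Symbol could not be found at all (reported with err 0).
        /Unknown symbol/ && /\(err 0\)/ {
            for (i = 1; i < NF; i++)
                if ($i == "symbol") missing[$(i + 1)] = 1
        }

        END {
            for (s in mismatch) print "mismatch", s
            for (s in missing)  print "missing", s
        }
    ' | sort
}

# Example use (output depends on the system):
#   dmesg | grep rpcrdma | classify_rpcrdma_errors
```

Applied to the logs above, the "missing" group would contain rdma_event_msg, ib_event_msg, ib_map_mr_sg, ib_alloc_mr and ib_wc_status_msg, and every other symbol would fall in the "mismatch" group.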
    
by Ryan Babchishin 14.09.2016 / 14:51

0 answers