New system with PCIe errors, need help debugging

I am getting some errors and was hoping someone could help me debug them. First, what do they mean? Second, what further debugging steps should I take, and is there a complete solution?

I am running an Aorus Gaming 7 motherboard with a Threadripper 1950X CPU and an Nvidia 1070 with the latest drivers.

Here is a link to the paste

system log
-------------------------
8/23/17 9:30 PM -x399   kernel  [19510.161819] dpc 0000:00:01.1:pcie010: DPC containment event, status:0x1f00 source:0x0000
8/23/17 9:30 PM -x399   kernel  [19510.161833] pcieport 0000:00:01.1: AER: Corrected error received: id=0000
8/23/17 9:30 PM -x399   kernel  [19510.161837] pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=0009(Receiver ID)
8/23/17 9:30 PM -x399   kernel  [19510.161840] pcieport 0000:00:01.1:   device [1022:1453] error status/mask=00000040/00006000
8/23/17 9:30 PM -x399   kernel  [19510.161842] pcieport 0000:00:01.1:    [ 6] Bad TLP               
8/23/17 9:31 PM -x399   kernel  [19539.323943] dpc 0000:00:01.1:pcie010: DPC containment event, status:0x1f00 source:0x0000
8/23/17 9:31 PM -x399   kernel  [19539.323957] pcieport 0000:00:01.1: AER: Corrected error received: id=0000
8/23/17 9:31 PM -x399   kernel  [19539.323961] pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=0009(Receiver ID)
8/23/17 9:31 PM -x399   kernel  [19539.323964] pcieport 0000:00:01.1:   device [1022:1453] error status/mask=00000040/00006000
8/23/17 9:31 PM -x399   kernel  [19539.323967] pcieport 0000:00:01.1:    [ 6] Bad TLP               
8/23/17 9:42 PM -x399   kernel  [20194.657679] dpc 0000:00:01.1:pcie010: DPC containment event, status:0x1f00 source:0x0000
8/23/17 9:42 PM -x399   kernel  [20194.657692] pcieport 0000:00:01.1: AER: Corrected error received: id=0000
8/23/17 9:42 PM -x399   kernel  [20194.657696] pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=0009(Receiver ID)
8/23/17 9:42 PM -x399   kernel  [20194.657699] pcieport 0000:00:01.1:   device [1022:1453] error status/mask=00000040/00006000
8/23/17 9:42 PM -x399   kernel  [20194.657702] pcieport 0000:00:01.1:    [ 6] Bad TLP

lspci output
-------------------------
00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1450
00:00.2 IOMMU: Advanced Micro Devices, Inc. [AMD] Device 1451
00:01.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1452
00:01.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 1453
00:01.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 1453
00:02.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1452
00:03.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1452
00:04.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1452
00:07.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1452
00:07.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 1454
00:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1452
00:08.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 1454
00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller (rev 59)
00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge (rev 51)
00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1460
00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1461
00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1462
00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1463
00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1464
00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1465
00:18.6 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1466
00:18.7 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1467
00:19.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1460
00:19.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1461
00:19.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1462
00:19.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1463
00:19.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1464
00:19.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1465
00:19.6 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1466
00:19.7 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1467
01:00.0 USB controller: Advanced Micro Devices, Inc. [AMD] Device 43ba (rev 02)
01:00.1 SATA controller: Advanced Micro Devices, Inc. [AMD] Device 43b6 (rev 02)
01:00.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43b1 (rev 02)
02:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43b4 (rev 02)
02:02.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43b4 (rev 02)
02:03.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43b4 (rev 02)
02:04.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43b4 (rev 02)
03:00.0 USB controller: ASMedia Technology Inc. Device 1343
04:00.0 Network controller: Intel Corporation Device 24fd (rev 78)
05:00.0 Ethernet controller: Qualcomm Atheros Device e0b1 (rev 10)
07:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd Device a804
08:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Device 145a
08:00.2 Encryption controller: Advanced Micro Devices, Inc. [AMD] Device 1456
08:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Device 145c
09:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Device 1455
09:00.2 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 51)
09:00.3 Audio device: Advanced Micro Devices, Inc. [AMD] Device 1457
40:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1450
40:00.2 IOMMU: Advanced Micro Devices, Inc. [AMD] Device 1451
40:01.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1452
40:02.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1452
40:03.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1452
40:03.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 1453
40:04.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1452
40:07.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1452
40:07.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 1454
40:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1452
40:08.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 1454
41:00.0 VGA compatible controller: NVIDIA Corporation Device 1b81 (rev a1)
41:00.1 Audio device: NVIDIA Corporation Device 10f0 (rev a1)
42:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Device 145a
42:00.2 Encryption controller: Advanced Micro Devices, Inc. [AMD] Device 1456
42:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Device 145c
43:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Device 1455
43:00.2 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 51)
    
by Goddard 24.08.2017 / 04:21

1 answer

This problem seems to occur on many motherboards, from X99 on the Intel side to X399 on the AMD side.

I can give at least some details, even if I cannot fully explain what is happening.

I first thought TLP was a power-related issue, but after a little research I found that it actually stands for Transaction Layer Packet (TLP).

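The hardware usually detects the faulty packets itself, and the Linux kernel only reports them as log messages. In the log above, severity=Corrected means the link already recovered from the error: a Bad TLP is rejected at the Data Link Layer and retransmitted, so no data is lost.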
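To make the log lines from the question a bit more concrete, here is a small Python sketch that decodes the `error status/mask` field. It assumes the standard bit layout of the AER Correctable Error Status register; the function names are made up for this example and are not part of any tool.

```python
# Bit names of the PCIe AER Correctable Error Status register.
# Bits not listed here are reserved and are reported as "unknown".
AER_CORRECTABLE_BITS = {
    0: "Receiver Error",
    6: "Bad TLP",
    7: "Bad DLLP",
    8: "REPLAY_NUM Rollover",
    12: "Replay Timer Timeout",
    13: "Advisory Non-Fatal Error",
    14: "Corrected Internal Error",
    15: "Header Log Overflow",
}


def decode_correctable(value):
    """Return the names of all bits set in a status or mask value."""
    names = []
    for bit in range(32):
        if value & (1 << bit):
            names.append(AER_CORRECTABLE_BITS.get(bit, "unknown bit %d" % bit))
    return names


def parse_status_mask(line):
    """Extract (status, mask) from a 'error status/mask=XXXXXXXX/YYYYYYYY' log line."""
    field = line.split("status/mask=")[1].split()[0]
    status_hex, mask_hex = field.split("/")
    return int(status_hex, 16), int(mask_hex, 16)


# Line copied from the system log in the question.
line = "pcieport 0000:00:01.1:   device [1022:1453] error status/mask=00000040/00006000"
status, mask = parse_status_mask(line)
print("reported:", decode_correctable(status))
print("masked:  ", decode_correctable(mask))
```

The status value 0x40 has only bit 6 set, which matches the `[ 6] Bad TLP` line the kernel prints right after it. The mask value 0x6000 covers bits 13 and 14, that is, error types the port is configured not to report.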

The kernel option pci=nommconf disables the Memory-Mapped PCI Configuration Space (MMCONFIG). You can add it by editing the GRUB configuration file with this command.

sudo nano /etc/default/grub

Find the GRUB_CMDLINE_LINUX_DEFAULT variable and append the option below inside the quotes, at the end.

pci=nommconf

Mine looked like this afterwards.

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash pci=nommconf"
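If you would rather script the edit than do it by hand, the same change can be sketched in Python. This is only an illustration under a few assumptions: the variable is written on one line with double quotes, as in the example above, and `add_kernel_param` is a hypothetical helper name. It only transforms text; reading and writing /etc/default/grub is deliberately left out.

```python
import re


def add_kernel_param(grub_text, param="pci=nommconf"):
    """Append param to GRUB_CMDLINE_LINUX_DEFAULT unless it is already there."""
    pattern = re.compile(r'^(GRUB_CMDLINE_LINUX_DEFAULT=")([^"]*)(")', re.MULTILINE)

    def patch(match):
        # Split the existing options on whitespace and add the new one once.
        options = match.group(2).split()
        if param not in options:
            options.append(param)
        return match.group(1) + " ".join(options) + match.group(3)

    return pattern.sub(patch, grub_text)


sample = 'GRUB_DEFAULT=0\nGRUB_CMDLINE_LINUX_DEFAULT="quiet splash"\n'
print(add_kernel_param(sample))
```

Running the function a second time on its own output leaves the text unchanged, so it should be safe to apply repeatedly.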

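After saving the file, the GRUB configuration has to be regenerated (sudo update-grub on Ubuntu and Debian) and the machine rebooted before the option takes effect.

The errors could be caused by a hardware bug in the devices, in the controller, or in something else entirely.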

As far as I can tell, this option actually stops the errors rather than just suppressing the messages, which, without much deeper technical knowledge, seems like a good solution. Personally, though, I would keep looking for motherboard BIOS updates and kernel updates, and temporarily remove the option after each one to see whether the problem has been fixed.
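To check whether the errors come back after removing the option, it helps to count them instead of eyeballing the log. Here is a minimal Python sketch, assuming the kernel messages look like the ones in the question; `count_corrected_errors` is a name invented for this example.

```python
import re
from collections import Counter

# Matches lines such as:
#   pcieport 0000:00:01.1: AER: Corrected error received: id=0000
AER_LINE = re.compile(r"pcieport (\S+): AER: Corrected error received")


def count_corrected_errors(log_text):
    """Count 'AER: Corrected error received' lines per PCIe port address."""
    return Counter(AER_LINE.findall(log_text))


# Shortened sample taken from the system log in the question.
sample_log = """\
[19510.161833] pcieport 0000:00:01.1: AER: Corrected error received: id=0000
[19510.161837] pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=0009(Receiver ID)
[19539.323957] pcieport 0000:00:01.1: AER: Corrected error received: id=0000
[20194.657692] pcieport 0000:00:01.1: AER: Corrected error received: id=0000
"""

for port, count in count_corrected_errors(sample_log).items():
    print(port, count)
```

On a real system you would pass in the kernel log text instead of the sample, for example the saved output of dmesg, and compare the counts for the same uptime with and without the option.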

    
by Goddard 30.08.2017 / 23:01