This may be due to the layout described in this Wikipedia article: row hammer.
In dynamic RAM (DRAM), each bit of stored data occupies a separate memory cell […]
Memory cells (blue squares in the illustration) are further organized into matrices and addressed through rows and columns. A memory address applied to a matrix is broken into the row address and column address, which are processed by the row and column address decoders (in the illustration, vertical and horizontal green rectangles, respectively). After a row address selects the row for a read operation (the selection is also known as row activation), bits from all cells in the row are transferred into the sense amplifiers that form the row buffer (red squares in the illustration), from which the exact bit is selected using the column address.
Hypothesis: something is wrong with one bit of a certain row buffer (red squares), and this affects reads from any row that passes through that buffer. I am not claiming this will happen every time, for every row and for any data written. However, I believe this matrix layout and the row buffer (or something similar) have something to do with the fact that it is always the last bit of the first byte that failed.
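To make the hypothesis concrete, here is a toy model, not a description of any real DRAM chip. The 8x8 geometry, the names `ToyDram` and `FAULTY_COL`, and the "stuck at 0" behaviour are all assumptions made up for illustration. It splits an address into row and column, copies the selected row into a row buffer on every read, and corrupts one position of that buffer:

```python
ROWS, COLS = 8, 8   # hypothetical toy geometry, not a real chip layout
FAULTY_COL = 7      # assumed position of the bad sense amplifier


def split_address(addr):
    """Split a flat cell address into (row, column)."""
    return addr // COLS, addr % COLS


class ToyDram:
    def __init__(self):
        self.cells = [[0] * COLS for _ in range(ROWS)]

    def write(self, addr, bit):
        row, col = split_address(addr)
        self.cells[row][col] = bit

    def read(self, addr):
        row, col = split_address(addr)
        # Row activation: the whole row is copied into the row buffer.
        row_buffer = list(self.cells[row])
        # Assumed fault: one sense amplifier always reports 0.
        row_buffer[FAULTY_COL] = 0
        # The column address selects one bit out of the row buffer.
        return row_buffer[col]


def failing_addresses():
    """Write 1 to every cell and return the addresses that read back wrong."""
    dram = ToyDram()
    for addr in range(ROWS * COLS):
        dram.write(addr, 1)
    return [addr for addr in range(ROWS * COLS) if dram.read(addr) != 1]


if __name__ == "__main__":
    print([split_address(a) for a in failing_addresses()])
```

In this model every failing address has the same column and a different row, which is the kind of "always the same bit position" pattern the hypothesis tries to explain.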
Secondly, the memtest only failed a few tests, specifically it did not fail tests 0 through 2 […]. I am surprised that the first few tests never resulted in an error. Any reason why?
This manual explains what the tests are:
- Test 0 [Address test, walking ones, no cache]
Tests all address bits in all memory banks by using a walking ones address pattern.
- Test 1 [Address test, own address, Sequential]
Each address is written with its own address and then is checked for consistency. In theory previous tests should have caught any memory addressing problems. This test should catch any addressing errors that somehow were not previously detected. This test is done sequentially with each available CPU.
- Test 2 [Address test, own address, Parallel]
Same as test 1 but the testing is done in parallel using all CPUs and using overlapping addresses.
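The address tests quoted above can be sketched roughly as follows. This is not Memtest86's actual code. The 8-bit address width, the `AliasedMemory` class, and the "address line stuck at 0" fault are assumptions chosen to show the kind of error these tests target:

```python
ADDR_BITS = 8  # hypothetical toy address width


def walking_ones(bits=ADDR_BITS):
    """Addresses with exactly one bit set: 1, 2, 4, ..."""
    return [1 << i for i in range(bits)]


class AliasedMemory:
    """Toy memory; optionally one address line is stuck at 0 (assumed fault)."""

    def __init__(self, stuck_bit=None):
        # With a stuck line, two different addresses reach the same cell.
        self.mask = ~(1 << stuck_bit) if stuck_bit is not None else ~0
        self.store = {}

    def write(self, addr, value):
        self.store[addr & self.mask] = value

    def read(self, addr):
        return self.store.get(addr & self.mask, 0)


def own_address_test(mem, size=1 << ADDR_BITS):
    """Write each address with its own value, then return mismatching addresses."""
    for addr in range(size):
        mem.write(addr, addr)
    return [addr for addr in range(size) if mem.read(addr) != addr]


if __name__ == "__main__":
    print(walking_ones())
    print(len(own_address_test(AliasedMemory())))
    print(len(own_address_test(AliasedMemory(stuck_bit=3))))
```

With a healthy memory the own-address check reports nothing. With address line 3 stuck, every address that has bit 3 clear reads back the value written to its alias, so the fault shows up immediately.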
As I understand it, tests 0-2 are designed to catch addressing errors, not necessarily errors in the stored data itself. Note that if they were able to detect every error, the additional tests would not be needed.
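One way an address test can miss a data error: an own-address pattern writes only one value to each location, so each data bit is exercised in only one state. The sketch below is a guess at a possible mechanism, not a diagnosis of the actual module. The location `STUCK_ADDR`, the bit `STUCK_BIT`, and the stuck-at-0 behaviour are hypothetical:

```python
STUCK_ADDR = 0x100  # hypothetical faulty location
STUCK_BIT = 7       # hypothetical data bit that is stuck at 0


class StuckBitMemory:
    """Toy memory where one data bit at one address can never hold a 1."""

    def __init__(self):
        self.store = {}

    def write(self, addr, value):
        if addr == STUCK_ADDR:
            value &= ~(1 << STUCK_BIT)  # the faulty cell drops the bit
        self.store[addr] = value

    def read(self, addr):
        return self.store[addr]


def run_pattern(mem, pattern, size=0x200):
    """Write pattern(addr) to every address; return addresses that mismatch."""
    for addr in range(size):
        mem.write(addr, pattern(addr))
    return [addr for addr in range(size) if mem.read(addr) != pattern(addr)]


# Own-address pattern: 0x100 has bit 7 clear, so the stuck bit is never set.
own_address_errors = run_pattern(StuckBitMemory(), lambda addr: addr)

# All-ones data pattern: every bit is driven to 1, exposing the stuck bit.
all_ones_errors = run_pattern(StuckBitMemory(), lambda addr: 0xFFFFFFFF)

if __name__ == "__main__":
    print(own_address_errors, all_ones_errors)
```

Here the own-address pass finds no errors while the all-ones pass flags address `0x100`, which is consistent with the later data-pattern tests failing even though tests 0-2 passed.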