Benchmarking Linux I/O Performance

4

I've just put together a poor man's SAN with iSCSI and I want to benchmark it. What are some good I/O performance benchmarks for Linux, besides:

hdparm -tT /dev/sda1

How do I get an IOPS measurement?

Thanks.

    
by Justin 07.01.2011 / 09:18

6 answers

4

I would recommend using bonnie++ for disk performance testing. It is made specifically for this kind of thing.
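For illustration, a minimal run might look like the sketch below; the mount point, file size and user are assumptions (the test file should be roughly twice your RAM so the page cache doesn't dominate the result):

bonnie++ -d /mnt/test -s 8g -n 0 -u nobody

Here -d points at a directory on the filesystem under test, -s sets the test file size, -n 0 skips the small-file creation tests, and -u is only needed when running as root.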

    
by 07.01.2011 / 09:35
11

At least on Linux, every synthetic benchmarking answer should mention fio - it really is a Swiss Army knife of an I/O generator.

A brief summary of its capabilities:

  • It can generate I/O against devices or files
  • It can submit I/O using several different methods
    • sync, psync, vsync
    • native/POSIX aio, mmap, splice
  • It can queue I/O up to a specified depth
  • You can specify the size the I/O is submitted in
  • You can specify the type of I/O
    • Sequential/random
      • If the I/O is random, you can specify which distribution to skew it towards, to make it more realistic
    • Reads/writes or some mixture of the two
    • I/O recorded with blktrace can be replayed

The statistics it gives you cover

  • Amount of I/O generated in MBytes
  • Average bandwidth
  • Submission/completion latency with minimum, maximum, average and standard deviation
  • IOPS
  • Average queue depth

The list of features and output it can produce goes on and on.

It won't produce a single unified number that represents everything at the end, but if you are serious about understanding storage performance, you will know that one number can't explain everything you need to know. Even Linus Torvalds thinks fio is good:

[G]et Jens' FIO code. It does things right [...] Anything else is suspect - forget about bonnie or other traditional tools.

Brendan Gregg (Performance Engineer at Netflix) has also mentioned fio positively:

My other favorite benchmarks are fio by @axboe [...]
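For illustration only (not part of the quoted material), a 4k random-read job that reports IOPS and latency could look like the sketch below; the file path, size, queue depth and runtime are all assumptions, and a write workload pointed at a raw device instead of a file would be destructive:

fio --name=randread-test --filename=/mnt/test/fio.dat --size=1G \
    --ioengine=libaio --direct=1 --rw=randread --bs=4k \
    --iodepth=32 --runtime=60 --time_based --group_reporting

The summary fio prints includes the IOPS, bandwidth and latency statistics listed above, which answers the original IOPS question directly.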

    
by 27.12.2013 / 19:09
2

May I suggest that you read these two posts/articles:

link link

In particular:

First, I would suggest using a more accurate and controllable tool to test performance. hdparm was designed to change IDE device parameters, and the test it does is quite basic. You also can't tell what is going on when using hdparm on compound devices such as LVM or iSCSI. Also, hdparm does not test write speed, which is not related to read speed, as there are different optimizations for each (write-back caches, read-ahead and prefetching algorithms, etc.).

I prefer to use the old & good dd command, which allows you to finely control block sizes, the length of the test and the use of the buffer-cache. It also gives you a nice, short report on the transfer rate. You can also choose to test buffer-cache performance.

Also, do realize that there are several layers involved here, including the filesystem. hdparm only tests access to the RAW device.

TEST COMMANDS
I suggest using the following commands for tests:

a) For raw devices, partitions, LVM volumes, software RAIDs and iSCSI LUNs (initiator side). A block size of 1M is fine to test bulk transfer speed for most modern devices. For TPS tests, please use small sizes like 4k (a small-block sketch follows the warning below). Change count to make the test more realistic (I suggest a long test, to measure the sustained rate against transitory interference). The "direct" flag (O_DIRECT) avoids using the buffer-cache, so the test results should be repeatable.

Write test: dd if=/dev/zero of=/dev/<device> bs=1M count=1024 oflag=direct
Read test: dd if=/dev/<device> of=/dev/null bs=1M count=1024 iflag=direct

Example output for dd with 512x1M blocks: 536870912 bytes (537 MB) copied, 10.1154 s, 53.1 MB/s

The WRITE test is DESTRUCTIVE!!!!!! You should do it BEFORE CREATING A FILESYSTEM ON THE DEVICE!!!! On raw devices, beware that the partition table will be erased. You should force the kernel to reread the partition table in that case to avoid problems (with fdisk). However, performance on the whole device and on a single partition should be the same.
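As a rough sketch of the small-block (TPS) case mentioned above - the device name is a placeholder and the numbers are only an example - dd reports MB/s, so divide by the block size to approximate IOPS:

dd if=/dev/<device> of=/dev/null bs=4k count=262144 iflag=direct

262144 blocks of 4k make 1 GiB; a reported rate of, say, 20 MB/s at bs=4k corresponds to roughly 20 MB / 4 KB ≈ 5,000 read IOPS.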

b) For a filesystem, just change the device for a file name under the mount point.
Write test: dd if=/dev/zero of=/mount-point/test.dat bs=1M count=1024 oflag=direct
Read test: dd if=/mount-point/test.dat of=/dev/null bs=1M count=1024 iflag=direct

Note that even when accessing a file, we are not using the buffer-cache.

c) For the network, just test raw TCP sockets in both directions between the servers. Beware of the firewall blocking TCP port 5001.

server2# netcat -l -p 5001 | dd of=/dev/null
server1# dd if=/dev/zero bs=1M count=1024 | netcat <server2> 5001
(Start the listener on server2 first.)
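To cover the other direction, as suggested above, the roles can simply be swapped (host names are placeholders):

server1# netcat -l -p 5001 | dd of=/dev/null
server2# dd if=/dev/zero bs=1M count=1024 | netcat <server1> 5001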

TEST LAYERS
Now you have a tool to test disk performance for each layer. Just follow this sequence:

a) Test local disk performance on the iSCSI servers.
b) Test network TCP performance between iSCSI targets and initiators.
c) Test disk performance on the iSCSI LUNs on the iSCSI initiator (this is the final raw performance of the iSCSI protocol).
d) Test performance on the LVM logical volume.
e) Test performance on large files on top of the filesystem.

There should be a large performance gap between the layer responsible for the loss and the layer before it; wherever you see that gap, you have found the culprit. But I don't think this is LVM; I suspect the filesystem layer.

Now some tips for possible problems:

a) You didn't describe whether you defined a striped LVM volume on the iSCSI LUNs. Striping could create a bottleneck if synchronous writing were used on the iSCSI targets (see the issue with atime below). Remember that the default iSCSI target behaviour is synchronous write (no RAM caching).
b) You didn't describe the kind of access pattern to your files:
- Long sequential transfers of large amounts of data (100s of MB)?
- Sequences of small-block random accesses?
- Many small files?

I may be wrong, but I suspect that your system could be suffering from the effects of the "ATIME" issue. The "atime" issue is a consequence of original ideas about Linux kernel design, which we have suffered from in recent years because of people eager to participate in the design of an OS who are not familiar with the performance implications of design decisions.

In just a few words: for almost 40 years, UNIX has updated the "last access time" of an inode each time a single read/write operation is done on its file. The buffer cache holds data updates that don't propagate to disk for a while. However, in the Linux design, each update to an inode's ATIME has to be written SYNCHRONOUSLY AND IMMEDIATELY to disk. Just consider the implications of interleaving synchronous transfers in a stream of operations on top of the iSCSI protocol.

To check if this applies, just do this test:
- Read a long file (for at least 30 seconds) without using the cache. Of course with dd!!!
- At the same time, monitor the I/O with "iostat -k 5".

If you observe a small, but continuous flow of write operations while reading data, it could be the inode updates.
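A concrete version of that check, run in two terminals, might be (the file name is a placeholder for any large file on the filesystem under test):

# terminal 1: long, uncached sequential read
dd if=/mount-point/bigfile.dat of=/dev/null bs=1M iflag=direct
# terminal 2: watch for a steady trickle of writes while the read runs
iostat -k 5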

Solution: The thing is becoming so weird with Linux that they have added a mount option to some filesystems (XFS, EXT3, etc.) to disable the update of atime. Of course that makes filesystem semantics different from the POSIX standard. Some applications that observe the last access time of files could fail (mostly email readers and servers like pine, elm, Cyrus, etc.). Just remount your filesystem with the options "noatime,nodiratime". There is also "relatime" on recent distributions, which only updates "atime" when it is older than the modification time, greatly reducing the number of atime writes.
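A minimal sketch of that remount, assuming the filesystem in question is mounted at /mount-point (the same options can be made permanent in /etc/fstab):

mount -o remount,noatime,nodiratime /mount-point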

Please, drop a note about results of these tests and the result of your investigation.

    
by 07.01.2011 / 09:26
1

It depends on the purpose of the disk. The quickest and simplest way is "dd", as tmow mentioned, but I would additionally recommend iozone and orion.

  1. IOzone, in my opinion, is more accurate at file benchmarking than bonnie++ (a minimal invocation is sketched after this list).
  2. Orion ("ORacle IO Numbers", from Oracle) is very scalable and can benchmark properly even very large/powerful storage, and I find it very useful for sizing storage for databases. (I collect orion results from different disk arrays, disk controllers and RAID configurations and then compare them.)
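For illustration, a small IOzone run along those lines might look like this; the test path and file size are assumptions, and the file should be larger than RAM so the page cache doesn't dominate the result:

iozone -i 0 -i 1 -i 2 -r 4k -s 8g -f /mnt/test/iozone.tmp

Here -i selects the tests (0 = write/rewrite, 1 = read/re-read, 2 = random read/write), -r the record size, -s the file size and -f the test file location.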
by 07.01.2011 / 11:27
1

Disk performance benchmark:

$ dd if=/dev/zero of=/tmp/zero bs=1k count=100k
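Note that the command above writes through the page cache, so it mostly measures RAM; a rough variant that forces the data to disk before reporting is:

$ dd if=/dev/zero of=/tmp/zero bs=1k count=100k conv=fdatasync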

Network benchmark:

$ yes | pv | ssh $host "cat > /dev/null"
    
by 07.01.2011 / 14:03
0

In addition to the others, you can also use the PostMark benchmark.

    
by 07.01.2011 / 12:46
