cat vs grep vs awk: which is the most efficient and least time-consuming way to print a file's contents?

Question

#1 answer by Sparhawk (2 votes)
#2 answer by Himanshu Chauhan (0 votes)

1

Say I have a file with content like this:

this is a simple file for testing purpose
with few lines in it.
to check the cat and grep command to verfy which is best and less excution time consuming

First, I tried:

time cat temp.txt

Output:

this is a simple file for testing purpose
with few lines in it.
to check the cat and grep command to verfy which is best and less excution time consuming

real    0m0.001s
user    0m0.000s
sys     0m0.001s

Second, I tried:

time grep "$"  temp.txt

Output:

this is a simple file for testing purpose
with few lines in it.
to check the cat and grep command to verfy which is best and less excution time consuming

real    0m0.002s
user    0m0.000s
sys     0m0.002s

Third, I tried:

time awk  "/$/"  temp.txt

Output:

this is a simple file for testing purpose
with few lines in it.
to check the cat and grep command to verfy which is best and less excution time consuming

real    0m0.004s
user    0m0.001s
sys     0m0.004s

With awk 1:

time awk 1 temp.txt

Output:

this is a simple file for testing purpose
with few lines in it.
to check the cat and grep command to verfy which is best and less excution time consuming

real    0m0.004s
user    0m0.000s
sys     0m0.003s

With sed:

time sed "" temp.txt

Output:

this is a simple file for testing purpose
with few lines in it.
to check the cat and grep command to verfy which is best and less excution time consuming

real    0m0.002s
user    0m0.000s
sys     0m0.002s

Does this mean that cat is the best command for printing the entire contents of a file, since it takes the least time to execute?

by snoop 24.12.2014 / 10:16

2 answers

0

I have two equivalent scripts. One uses cat to feed awk, and the other uses only awk.

Here is the first:

#!/bin/bash

# Count the lines (one per account) in /etc/passwd.
lines=$(cat /etc/passwd | wc -l)

for ((i = 1; i <= lines; i++))
do
    # passwd(5) fields: 1 = user name, 3 = UID, 4 = GID, 7 = login shell
    user=$(cat /etc/passwd | awk -F : -v i="$i" 'NR==i {print $1}')
    uid=$(cat /etc/passwd | awk -F : -v i="$i" 'NR==i {print $3}')
    gid=$(cat /etc/passwd | awk -F : -v i="$i" 'NR==i {print $4}')
    shell=$(cat /etc/passwd | awk -F : -v i="$i" 'NR==i {print $7}')
    echo -e "User is : $user \t Uid is : $uid \t Gid is : $gid \t Shell is : $shell"
done

Here is the second:

#!/bin/bash

# Count the lines (one per account) in /etc/passwd without cat or wc.
lines=$(awk 'END {print NR}' /etc/passwd)

for ((i = 1; i <= lines; i++))
do
    # passwd(5) fields: 1 = user name, 3 = UID, 4 = GID, 7 = login shell
    user=$(awk -F : -v i="$i" 'NR==i {print $1}' /etc/passwd)
    uid=$(awk -F : -v i="$i" 'NR==i {print $3}' /etc/passwd)
    gid=$(awk -F : -v i="$i" 'NR==i {print $4}' /etc/passwd)
    shell=$(awk -F : -v i="$i" 'NR==i {print $7}' /etc/passwd)
    echo -e "User is : $user \t Uid is : $uid \t Gid is : $gid \t Shell is : $shell"
done

The time taken by the first script (the one that uses cat) is:

real    0m0.215s
user    0m0.023s
sys     0m0.238s

For the second script, which uses only awk, the time taken is:

real    0m0.132s
user    0m0.013s
sys     0m0.123s

I think letting awk read the file itself is much faster than calling another external command to read the file and pipe it in. I would welcome a discussion of these results.

I think awk performs better in some cases.
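If process startup is what dominates here, the loop can be avoided altogether: each iteration of the scripts above starts four awk processes (plus four cat processes in the first script), whereas a single awk invocation can read the file once and print all four fields. The following is a minimal sketch under that assumption, not a measured result. list_users is a hypothetical helper name, and the field numbers (1, 3, 4, 7) follow the passwd(5) layout.

```shell
#!/bin/bash

# list_users (hypothetical helper): print user, UID, GID and shell for every
# line of a passwd-style file using one awk process.
# Usage: list_users [FILE]   (FILE defaults to /etc/passwd)
list_users() {
    local file="${1:-/etc/passwd}"
    # passwd(5) fields: 1 = user name, 3 = UID, 4 = GID, 7 = login shell
    awk -F: '{
        printf "User is : %s \t Uid is : %s \t Gid is : %s \t Shell is : %s\n", $1, $3, $4, $7
    }' "$file"
}

list_users
```

Timing it the same way as the two scripts (for example with `time ./script.sh`) would show whether the number of spawned processes, rather than cat versus awk as such, accounts for the difference.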

by Himanshu Chauhan 22.07.2016 / 07:34

score 2 · Accepted Answer

The answer is "yes". Initially, this is more of an assertion, since cat merely reads the file, whereas the other two scan it for an expression. Your time tests are the right idea, but at these extremely short durations, any slight variance will give erroneous results. It is much better to use a larger file, or to repeat the command many times.

$ time for i in {1..1000}; do cat temp.txt; done
...
real    0m0.762s
user    0m0.060s
sys     0m0.147s

$ time for i in {1..1000}; do grep "$" temp.txt; done
...
real    0m3.128s
user    0m0.667s
sys     0m0.263s

$ time for i in {1..1000}; do awk "/$/" temp.txt; done
...
real    0m3.332s
user    0m0.720s
sys     0m0.337s

Also (not shown), I ran the above commands several times to confirm that each command took about the same time on every run, and hence that the results were replicable.
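The repeated runs can be scripted. The sketch below is one possible way to do it, not the exact procedure used above: bench is a hypothetical helper that times a number of repetitions of a command, and it discards the command's output so that terminal rendering does not affect the measurement. The sample file, the repetition count and the three trials are arbitrary choices.

```shell
#!/bin/bash

# bench (hypothetical helper): time REPS runs of a command, output discarded.
# Usage: bench REPS COMMAND [ARGS...]
bench() {
    local reps=$1 i
    shift
    printf '%s\n' "$*" >&2          # label the timing report with the command
    time for ((i = 0; i < reps; i++)); do
        "$@" > /dev/null
    done
}

# Small sample file, similar in size to temp.txt in the question.
sample=$(mktemp)
printf 'this is a simple file for testing purpose\nwith few lines in it.\n' > "$sample"

reps=100
for trial in 1 2 3; do
    echo "== trial $trial ==" >&2
    bench "$reps" cat "$sample"
    bench "$reps" grep '$' "$sample"
    bench "$reps" awk '/$/' "$sample"
done

rm -f "$sample"
```

If the numbers for one command differ widely between trials, the repetition count is probably too low for the comparison to be meaningful.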

Further benchmarks

As per the comments, here are some more commands I tested. On my system, grep "^" and awk "1" showed no appreciable gain in efficiency, although sed "" came close to cat.

$ time for i in {1..1000}; do grep "^" temp.txt; done
...
real    0m2.992s
user    0m0.527s
sys     0m0.303s

$ time for i in {1..1000}; do awk "1" temp.txt; done
...
real    0m3.185s
user    0m0.570s
sys     0m0.317s

$ time for i in {1..1000}; do sed "" temp.txt; done
...
real    0m0.983s
user    0m0.077s
sys     0m0.193s
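The other option mentioned above, a larger file, can be sketched as follows. This is an illustration only, with no timings claimed: make_big_file is a hypothetical helper, the line count of 200000 is arbitrary, and the output is sent to /dev/null so that only reading and processing are timed. With one large file each command starts only once, so the comparison reflects per-line work rather than process startup, and results will vary by system.

```shell
#!/bin/bash

# make_big_file (hypothetical helper): write COUNT copies of a sample line to FILE.
# Usage: make_big_file FILE COUNT
make_big_file() {
    local file=$1 count=$2
    yes 'this is a simple file for testing purpose' | head -n "$count" > "$file"
}

big=$(mktemp)
make_big_file "$big" 200000

# One run of each command over the same large file.
time cat "$big" > /dev/null
time grep '$' "$big" > /dev/null
time awk '/$/' "$big" > /dev/null
time sed '' "$big" > /dev/null

rm -f "$big"
```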