awk code / one-liner

0

The input:

21:55:40 9.68 portmap
21:55:40 7.46 Proc2
21:55:40 7.16 java
21:55:40 6.75 java
21:55:40 6.18 Proc1
21:55:40 5.95 Proc2
21:55:40 5.89 java
21:55:40 5.68 proc3
22:00:39 17.06 Proc1
22:00:39 13.00 Proc1
22:00:39 6.90 java
22:00:39 6.36 java
22:00:39 5.51 java
22:05:40 11.02 java
22:05:40 5.98 proc3
22:05:40 5.80 Proc1
22:05:40 5.37 Proc1
22:10:40 10.97 Proc1
22:10:40 10.19 Proc2
22:10:40 5.56 java
22:10:40 5.19 java
22:15:40 18.95 Proc1
22:20:40 31.89 Proc1
22:20:40 8.10 java
22:20:40 7.81 java
22:20:40 6.17 java
22:20:40 5.13 java
22:20:40 5.11 java
22:25:40 7.98 Proc2
22:25:40 6.36 java
22:25:40 5.34 java
22:25:40 5.31 java
22:30:40 8.52 Proc1
22:30:40 8.39 Proc2
22:30:40 7.29 Proc1
22:35:41 5.12 proc3
22:40:41 25.25 Proc2
22:45:40 15.82 Proc2
22:45:40 8.27 Proc1
22:50:41 19.94 Proc1
22:55:41 14.52 Proc1
23:05:41 45.58 Proc1
23:10:41 23.29 Proc2
23:10:41 5.06 java

For each timestamp, I want to measure how much load the processes generated, summed per "type" (by type I mean java, Proc1, Proc2, etc.), with the result sorted by time and then by load.

The expected result is the following:

    $ cut -d" " -f1 input_|sort -u|while read a;do grep ^$a input_|cut -d" " -f3|sort -u|while read v;do echo $a $v $(awk -va=$a -vv=$v '$1==a&&$3==v{sum+=$2}END {print sum}' input_);done|sort -t" " -nrk3;done
21:55:40 java 19.8
21:55:40 Proc2 13.41
21:55:40 portmap 9.68
21:55:40 Proc1 6.18
21:55:40 proc3 5.68
22:00:39 Proc1 30.06
22:00:39 java 18.77
22:05:40 Proc1 11.17
22:05:40 java 11.02
22:05:40 proc3 5.98
22:10:40 Proc1 10.97
22:10:40 java 10.75
22:10:40 Proc2 10.19
22:15:40 Proc1 18.95
22:20:40 java 32.32
22:20:40 Proc1 31.89
22:25:40 java 17.01
22:25:40 Proc2 7.98
22:30:40 Proc1 15.81
22:30:40 Proc2 8.39
22:35:41 proc3 5.12
22:40:41 Proc2 25.25
22:45:40 Proc2 15.82
22:45:40 Proc1 8.27
22:50:41 Proc1 19.94
22:55:41 Proc1 14.52
23:05:41 Proc1 45.58
23:10:41 Proc2 23.29
23:10:41 java 5.06

How can I make this work as a single-line awk command?

    
by DonJ 07.11.2017 / 16:05

1 answer

3

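# gawk only: sum[time][process] is an array of arrays.
# Timestamps are walked in ascending string order; within each timestamp,
# processes are walked by descending summed load, as the question asks.
gawk '
    { sum[$1][$3] += $2 }
    END {
        PROCINFO["sorted_in"] = "@ind_str_asc";
        for (time in sum) {
            PROCINFO["sorted_in"] = "@val_num_desc";
            for (process in sum[time])
                print time, process, sum[time][process]
        }
    }
' file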

For a one-liner, remove the newlines in the awk body.

Not GNU-specific:

awk '{sum[$1 OFS $3] += $2} END {for (key in sum) print key, sum[key]}' file | sort
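
The plain `sort` in the pipeline above orders the lines by time and then by process name, not by load. A minimal sketch of one way to get descending load within each timestamp, assuming a `sort` that accepts POSIX `-k` key definitions: sort on field 1 as text, then on field 3 numerically in reverse. The file name `sample_input.txt` and its contents (a subset of the question's data) are only illustrative, and `LC_ALL=C` is an assumption made here to keep the decimal point and byte ordering independent of the locale.

```shell
# Hypothetical sample file holding a few lines from the question's input
cat > sample_input.txt <<'EOF'
21:55:40 9.68 portmap
21:55:40 7.46 Proc2
21:55:40 7.16 java
21:55:40 6.75 java
21:55:40 5.95 Proc2
22:00:39 17.06 Proc1
22:00:39 13.00 Proc1
22:00:39 6.90 java
EOF

# Sum the load per "time process" key, then sort by time (field 1)
# and, within the same time, by summed load (field 3), largest first
result=$(awk '{ sum[$1 OFS $3] += $2 } END { for (key in sum) print key, sum[key] }' sample_input.txt |
    LC_ALL=C sort -k1,1 -k3,3nr)

printf '%s\n' "$result"
# 21:55:40 java 13.91
# 21:55:40 Proc2 13.41
# 21:55:40 portmap 9.68
# 22:00:39 Proc1 30.06
# 22:00:39 java 6.9
```

Since the sorting happens outside awk, this should work with any POSIX awk, at the cost of one extra process in the pipeline.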
    
by 07.11.2017 / 16:20
