How to process multiple strings in a column

Question

1

I have a comma-separated file in a format similar to this:

aa.com,1.21.3.4,string1 string2 K=12     K2=23  K3=45 K4=56
bb.com,5.6.7.8,string1 string2 K=66     K2=77  K3=88 K4=99

I want to take the third column, which contains space-separated strings. I want to process the file so that the first two strings of the third column are separated by a comma and the remaining strings in column 3 are ignored. The first two fields do not contain spaces. Please note that the number of strings in the third column is not fixed across records. In this example there are 6 strings separated by 5 spaces, but there may be more or fewer.

All I need is to take the first two strings of the third column, separate them with a comma, and ignore the rest of the strings in column 3, giving:

aa.com,1.21.3.4,string1,string2
bb.com,5.6.7.8,string1,string2

grep text-processing awk string text-formatting

asked by user9371654 25.08.2018 / 09:26

4 answers

0

Try this:

awk -F '[, ]' '{print $1","$2","$3","$4}' file
aa.com,1.21.3.4,string1,string2
bb.com,5.6.7.8,string1,string2

answered 25.08.2018 / 09:48

0

You can do it as follows:

sed -ne 's/[[:blank:]]\{1,\}/,/;s//\n/;P' input-file.txt
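
For readers who want to see what each part of that one-liner does, here is the same program split into separate -e expressions with comments. This is only a sketch and assumes GNU sed (the \n in the replacement is not portable to every sed); first_two is a hypothetical wrapper name used purely for illustration.

```shell
# Hypothetical helper: reads records on stdin and keeps the first two
# space-separated strings of the last column, joined by a comma.
first_two() {
  # 1st expression: turn the first run of blanks into a comma.
  # 2nd expression: the empty regex reuses the previous regex, so the
  #                 next run of blanks becomes a newline marker.
  # 3rd expression: P prints the pattern space only up to that newline;
  #                 -n suppresses the normal automatic print.
  sed -n \
    -e 's/[[:blank:]]\{1,\}/,/' \
    -e 's//\n/' \
    -e 'P'
}

printf '%s\n' 'aa.com,1.21.3.4,string1 string2 K=12     K2=23  K3=45 K4=56' | first_two
```

If a record has only two strings in the third column, the second substitution finds nothing to replace and P prints the whole line, which is still the wanted result.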

por 25.08.2018 / 11:51

0

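awk -F "[, ]" -v OFS="," '{print $1,$2,$3,$4}' file

-F "[, ]" treats both space and comma as field separators, and -v OFS="," sets the output field separator to a comma before the first record is read. Assigning OFS inside the action after print would leave the first line joined with spaces.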
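
One detail about OFS that is easy to miss: an assignment only affects print statements that run after it. The sketch below compares assigning OFS inside the main action, after print, with assigning it up front through -v. The two-line sample and the variable names late and early are made up for illustration.

```shell
# Two-line sample in the question's format (hypothetical data).
sample='aa.com,1.21.3.4,string1 string2 K=12
bb.com,5.6.7.8,string1 string2 K=66'

# OFS assigned inside the action, after print: the first record is
# still joined with the default single space.
late=$(printf '%s\n' "$sample" | awk -F '[, ]' '{print $1,$2,$3,$4; OFS=","}')

# OFS assigned with -v (a BEGIN block would work too): every record
# is joined with commas.
early=$(printf '%s\n' "$sample" | awk -F '[, ]' -v OFS=',' '{print $1,$2,$3,$4}')

printf 'late:\n%s\nearly:\n%s\n' "$late" "$early"
```

With the late assignment only the second line comes out comma-separated, which is why setting OFS before the first record is the safer habit.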

answered 25.08.2018 / 18:53

score 1 · Accepted Answer

Try:

awk '{print $1, $2}' OFS=, infile
aa.com,1.21.3.4,string1,string2
bb.com,5.6.7.8,string1,string2

If, however, the first or second field could contain whitespace, you would do:

awk -F, '{ match($3, /[^ ]* +[^ ]*/); 
           bkup=substr($3, RSTART, RLENGTH);
           gsub(/ +/, ",", bkup); # replace spaces with comma
           print $1, $2, bkup
}' OFS=, infile
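
To see the difference, here is a small sketch that runs the same logic on a record whose first field contains a space. The sample record, the function name first_two_of_col3, and the variable name pair are assumptions for illustration; OFS is passed with -v here, which should be equivalent to the trailing OFS=, form above.

```shell
# Hypothetical wrapper around the match()-based approach; reads stdin.
first_two_of_col3() {
  awk -F, -v OFS=',' '{
    # Find "<non-spaces><spaces><non-spaces>" at the start of column 3.
    match($3, /[^ ]* +[^ ]*/)
    # Copy just the matched part, i.e. the first two strings.
    pair = substr($3, RSTART, RLENGTH)
    # Collapse the run of spaces between them into one comma.
    gsub(/ +/, ",", pair)
    print $1, $2, pair
  }'
}

# The first field contains a space, which would confuse whitespace splitting.
printf '%s\n' 'my host,1.21.3.4,string1 string2 K=12  K2=23' | first_two_of_col3
```

Because the record is split on commas only, the space inside the first field is left alone and only the third column is rewritten.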

Explanation, from man awk:

match(s, r [, a])  
          Return the position in s where the regular expression r occurs, 
          or 0 if r is not present, and set the values of RSTART and RLENGTH. (...)

substr(s, i [, n])
          Return the at most n-character substring of s starting at i.
          If n is omitted, use the rest of s.

RSTART
          The index of the first character matched by match(); 0 if no
          match.  (This implies that character indices start at one.)

RLENGTH
          The length of the string matched by match(); -1 if no match.
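
As a quick way to check those definitions, the following sketch prints RSTART and RLENGTH for one string that matches the pattern used above and one that does not. The helper name show_match and the two input strings are hypothetical.

```shell
# Hypothetical helper: report what match() sets for each input line.
show_match() {
  awk '{
    if (match($0, /[^ ]* +[^ ]*/))
      printf "RSTART=%d RLENGTH=%d substr=[%s]\n", RSTART, RLENGTH, substr($0, RSTART, RLENGTH)
    else
      printf "RSTART=%d RLENGTH=%d (no match)\n", RSTART, RLENGTH
  }'
}

# First line matches from position 1; second has no space, so nothing matches.
printf '%s\n' 'string1 string2 K=12' 'single' | show_match
```

The second line shows the no-match case: the pattern requires at least one space, so a third column holding a single string leaves RSTART at 0 and RLENGTH at -1.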