Contents of tmp.txt
A01 11814111 11814112 GA AA
A01 11485477 11485519 AG AT
A01 11667935 11667971 TC TA
A01 11876070 11876079 TC TG
A01 11613258 11613277 AC GC
A01 11876079 11876107 CA GA
A01 11616453 11616463 TA TG
A01 11875367 11875368 GG GA
A01 11667971 11667993 CA AA
A01 11564406 11564411 TA TG
A01 11477215 11477235 TG CG
Contents of tmp.awk
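{
    # First characters of $4 and $5 differ: report them with the position in $2.
    if (substr($4,1,1) != substr($5,1,1)) {
        print $1 "_" $2 " " substr($4,1,1) " " substr($5,1,1);
    }
    # Second characters of $4 and $5 differ: report them with the position in $3.
    if (substr($4,2,1) != substr($5,2,1)) {
        print $1 "_" $3 " " substr($4,2,1) " " substr($5,2,1);
    }
}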
Sample output
[user@server ~]$ awk -f tmp.awk tmp.txt
A01_11814111 G A
A01_11485519 G T
A01_11667971 C A
A01_11876079 C G
A01_11613258 A G
A01_11876079 C G
A01_11616463 A G
A01_11875368 G A
A01_11667971 C A
A01_11564411 A G
A01_11477215 T C
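The two if blocks in tmp.awk follow the same pattern, so they can be folded into a loop. What follows is only a minimal sketch, assuming the same five-column layout in which column 2 pairs with the first character and column 3 with the second; the function name diff_bases and the pos array are hypothetical names introduced here for illustration.

```shell
#!/bin/sh
# diff_bases: hypothetical helper name. Reads rows laid out like tmp.txt
# from the given files (or from stdin) and prints one line per differing character.
diff_bases() {
    awk '{
        pos[1] = $2; pos[2] = $3    # position paired with the 1st and 2nd character
        for (i = 1; i <= 2; i++) {
            a = substr($4, i, 1)
            b = substr($5, i, 1)
            if (a != b) print $1 "_" pos[i], a, b
        }
    }' "$@"
}

# Two rows copied from tmp.txt, fed on stdin.
printf '%s\n' 'A01 11814111 11814112 GA AA' 'A01 11485477 11485519 AG AT' | diff_bases
```

Calling diff_bases tmp.txt should print the same lines as the sample output above, since the loop visits the first and second characters in the same order as the original script.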
Bonus: in bash
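#!/bin/bash
# Read tmp.txt line by line; -r keeps backslashes literal.
while read -r line
do
    # Split the line on whitespace into the positional parameters $1..$5.
    set -- $line
    # First characters of $4 and $5 differ: report them with the position in $2.
    if [ "${4:0:1}" != "${5:0:1}" ]
    then printf '%s_%s %s %s\n' "$1" "$2" "${4:0:1}" "${5:0:1}"
    fi
    # Second characters differ: report them with the position in $3.
    if [ "${4:1:1}" != "${5:1:1}" ]
    then printf '%s_%s %s %s\n' "$1" "$3" "${4:1:1}" "${5:1:1}"
    fi
done < tmp.txt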
Sample output
A01_11814111 G A
A01_11485519 G T
A01_11667971 C A
A01_11876079 C G
A01_11613258 A G
A01_11876079 C G
A01_11616463 A G
A01_11875368 G A
A01_11667971 C A
A01_11564411 A G
A01_11477215 T C