Command to replace part of a line

0

I have a lot of DNA sequences in my file. I would like to replace each header line, for example >xxx_3-13_00021^gcd with >gcd. Does anyone have a search-and-replace command for this?

>xxx_3-13_00021^gcd
TGCCGTTATTACAATCCGGCAGTCCATACGGCAGCTTTTGCCTTACCCCAGTATCTGCAA
GATGCACTGGCTTCACAGCCGTCCTAA
>yyy_3-13_00019^group_3912
ATGGCCGTTTGCGCAAACAGTTACGCGCTCAGCGAGTCTGAAGCCGAAGATATGGCCGAT
TTAACGGCAGTTTTTGTTTTTCTGAAGAACGATTGTGGTTACCAGAACTTACCTAACGGG
CAAATTCGTCGCGCGCTGGTCTTTTTCGCCCAGCAAAACCAGTGGGATCTCAGTAATTAC
GACACCTTCGACATGAAAGCCCTCGGTGAAGACAGCTACCGCGATCTCAGCGGCATTGGC
ATTCCCGTCGCTAAAAAGTGCAAAGCCCTGGCTCGCGATTCCTTAAGCCTGCTTGCCTAC
GTCAAATAA
>zzzz_3-13_00020^cueO
ATGCAACGTCGTGATTTCTTGAAATATTCCGTCGCGCTGGGTGTGGCTTCAGCCTTGCCG
CTGTGGAGCCGCGCAGTATTTGCGGCGGAACGCCCAACGTTACCAATCCCTGATTTGCTC
ACGACCGATGCCCGTAATCGCATTCAGTTAACTATTGGCGCAGGTCAGTCCACCTTTGGC
GGGAAAACCGCAACTACCTGGGGCTATAACGGCAATCTGCTGGGGCCGGCGGTGAAATTA
CAGCGTGGCAAAGCGGTAACGGTTGATATCTACAACCAACTGACGGAAGAGACGACGTTG
CACTGGCACGGGCTGGAAGTACCGGGTGAAGTGGACGGCGGCCCGCAGGGAATTATTCCG
    
by steffen 09.12.2016 / 13:06

1 answer

2

You can use the following sed command for this:

sed -e 's/^>.*\^/>/g'

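Explanation:

  • the expression matches, on lines that begin with >, everything from that > up to and including the last ^ on the line (the ^ inside the pattern is escaped as \^ so it is treated as a literal character)
  • and replaces the match with a single >
  • so only the part of the ID after the ^ is kept.

With your example: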
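The matching rules above can be checked on single lines. The following is a minimal sketch, assuming a POSIX shell and sed; the helper name strip_prefix and the header >aaa^bbb^ccc are made up for illustration and do not come from the question.

```shell
# Hypothetical helper wrapping the substitution from the answer.
# The g flag is left out: the pattern is anchored with ^, so it can
# match at most once per line anyway.
strip_prefix() { sed -e 's/^>.*\^/>/'; }

# A header from the question: everything up to and including ^ is dropped.
header=$(printf '%s\n' '>xxx_3-13_00021^gcd' | strip_prefix)

# Made-up header with two carets: .* is greedy, so the match runs to the
# LAST ^ and only the text after it survives.
greedy=$(printf '%s\n' '>aaa^bbb^ccc' | strip_prefix)

# A sequence line does not start with >, so it passes through unchanged.
sequence=$(printf '%s\n' 'GATGCACTGGCTTCACAGCCGTCCTAA' | strip_prefix)

printf '%s\n' "$header" "$greedy" "$sequence"
```

If a header could contain more than one ^ and the text after the first one should be kept instead, the pattern would need to change (for example to 's/^>[^^]*\^/>/'); the headers shown in the question contain only one ^, so both forms behave the same there.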

$ echo ">xxx_3-13_00021^gcd
TGCCGTTATTACAATCCGGCAGTCCATACGGCAGCTTTTGCCTTACCCCAGTATCTGCAA
GATGCACTGGCTTCACAGCCGTCCTAA
>yyy_3-13_00019^group_3912
ATGGCCGTTTGCGCAAACAGTTACGCGCTCAGCGAGTCTGAAGCCGAAGATATGGCCGAT
TTAACGGCAGTTTTTGTTTTTCTGAAGAACGATTGTGGTTACCAGAACTTACCTAACGGG
CAAATTCGTCGCGCGCTGGTCTTTTTCGCCCAGCAAAACCAGTGGGATCTCAGTAATTAC
GACACCTTCGACATGAAAGCCCTCGGTGAAGACAGCTACCGCGATCTCAGCGGCATTGGC
ATTCCCGTCGCTAAAAAGTGCAAAGCCCTGGCTCGCGATTCCTTAAGCCTGCTTGCCTAC
GTCAAATAA
>zzzz_3-13_00020^cueO
ATGCAACGTCGTGATTTCTTGAAATATTCCGTCGCGCTGGGTGTGGCTTCAGCCTTGCCG
CTGTGGAGCCGCGCAGTATTTGCGGCGGAACGCCCAACGTTACCAATCCCTGATTTGCTC
ACGACCGATGCCCGTAATCGCATTCAGTTAACTATTGGCGCAGGTCAGTCCACCTTTGGC
GGGAAAACCGCAACTACCTGGGGCTATAACGGCAATCTGCTGGGGCCGGCGGTGAAATTA
CAGCGTGGCAAAGCGGTAACGGTTGATATCTACAACCAACTGACGGAAGAGACGACGTTG
CACTGGCACGGGCTGGAAGTACCGGGTGAAGTGGACGGCGGCCCGCAGGGAATTATTCCG" | sed -e 's/^>.*\^/>/g'


>gcd
TGCCGTTATTACAATCCGGCAGTCCATACGGCAGCTTTTGCCTTACCCCAGTATCTGCAA
GATGCACTGGCTTCACAGCCGTCCTAA
>group_3912
ATGGCCGTTTGCGCAAACAGTTACGCGCTCAGCGAGTCTGAAGCCGAAGATATGGCCGAT
TTAACGGCAGTTTTTGTTTTTCTGAAGAACGATTGTGGTTACCAGAACTTACCTAACGGG
CAAATTCGTCGCGCGCTGGTCTTTTTCGCCCAGCAAAACCAGTGGGATCTCAGTAATTAC
GACACCTTCGACATGAAAGCCCTCGGTGAAGACAGCTACCGCGATCTCAGCGGCATTGGC
ATTCCCGTCGCTAAAAAGTGCAAAGCCCTGGCTCGCGATTCCTTAAGCCTGCTTGCCTAC
GTCAAATAA
>cueO
ATGCAACGTCGTGATTTCTTGAAATATTCCGTCGCGCTGGGTGTGGCTTCAGCCTTGCCG
CTGTGGAGCCGCGCAGTATTTGCGGCGGAACGCCCAACGTTACCAATCCCTGATTTGCTC
ACGACCGATGCCCGTAATCGCATTCAGTTAACTATTGGCGCAGGTCAGTCCACCTTTGGC
GGGAAAACCGCAACTACCTGGGGCTATAACGGCAATCTGCTGGGGCCGGCGGTGAAATTA
CAGCGTGGCAAAGCGGTAACGGTTGATATCTACAACCAACTGACGGAAGAGACGACGTTG
CACTGGCACGGGCTGGAAGTACCGGGTGAAGTGGACGGCGGCCCGCAGGGAATTATTCCG
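
To run the same substitution on a file rather than on echoed text, one option is to write the result to a new file and keep the original. This is only a sketch: the file names sequences.fasta and renamed.fasta are placeholders, and the sample records are shortened from the question.

```shell
# Build a small sample file in a temporary directory (placeholder names).
workdir=$(mktemp -d)
printf '%s\n' \
  '>xxx_3-13_00021^gcd' 'GATGCACTGGCTTCACAGCCGTCCTAA' \
  '>zzzz_3-13_00020^cueO' 'ATGCAACGTCGTGATTTC' > "$workdir/sequences.fasta"

# Write the renamed records to a new file; the input file is not modified.
sed -e 's/^>.*\^/>/' "$workdir/sequences.fasta" > "$workdir/renamed.fasta"

# List the resulting header lines to check the outcome.
headers=$(grep '^>' "$workdir/renamed.fasta")
printf '%s\n' "$headers"
```

Redirecting to a second file avoids in-place editing, whose option syntax differs between sed implementations, and leaves the original available for comparison.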
    
by Wayne_Yux 09.12.2016 / 13:13