Bash: combining the contents of several log files based on a search pattern

I have a folder with many txt files, each in the following format:

Allowed overlap: -3
H-bond overlap reduction: 0.4
Ignore contacts between atoms separated by 4 bonds or less
Detect intra-residue contacts: False
Detect intra-molecule contacts: False

19 contacts
atom1  atom2  overlap  distance
:128.B@BB  :300.C@BB  -1.676  4.996
:179.B@BB  :17.C@BB   -1.898  5.218
:182.B@BB  :17.C@BB   -2.015  5.335

My goal is to loop over the files in the folder and combine them into a single global output. As the example shows, I want to keep only the lines after (and including) "19 contacts" (the number differs in each file), thus skipping the first six lines of the file.

A possible workflow:

# Create the global log file that will collect the data from every file looped over in the next step.
echo "This is a beginning of the global output" > ./final_output.txt
# Key pattern marking the first line that should be taken from each file:
# any number followed by "contacts".
key='^[0-9]+ contacts'

# Loop over the files and append every line after (and including) the one matching ${key} to final_output.txt
for file in "${folder}"/*.txt; do
  file_title=$(basename "$file")
  # 1 - print ${file_title} into final_output.txt
  # 2 - append the lines from the file to final_output.txt
  # NB: only the lines after (and including) the key phrase are needed
  :  # placeholder so the otherwise empty loop body is valid syntax
done
    
asked by user3470313 07.12.2017 / 15:50

2 answers

Take three files as an example.

file1

Allowed overlap: -3
H-bond overlap reduction: 0.4
Ignore contacts between atoms separated by 4 bonds or less
Detect intra-residue contacts: False
Detect intra-molecule contacts: False

19 contacts
atom1  atom2  overlap  distance
:128.B@BB  :300.C@BB  -1.676  4.996
:179.B@BB  :17.C@BB   -1.898  5.218
:182.B@BB  :17.C@BB   -2.015  5.335

file3

Allowed overlap: -3
H-bond overlap reduction: 0.4
Ignore contacts between atoms separated by 4 bonds or less
Detect intra-residue contacts: False
Detect intra-molecule contacts: False

17 contacts
atom1  atom2  overlap  distance
:128.B@BB  :300.C@BB  -1.676  4.996
:179.B@BB  :17.C@BB   -1.898  5.218
:182.B@BB  :17.C@BB   -2.015  5.335

file4

Allowed overlap: -3
H-bond overlap reduction: 0.4
Ignore contacts between atoms separated by 4 bonds or less
Detect intra-residue contacts: False
Detect intra-molecule contacts: False

12 contacts
atom1  atom2  overlap  distance
:128.B@BB  :300.C@BB  -1.676  4.996
:179.B@BB  :17.C@BB   -1.898  5.218
:182.B@BB  :17.C@BB   -2.015  5.335

The command below saves everything from the "19 contacts", "17 contacts", and "12 contacts" lines to the end of each file:

 for i in file1 file3 file4; do sed -n '/^[0-9]/,$p' "$i"; done > /var/tmp/outputfile.txt

Output

19 contacts
atom1  atom2  overlap  distance
:128.B@BB  :300.C@BB  -1.676  4.996
:179.B@BB  :17.C@BB   -1.898  5.218
:182.B@BB  :17.C@BB   -2.015  5.335
17 contacts
atom1  atom2  overlap  distance
:128.B@BB  :300.C@BB  -1.676  4.996
:179.B@BB  :17.C@BB   -1.898  5.218
:182.B@BB  :17.C@BB   -2.015  5.335
12 contacts
atom1  atom2  overlap  distance
:128.B@BB  :300.C@BB  -1.676  4.996
:179.B@BB  :17.C@BB   -1.898  5.218
:182.B@BB  :17.C@BB   -2.015  5.335
    
answered 07.12.2017 / 16:17

I found another method that works with the same input files.

Code:

 for i in file1 file3 file4; do sed '1,6d' "$i"; done > /var/tmp/outputfile.txt

Output

19 contacts
atom1  atom2  overlap  distance
:128.B@BB  :300.C@BB  -1.676  4.996
:179.B@BB  :17.C@BB   -1.898  5.218
:182.B@BB  :17.C@BB   -2.015  5.335
17 contacts
atom1  atom2  overlap  distance
:128.B@BB  :300.C@BB  -1.676  4.996
:179.B@BB  :17.C@BB   -1.898  5.218
:182.B@BB  :17.C@BB   -2.015  5.335
12 contacts
atom1  atom2  overlap  distance
:128.B@BB  :300.C@BB  -1.676  4.996
:179.B@BB  :17.C@BB   -1.898  5.218
:182.B@BB  :17.C@BB   -2.015  5.335
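
Deleting lines 1 to 6 only works while every file has exactly six lines before the "contacts" line. If the header length might vary, one possible alternative is a single awk pass that starts printing at the pattern and also prints each file name. This is only a sketch: the demo files below have deliberately different header lengths and are invented for illustration, and the variable names printing, name, and awk_out are arbitrary.

```shell
# Demo input: two throwaway files whose headers differ in length.
demo_dir=$(mktemp -d)
printf '%s\n' 'Allowed overlap: -3' '' \
  '19 contacts' 'atom1  atom2  overlap  distance' \
  ':128.B@BB  :300.C@BB  -1.676  4.996' > "$demo_dir/file1"
printf '%s\n' 'Allowed overlap: -3' 'H-bond overlap reduction: 0.4' \
  'Detect intra-residue contacts: False' '' \
  '17 contacts' 'atom1  atom2  overlap  distance' \
  ':179.B@BB  :17.C@BB   -1.898  5.218' > "$demo_dir/file3"

awk_out=$(
  awk '
    FNR == 1 {                          # first line of each input file
      printing = 0                      # reset, so the new header is skipped
      name = FILENAME
      sub(/.*\//, "", name)             # strip the directory part of the path
      print name
    }
    /^[0-9]+ contacts/ { printing = 1 } # start of the section to keep
    printing                            # print the line while the flag is set
  ' "$demo_dir/file1" "$demo_dir/file3"
)
printf '%s\n' "$awk_out"
```

Because awk resets the flag at the first line of every file, all files can be passed in one call and no shell loop is needed.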
    
answered 07.12.2017 / 16:25