Search for a pattern and always print the first line that contains the cn

0

I have a file with the content shown below.

A dn: entry can have more than one rdcPosition.

I only need the dn: entries that have an rdcPosition containing acme#6#.

The result should print the cn and also the rdcPosition.

dn: cn=00fa69bd-bede-4918-a017-b59b0901bb3d,ou=Named,ou=Identities,ou=Active,o
 u=Vault,o=acme
rdcPosition: cn=1950,ou=Entities,ou=Active,ou=Vault,o=acme#6#<position><cn>8946
 702990</cn><reqdate>1529318977</reqdate><startdate>1529318977</startdate><end
 date>1924902000</enddate><lifecycle><change><previousstatus/><newstatus>1</ne
 wstatus><date>1529318977</date></change><change><date>1529319116</date><previ
 ousstatus>1</previousstatus><newstatus>3</newstatus></change><change><date>15
 29481285</date><previousstatus>3</previousstatus><newstatus>6</newstatus></ch
 ange></lifecycle></position>

dn: cn=010903cd-e92d-4307-bffc-4921379153c0,ou=Named,ou=Identities,ou=Active,o
 u=Vault,o=acme
rdcPosition: cn=922445,ou=Entities,ou=Active,ou=Vault,o=acme#5#<position><cn>42
 79084890</cn><reqdate>1429014997</reqdate><startdate>1429014997</startdate><e
 nddate>1924902000</enddate><lifecycle><change><previousstatus/><newstatus>1</
 newstatus><date>1429014997</date></change><change><date>1429023084</date><pre
 viousstatus>1</previousstatus><newstatus>3</newstatus></change><change><date>
 1525107741</date><previousstatus>3</previousstatus><newstatus>6</newstatus></
 change><change><date>1525126716</date><previousstatus>6</previousstatus><news
 tatus>5</newstatus></change></lifecycle></position>
rdcPosition: cn=311982,ou=Entities,ou=Active,ou=Vault,o=acme#6#<position><cn>97
 26910833</cn><reqdate>1528120494</reqdate><startdate>1528120494</startdate><e
 nddate>1924902000</enddate><lifecycle><change><previousstatus/><newstatus>1</
 newstatus><date>1528120494</date></change><change><date>1528123478</date><pre
 viousstatus>1</previousstatus><newstatus>3</newstatus></change></lifecycle></
 position>

dn: cn=01126aa4-af80-401b-8713-29e360868999,ou=Named,ou=Identities,ou=Active,o
 u=Vault,o=acme
rdcPosition: cn=914570,ou=Entities,ou=Active,ou=Vault,o=acme#6#<position><cn>20
 68839799</cn><reqdate>1406284665</reqdate><startdate>1406284665</startdate><e
 nddate>1924902000</enddate><lifecycle><change><previousstatus/><newstatus>0</
 newstatus><date>1406284665</date></change><change><date>1406284666</date><pre
 viousstatus>1</previousstatus><newstatus>3</newstatus></change><change><date>
 1435847283</date><previousstatus>3</previousstatus><newstatus>6</newstatus></
 change></lifecycle></position>
rdcPosition: cn=999546,ou=Entities,ou=Active,ou=Vault,o=acme#6#<position><cn>76
 03071057</cn><reqdate>1400325753</reqdate><startdate>1400325753</startdate><e
 nddate>1924902000</enddate><lifecycle><change><previousstatus/><newstatus>0</
 newstatus><date>1400325753</date></change><change><date>1400325754</date><pre
 viousstatus>1</previousstatus><newstatus>3</newstatus></change><change><date>
 1449224475</date><previousstatus>3</previousstatus><newstatus>6</newstatus></
 change></lifecycle></position>
rdcPosition: cn=3513,ou=Entities,ou=Active,ou=Vault,o=acme#6#<position><cn>2802
 042129</cn><reqdate>1406284761</reqdate><startdate>1406284761</startdate><end
 date>1924902000</enddate><lifecycle><change><previousstatus/><newstatus>0</ne
 wstatus><date>1406284761</date></change><change><date>1406284762</date><previ
 ousstatus>1</previousstatus><newstatus>3</newstatus></change><change><date>14
 49224599</date><previousstatus>3</previousstatus><newstatus>6</newstatus></ch
 ange></lifecycle></position>
rdcPosition: cn=312936,ou=Entities,ou=Active,ou=Vault,o=acme#3#<position><cn>19
 23461515</cn><reqdate>1449217172</reqdate><startdate>1449217172</startdate><e
 nddate>1924902000</enddate><lifecycle><change><previousstatus/><newstatus>1</
 newstatus><date>1449217172</date></change><change><date>1449225081</date><pre
 viousstatus>1</previousstatus><newstatus>3</newstatus></change></lifecycle></
 position>
    
by Dirk 05.08.2018 / 12:52

3 answers

1

The input appears to be LDIF, as specified in RFC 2849.

I strongly recommend not using the usual awk / sed / grep tool chain to process LDIF, for the following reasons:

  • Long attribute-value lines (including dn:) are folded, with a single leading space marking each continuation line.
  • Attribute values that contain non-ASCII characters are base64-encoded.
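To make the two points above concrete, here is a minimal Python sketch (standard library only) of what a line-oriented tool would have to do by hand before it can match anything. The helper names `unfold` and `split_attr` and the `description` attribute are made up for illustration; this is not a complete RFC 2849 parser.

```python
import base64

def unfold(lines):
    """Join continuation lines (those starting with one space) onto the previous line."""
    logical = []
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith(" ") and logical:
            logical[-1] += line[1:]      # drop the single space that marks the fold
        else:
            logical.append(line)
    return logical

def split_attr(line):
    """Return (name, value); values written as 'name:: ...' are base64-decoded."""
    name, _, rest = line.partition(":")
    if rest.startswith(":"):             # double colon means base64
        return name, base64.b64decode(rest[1:].strip()).decode("utf-8")
    return name, rest.lstrip(" ")

folded = [
    "dn: cn=00fa69bd-bede-4918-a017-b59b0901bb3d,ou=Named,ou=Identities,ou=Active,o",
    " u=Vault,o=acme",
    "description:: Y2Fmw6k=",            # hypothetical attribute, base64 of a non-ASCII value
]
records = [split_attr(line) for line in unfold(folded)]
for name, value in records:
    print(name, "=", value)
```

A plain grep for a string that happens to straddle a fold, or that sits inside a base64 value, would miss it, which is the practical reason for the recommendation.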

The best solution is to use a decent LDIF parser for your favourite scripting language.

For example, for Python, use the ldif module in python-ldap.

See the docs: ldif - LDIF parser and generator
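If installing python-ldap is not an option, the filtering asked for in the question can be roughly sketched with the standard library alone. This is not the python-ldap API: the function name `matching_entries`, its `needle` parameter, and the shortened sample data are assumptions for illustration, and the sketch ignores base64-encoded values, comments, and change records, all of which a real parser handles.

```python
import re

def matching_entries(ldif_text, needle="o=acme#6#"):
    """Yield (dn, [matching rdcPosition values]) for entries that have at least one."""
    unfolded = re.sub(r"\n ", "", ldif_text)            # undo the line folding
    for block in re.split(r"\n\s*\n", unfolded.strip()):  # entries are separated by blank lines
        dn, positions = None, []
        for line in block.splitlines():
            name, _, value = line.partition(": ")
            if name == "dn":
                dn = value
            elif name == "rdcPosition" and needle in value:
                positions.append(value)
        if dn and positions:
            yield dn, positions

# Shortened, made-up sample in the same shape as the data in the question.
sample = """\
dn: cn=aaa,ou=Named,o
 u=Vault,o=acme
rdcPosition: cn=1,ou=Entities,o=acme#5#<position><cn>11</cn></position>
rdcPosition: cn=2,ou=Entities,o=acme#6#<position><cn>2
 2</cn></position>

dn: cn=bbb,ou=Named,ou=Vault,o=acme
rdcPosition: cn=3,ou=Entities,o=acme#3#<position><cn>33</cn></position>
"""

for dn, positions in matching_entries(sample):
    print("dn:", dn)
    for value in positions:
        print("rdcPosition:", value)
```

Because the folding is undone before anything is matched, the dn may span any number of lines, and an entry with no matching rdcPosition is dropped entirely.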

    
answered 06.08.2018 / 17:07
0

Your desired output is not clear. How far would this get you:

    awk '
        {
            # print every "rdcPosition: ...acme#6#..." token found in the record
            while (match($0, /rdcPosition: [^ ]*acme#6#[^ ]*/)) {
                print substr($0, RSTART, RLENGTH)
                $0 = substr($0, RSTART + RLENGTH)    # continue after the match
            }
        }
    ' file
rdcPosition: cn=1950,ou=Entities,ou=Active,ou=Vault,o=acme#6#<position><cn>8946
rdcPosition: cn=311982,ou=Entities,ou=Active,ou=Vault,o=acme#6#<position><cn>97
rdcPosition: cn=914570,ou=Entities,ou=Active,ou=Vault,o=acme#6#<position><cn>20
rdcPosition: cn=999546,ou=Entities,ou=Active,ou=Vault,o=acme#6#<position><cn>76
rdcPosition: cn=3513,ou=Entities,ou=Active,ou=Vault,o=acme#6#<position><cn>2802

For the changed request in your comment, how far would this get you? If you are not satisfied, please be more specific in defining the desired result.

awk  '
    /^dn:/       { DN = $0; indn = 1; next }        # new entry: remember its dn line
    indn && /^ / { DN = DN substr($0, 2); next }    # folded continuation of the dn
                 { indn = 0 }                       # any other line ends the dn
    match($0, /^rdcPosition: [^ ]*acme#6#[^ ]*/) {
        print DN, substr($0, RSTART, RLENGTH)       # dn plus the matching rdcPosition
    }
' file
dn: cn=00fa69bd-bede-4918-a017-b59b0901bb3d,ou=Named,ou=Identities,ou=Active,ou=Vault,o=acme rdcPosition: cn=1950,ou=Entities,ou=Active,ou=Vault,o=acme#6#<position><cn>8946
dn: cn=010903cd-e92d-4307-bffc-4921379153c0,ou=Named,ou=Identities,ou=Active,ou=Vault,o=acme rdcPosition: cn=311982,ou=Entities,ou=Active,ou=Vault,o=acme#6#<position><cn>97
dn: cn=01126aa4-af80-401b-8713-29e360868999,ou=Named,ou=Identities,ou=Active,ou=Vault,o=acme rdcPosition: cn=914570,ou=Entities,ou=Active,ou=Vault,o=acme#6#<position><cn>20
dn: cn=01126aa4-af80-401b-8713-29e360868999,ou=Named,ou=Identities,ou=Active,ou=Vault,o=acme rdcPosition: cn=999546,ou=Entities,ou=Active,ou=Vault,o=acme#6#<position><cn>76
dn: cn=01126aa4-af80-401b-8713-29e360868999,ou=Named,ou=Identities,ou=Active,ou=Vault,o=acme rdcPosition: cn=3513,ou=Entities,ou=Active,ou=Vault,o=acme#6#<position><cn>2802
    
answered 05.08.2018 / 13:42
0

Using the following sed script (this assumes that we are running with sed -n):

/^dn:/{                     # this is a "dn" line
    N;                      # append the next line
    s/\n //;                # remove the newline and the space
    x;                      # exchange pattern space with hold space
    /o=acme#6#/p;           # print if pattern space contains our string
    d;                      # delete from pattern space, start next cycle
}
/^rdcPosition:/{            # this is a "rdcPosition" line
    :again;                 # define label for loop
    N;                      # append the next line
    s/\n //;                # remove the newline and the space
    \#</position>#!b again; # if the end tag "</position>" was not read, loop
    /o=acme#6#/H;           # append to hold space if matching what we're looking for
}
${                  # at the very end of input
    x;              # exchange pattern and hold space
    /o=acme#6#/p;   # print if pattern space contains our string
}

What the script does is essentially build up a string in the sed "hold space" (a general-purpose buffer that survives between cycles). The string starts with the dn line, and the rdcPosition lines that contain the specific string we are interested in are then appended to it.

Whenever a new dn line is encountered, or when we are at the end of the input, the hold space is printed if it contains our string (it will not contain it if none of the rdcPosition lines for the current dn matched).
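The buffer-and-flush idea described above can also be sketched in Python, for readers less familiar with the sed hold space. This is only an illustration under assumptions: it expects lines that have already been unfolded, the name `filter_entries` and the tiny sample are made up, and it tests "buffer holds more than the dn" where the sed script tests for the string itself.

```python
def filter_entries(logical_lines, needle="o=acme#6#"):
    """Mimic the hold-space logic on already-unfolded lines."""
    hold = []                                   # plays the role of the hold space
    for line in logical_lines:
        if line.startswith("dn:"):
            if len(hold) > 1:                   # dn plus at least one match: flush it
                yield hold
            hold = [line]                       # start a new buffer with this dn
        elif line.startswith("rdcPosition:") and needle in line and hold:
            hold.append(line)                   # comparable to the H command
    if len(hold) > 1:                           # final flush, comparable to the $ block
        yield hold

lines = [
    "dn: cn=aaa,o=acme",
    "rdcPosition: cn=1,o=acme#5#<position/>",
    "rdcPosition: cn=2,o=acme#6#<position/>",
    "dn: cn=bbb,o=acme",
    "rdcPosition: cn=3,o=acme#3#<position/>",
]
result = list(filter_entries(lines))
for entry in result:
    print("\n".join(entry))
```

Only the first entry is emitted here, because the second one never collects a matching rdcPosition line.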

Testing:

$ sed -n -f script.sed file
dn: cn=00fa69bd-bede-4918-a017-b59b0901bb3d,ou=Named,ou=Identities,ou=Active,ou=Vault,o=acme
rdcPosition: cn=1950,ou=Entities,ou=Active,ou=Vault,o=acme#6#<position><cn>8946702990</cn><reqdate>1529318977</reqdate><startdate>1529318977</startdate><enddate>1924902000</enddate><lifecycle><change><previousstatus/><newstatus>1</newstatus><date>1529318977</date></change><change><date>1529319116</date><previousstatus>1</previousstatus><newstatus>3</newstatus></change><change><date>1529481285</date><previousstatus>3</previousstatus><newstatus>6</newstatus></change></lifecycle></position>
dn: cn=010903cd-e92d-4307-bffc-4921379153c0,ou=Named,ou=Identities,ou=Active,ou=Vault,o=acme
rdcPosition: cn=311982,ou=Entities,ou=Active,ou=Vault,o=acme#6#<position><cn>9726910833</cn><reqdate>1528120494</reqdate><startdate>1528120494</startdate><enddate>1924902000</enddate><lifecycle><change><previousstatus/><newstatus>1</newstatus><date>1528120494</date></change><change><date>1528123478</date><previousstatus>1</previousstatus><newstatus>3</newstatus></change></lifecycle></position>
dn: cn=01126aa4-af80-401b-8713-29e360868999,ou=Named,ou=Identities,ou=Active,ou=Vault,o=acme
rdcPosition: cn=914570,ou=Entities,ou=Active,ou=Vault,o=acme#6#<position><cn>2068839799</cn><reqdate>1406284665</reqdate><startdate>1406284665</startdate><enddate>1924902000</enddate><lifecycle><change><previousstatus/><newstatus>0</newstatus><date>1406284665</date></change><change><date>1406284666</date><previousstatus>1</previousstatus><newstatus>3</newstatus></change><change><date>1435847283</date><previousstatus>3</previousstatus><newstatus>6</newstatus></change></lifecycle></position>
rdcPosition: cn=999546,ou=Entities,ou=Active,ou=Vault,o=acme#6#<position><cn>7603071057</cn><reqdate>1400325753</reqdate><startdate>1400325753</startdate><enddate>1924902000</enddate><lifecycle><change><previousstatus/><newstatus>0</newstatus><date>1400325753</date></change><change><date>1400325754</date><previousstatus>1</previousstatus><newstatus>3</newstatus></change><change><date>1449224475</date><previousstatus>3</previousstatus><newstatus>6</newstatus></change></lifecycle></position>
rdcPosition: cn=3513,ou=Entities,ou=Active,ou=Vault,o=acme#6#<position><cn>2802042129</cn><reqdate>1406284761</reqdate><startdate>1406284761</startdate><enddate>1924902000</enddate><lifecycle><change><previousstatus/><newstatus>0</newstatus><date>1406284761</date></change><change><date>1406284762</date><previousstatus>1</previousstatus><newstatus>3</newstatus></change><change><date>1449224599</date><previousstatus>3</previousstatus><newstatus>6</newstatus></change></lifecycle></position>

A logically equivalent awk script that produces the same output as the sed code above:

/^dn:/  {
    if (hold ~ "o=acme#6#")
        print hold

    hold = $0;
    getline
    hold = hold substr($0, 2)
    next
}

/^rdcPosition:/ {
    line = $0
    # stop at end of input instead of looping forever on a truncated value
    while (line !~ "</position>" && (getline) > 0) {
        line = line substr($0, 2)
    }

    if (line ~ "o=acme#6#")
        hold = hold ORS line
}

END {
    if (hold ~ "o=acme#6#")
        print hold
}

The substr($0, 2) calls remove the leading space from the wrapped lines in the input.

Both scripts assume that the dn line is split across exactly two lines.

    
answered 06.08.2018 / 18:01