Extracting data from tags in an XML result set

1

I need to extract the values of two attributes, "estimated" and "fullSign", for every occurrence in this result set.

RESULT SET:

<?xml version="1.0" encoding="UTF-8"?>
<resultSet xmlns="urn:trimet:arrivals" queryTime="1469138325745"><location desc="Morrison/SW 3rd Ave MAX Station" dir="Westbound" lat="45.5181811277907" lng="-122.675385866199" locid="8381" /><arrival block="9007" departed="true" dir="1" status="estimated" estimated="1469138452000" fullSign="MAX  Blue Line to Hillsboro" piece="1" route="100" scheduled="1469138250000" shortSign="Blue to Hillsboro" locid="8381" detour="false"><blockPosition feet="1901" at="1469138300978" heading="201" lat="45.5214364" lng="-122.6716177"><trip desc="Hatfield Government Center" dir="1" route="100" tripNum="6557314" destDist="77046" pattern="54" progress="75145" /></blockPosition></arrival><arrival block="9050" departed="true" dir="1" status="estimated" estimated="1469138664000" fullSign="MAX  Red Line to City Center &amp; Beaverton" piece="1" route="90" scheduled="1469138670000" shortSign="Red Line to Beaverton" locid="8381" detour="false"><blockPosition feet="4552" at="1469138313683" heading="237" lat="45.5277621" lng="-122.6687878"><trip desc="Beaverton TC Pocket" dir="1" route="90" tripNum="6556307" destDist="66321" pattern="15" progress="61769" /></blockPosition></arrival><arrival block="9018" departed="true" dir="1" status="estimated" estimated="1469139140000" fullSign="MAX  Blue Line to Hillsboro" piece="1" route="100" scheduled="1469139150000" shortSign="Blue to Hillsboro" locid="8381" detour="false"><blockPosition feet="13687" at="1469138320005" heading="239" lat="45.5309688" lng="-122.6350333"><trip desc="Hatfield Government Center" dir="1" route="100" tripNum="6557315" destDist="77046" pattern="54" progress="63359" /></blockPosition></arrival><arrival block="9043" departed="true" dir="1" status="estimated" estimated="1469139577000" fullSign="MAX  Red Line to City Center &amp; Beaverton" piece="1" route="90" scheduled="1469139570000" shortSign="Red Line to Beaverton" locid="8381" detour="false"><blockPosition feet="31909" at="1469138310486" heading="285" lat="45.5320383" 
lng="-122.5738342"><trip desc="Beaverton TC Pocket" dir="1" route="90" tripNum="6556308" destDist="66321" pattern="15" progress="34412" /></blockPosition></arrival></resultSet>

Expected result:

1469138452000 MAX  Blue Line to Hillsboro
1469138664000 MAX  Red Line to City Center &amp; Beaverton
1469139140000 MAX  Blue Line to Hillsboro
1469139577000 MAX  Red Line to City Center &amp; Beaverton

What is a good way to extract this data?

    
by Sunnx 22.07.2016 / 01:14

2 answers

2

This uses XMLStarlet together with paste. It can probably be done in a single call to XMLStarlet, but I'm no wizard:

$ paste <(xml sel -T -t -v '//@estimated' data.xml) \
        <(xml sel -T -t -v '//@fullSign' data.xml)
1469138452000   MAX Blue Line to Hillsboro
1469138664000   MAX Red Line to City Center & Beaverton
1469139140000   MAX Blue Line to Hillsboro
1469139577000   MAX Red Line to City Center & Beaverton
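
For comparison, the same pairing can be done in one pass by visiting each arrival element and reading both attributes from it, so the two values stay together even if an arrival lacks one of them. What follows is only a minimal sketch using Python's standard xml.etree.ElementTree, not part of the answer above. The function name extract_arrivals and the trimmed SAMPLE document are made up for illustration; the namespace URI is the one declared in the question's result set.

```python
import xml.etree.ElementTree as ET

# Namespace declared on <resultSet> in the question's data.
NS = {"t": "urn:trimet:arrivals"}

# Trimmed sample with the same shape as the question's result set
# (illustrative only; most attributes and child elements are omitted).
SAMPLE = """<resultSet xmlns="urn:trimet:arrivals" queryTime="1469138325745">
<location desc="Morrison/SW 3rd Ave MAX Station" locid="8381" />
<arrival status="estimated" estimated="1469138452000" fullSign="MAX  Blue Line to Hillsboro" />
<arrival status="estimated" estimated="1469138664000" fullSign="MAX  Red Line to City Center &amp; Beaverton" />
</resultSet>"""


def extract_arrivals(xml_text):
    """Return one (estimated, fullSign) pair per <arrival> element."""
    root = ET.fromstring(xml_text)
    pairs = []
    # The elements live in the default namespace, so the path needs a prefix.
    for arrival in root.iterfind(".//t:arrival", NS):
        # get() returns None when the attribute is missing.
        pairs.append((arrival.get("estimated"), arrival.get("fullSign")))
    return pairs


if __name__ == "__main__":
    for estimated, full_sign in extract_arrivals(SAMPLE):
        print(estimated, full_sign)
```

To run it against the real data, the file contents would be passed in place of SAMPLE.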
    
by 23.07.2016 / 09:18
1
$ xml2 < sunnx.xml | awk -F= '
   $1 ~ /@fullSign/  { fs=$2 ; sub(/&/,"&amp;",fs) };
   $1 ~ /@estimated/ { est=$2 };
   fs && est         { printf "%s %s\n", est, fs; fs=est="" }'
1469138452000 MAX  Blue Line to Hillsboro
1469138664000 MAX  Red Line to City Center &amp; Beaverton
1469139140000 MAX  Blue Line to Hillsboro
1469139577000 MAX  Red Line to City Center &amp; Beaverton

If you want a literal & instead of &amp;, drop the sub() call. xml2 decodes the encoded entities for you, so I added the sub() to change the value back and match the requested output.

Without the sub(), the output looks like this:

1469138452000 MAX  Blue Line to Hillsboro
1469138664000 MAX  Red Line to City Center & Beaverton
1469139140000 MAX  Blue Line to Hillsboro
1469139577000 MAX  Red Line to City Center & Beaverton
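
This decoding is not specific to xml2: any conforming XML parser hands back the attribute value with a literal &, so it has to be re-escaped if &amp; is wanted in the output. A small Python sketch of that round trip, separate from the awk approach above, using xml.sax.saxutils.escape from the standard library. The one-element document is invented for illustration; only the attribute value is copied from the question.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical one-element document; the attribute value comes from the question.
DOC = '<arrival fullSign="MAX  Red Line to City Center &amp; Beaverton" />'

# The parser decodes the entity, so the value holds a literal "&".
decoded = ET.fromstring(DOC).get("fullSign")

# escape() turns "&", "<" and ">" back into their entity forms.
reencoded = escape(decoded)

if __name__ == "__main__":
    print(decoded)
    print(reencoded)
```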
    
by 23.07.2016 / 08:56
