Extract values from a simple HTML file via grep / awk

1

I have another grep / awk / sed problem, this time with the HTML code attached below.

Given is a simple HTML file with a table inside. The HTML is generated by a smart meter (a household energy meter). Inside the HTML table, two values are shown that matter to me: Pplus and Pminus. These are the actual power drawn from the grid and the actual power coming from my solar plant.

I would like to grab these two values separately, in a way that is stable and safe, so that it is less error-prone. My understanding is that the structure of the HTML never changes. As a starting point for finding the values: 18.000 means 18 W currently drawn from the grid, and 0.00000 means 0 W currently produced by my solar plant (it is night).

For me it is almost impossible to find any structure that would help to pin down the right position. I would really appreciate your expert view on this and on whether it can be made to work.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
<HTML>
<HEAD>
<TITLE>NK-FW graphic by Lingg & Janke</TITLE>
<LINK REL="apple-touch-icon" HREF="/NKFW_icon57.png">
<META NAME="description" CONTENT="NK-FW graphic by Lingg & Janke">
<META HTTP-EQUIV="cache-control" CONTENT="no-cache">

<BASE TARGET="_top">

<STYLE TYPE="text/css">

BODY  {margin-left:0; margin-right:0; margin-top:0;}

A     {font-family:Arial; font-size:18px; font-weight:bold; color:#FFFFFF; }
TABLE {font-family:Arial; font-size:18px; font-weight:bold; color:#FFFFFF; }

INPUT {font-family:Arial; font-size:18px; font-weight:bold; color:#000000; }
SELECT{font-family:Arial; font-size:18px; font-weight:bold; color:#000000; }

.inputwidth  {width:190px;}
.keywidth    {width:190px;}

#idLJheadDiv {
               position:absolute;
               width:960px;
               height:50px;
               top:10px;
               left:0px;
               padding:0px;
               margin:0px;
               border:0px;
               background-color:#0074B2;
             }
#idLJheadTd1 {
               font-size:30px;
               font-family:Arial;
               font-weight:bold;
               font-style:normal;
               color:#FFFFFF;

               height:50px;
               text-align:left;
               vertical-align:middle;
             }
#idLJheadTd2 {
               font-size:20px;
               font-family:Arial;
               font-weight:bold;
               font-style:normal;
               color:#FFFFFF;

               height:50px;
               text-align:center;
               vertical-align:middle;
             }
#idLJheadTd3 {
               font-size:30px;
               font-family:Arial;
               font-weight:bold;
               font-style:italic;
               color:#FFFFFF;

               height:50px;
               text-align:right;
               vertical-align:middle;
             }


#idLJfootDiv {
               position:absolute;
               width:960px;
               height:50px;
               top:600px;
               left:0px;
               padding:0px;
               margin:0px;
               border:0px;
               background-color:#0074B2;
             }
#idLJfootTd  {
               font-size:30px;
               font-family:Arial;
               font-weight:bold;
               font-style:italic;
               color:#FFFFFF;

               height:50px;
               text-align:left;
               vertical-align:middle;
             }


#idButtonDiv {
               position:absolute;
               width:228px;
               height:50px;
               padding:0px;
               margin:0px;
               border:0px;
               background-color:#2f2f2f;
             }

#idButtonTd  {
               height:50px;
               vertical-align:middle;
             }

</STYLE>
</HEAD>

<!-- ************************** -->

<BODY SCROLL="auto"
      onResize="DoReposition();"
      onLoad="DoReposition();"
      BGCOLOR="#000000"
      TOPMARGIN=0
      LEFTMARGIN=0
      LINK=#ffffff
      VLINK=#ffffff
      ALINK=#ffffff >

<!-- ************************** -->

<SCRIPT LANGUAGE="JavaScript">
<!--
function HOffset()
{
   var window_width = window.innerWidth ? window.innerWidth : (document.body.clientWidth ? document.body.clientWidth : 0);
   return Math.max( 0, Math.floor( (window_width - 960) / 2 ) - 0 ).toString();
}

function VOffset()
{
   var window_height = window.innerHeight ? window.innerHeight : (document.body.clientHeight ? document.body.clientHeight : 0);
   return 0;
}

document.write( "<DIV ID='IDAlignPage' style='position:absolute; top:" + VOffset() + "px; left:" + HOffset() + "px;'>&nbsp;" );

function DoReposition()
 {var o='IDAlignPage';
   if(is_dom2&&document.getElementById(o))
    {var e=document.getElementById(o);e.style.left=HOffset()+'px';e.style.top=VOffset()+'px';}
   else if(is_ie&&is_major>=4&&eval('document.all.'+o))
    {var e=eval('document.all.'+o);e.style.left=HOffset()+'px';e.style.top=VOffset()+'px';}
   else if(is_nav&&is_major>=4&&eval('document.'+o))
    {var e=eval('document.'+o);e.left=HOffset();e.top=VOffset();}
 }

window.onresize=DoReposition;
window.onload=DoReposition;

//-->
</SCRIPT>








<!-- ************************** -->
<!-- *** upper + lower bar  *** -->

<DIV ID="idLJheadDiv">
  <TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0" WIDTH=100%><TR>
    <TD ID="idLJheadTd1">&nbsp;&nbsp;Smart Metering</TD>
    <TD ID="idLJheadTd2">
      SA
      &nbsp;&nbsp;
      21.06.2014
      &nbsp;&nbsp;
      21:57:02
      &nbsp;&nbsp;
      KW25
    </TD>
    <TD ID="idLJheadTd3">Lingg &amp; Janke&nbsp;&nbsp;</TD>
  </TR></TABLE>
</DIV>

<DIV ID="idLJfootDiv">
  <TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0" WIDTH=100%><TR>
    <TD ID="idLJfootTd">&nbsp;&nbsp;Energy Analyzer</TD>
  </TR></TABLE>
</DIV>


<!-- ************************** -->
<!-- *** 1. row *************** -->

<!-- -->
<DIV ID="idButtonDiv" STYLE="top:78px; left:0px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">
P&#043; in Watt

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" STYLE="top:78px; left:244px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" style="top:78px; left:488px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" style="top:78px; left:732px;" ALIGN="CENTER" >
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- ************************** -->
<!-- *** 2. row *************** -->

<!-- -->
<DIV ID="idButtonDiv" STYLE="top:143px; left:0px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">
18.000

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" STYLE="top:143px; left:244px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">
<form action="g1.htm" method="GET">
<input type="submit" class="keywidth" name="A" value="MykWh">
</form>
</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" style="top:143px; left:488px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">
<form action="g2.htm" method="GET">
<input type="submit" class="keywidth" name="A" value="Supply">
</form>
</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" style="top:143px; left:732px;" ALIGN="CENTER" >
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- ************************** -->
<!-- *** 3. row *************** -->

<!-- -->
<DIV ID="idButtonDiv" STYLE="top:208px; left:0px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">
P- in Watt

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" STYLE="top:208px; left:244px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">
<form action="g3.htm" method="GET">
<input type="submit" class="keywidth" name="A" value="G 3">
</form>
</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" style="top:208px; left:488px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">
<form action="g4.htm" method="GET">
<input type="submit" class="keywidth" name="A" value="G 4">
</form>
</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" style="top:208px; left:732px;" ALIGN="CENTER" >
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- ************************** -->
<!-- *** 4. row *************** -->

<!-- -->
<DIV ID="idButtonDiv" STYLE="top:273px; left:0px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">
0.00000

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" STYLE="top:273px; left:244px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">
<form action="g5.htm" method="GET">
<input type="submit" class="keywidth" name="A" value="G 5">
</form>
</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" style="top:273px; left:488px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">
<form action="g6.htm" method="GET">
<input type="submit" class="keywidth" name="A" value="G 6">
</form>
</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" style="top:273px; left:732px;" ALIGN="CENTER" >
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">
P&#043;

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- ************************** -->
<!-- *** 5. row *************** -->

<!-- -->
<DIV ID="idButtonDiv" STYLE="top:338px; left:0px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" STYLE="top:338px; left:244px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">
<form action="g7.htm" method="GET">
<input type="submit" class="keywidth" name="A" value="G 7">
</form>
</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" style="top:338px; left:488px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">
<form action="g8.htm" method="GET">
<input type="submit" class="keywidth" name="A" value="G 8">
</form>
</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" style="top:338px; left:732px;" ALIGN="CENTER" >
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- ************************** -->
<!-- *** 6. row *************** -->

<!-- -->
<DIV ID="idButtonDiv" STYLE="top:403px; left:0px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" STYLE="top:403px; left:244px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" style="top:403px; left:488px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" style="top:403px; left:732px;" ALIGN="CENTER" >
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- ************************** -->
<!-- *** 7. row *************** -->

<!-- -->
<DIV ID="idButtonDiv" STYLE="top:468px; left:0px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">
</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" STYLE="top:468px; left:244px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" style="top:468px; left:488px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" style="top:468px; left:732px;" ALIGN="CENTER" >
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">
</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- ************************** -->
<!-- *** 8. row *************** -->

<!-- -->
<DIV ID="idButtonDiv" STYLE="top:533px; left:0px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">

<form action="/index.htm" method="GET">
<input type="submit" class="keywidth" value="ZURCK" name="A">
</form>

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" STYLE="top:533px; left:244px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">
</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" style="top:533px; left:488px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">
</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" style="top:533px; left:732px;" ALIGN="CENTER" >
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">

<form action="/mainset/mainset.htm" method="GET">
<input type="submit" class="keywidth" value="EINRICHTEN" name="A">
</form>

</TD></TR></TABLE>
</DIV>
<!-- -->


<!-- ************************** -->

</BODY>
</HTML>

I tried the commands mentioned, but they do not seem to work.

Regarding the comment about Python: how would that work, can somebody help me? I have no preference, but of course I would prefer the best solution, one that is state of the art in every respect (speed, efficiency, ...).

The posted HTML does not change; only the values, of course, are updated every 10 seconds.
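
Since the question asks how a Python approach might look, here is a minimal sketch using only the standard library's `html.parser`. It assumes the readings always sit in `<TD ID="idButtonTd">` cells and that the labels read exactly `P+ in Watt` and `P- in Watt`, as in the posted page. The names `ButtonCellParser`, `value_after` and `read_power` are made up for this illustration and are not part of the meter's software.

```python
from html.parser import HTMLParser


class ButtonCellParser(HTMLParser):
    """Collect the stripped text of every <TD ID="idButtonTd"> cell, in document order."""

    def __init__(self):
        super().__init__()  # convert_charrefs defaults to True, so "&#043;" arrives as "+"
        self.cells = []
        self._buffer = None  # None while we are outside a matching cell

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, but not attribute values.
        if tag == "td" and dict(attrs).get("id") == "idButtonTd":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._buffer is not None:
            self.cells.append("".join(self._buffer).strip())
            self._buffer = None


def value_after(cells, label):
    """Return the first numeric cell that follows the cell holding `label`."""
    for text in cells[cells.index(label) + 1:]:
        try:
            return float(text)
        except ValueError:
            continue  # empty cell or button cell, keep looking
    raise ValueError(f"no numeric cell after {label!r}")


def read_power(html):
    """Return (Pplus, Pminus) in watts. Assumes the label texts shown in the question."""
    parser = ButtonCellParser()
    parser.feed(html)
    parser.close()
    return value_after(parser.cells, "P+ in Watt"), value_after(parser.cells, "P- in Watt")
```

Keying on the label text rather than on a line position means the lookup fails loudly (`ValueError`) if a label disappears, instead of silently returning the wrong number. To use it, read the saved page into a string and pass it to `read_power`.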

    
by njordan 20.06.2014 / 23:58

2 answers

3

> awk '/ID="idButtonTd"/ {printline=1; next;}; 
   printline==1 && /^[0-9]+\.[0-9]+$/ { print $0; }; { printline=0; }' file
18.000
0.00000
    
by 21.06.2014 / 00:33
2

If the HTML structure is indeed constant, the following should work:

totalValues=$(grep -A1 "idButtonTd" yourfile | grep -v "idButtonTd" | grep -v "\-\-" | grep "^[0-9][0-9]*")
Pplus=$(echo $totalValues | awk '{ print $1 }')
Pminus=$(echo $totalValues | awk '{ print $2 }')
echo "Pplus = $Pplus"
echo "Pminus = $Pminus"
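
Both line-based approaches rely on the structure staying constant. As a rough sketch of how that assumption could be checked rather than trusted, the same "numeric line right after an `idButtonTd` cell" rule can be written in Python with a guard that exactly two readings were found. The function name `extract_pair` is hypothetical, and the rule that the first reading is Pplus and the second is Pminus is taken from the order in the posted page.

```python
import re

# A line holding only a decimal number; trailing whitespace (such as a CR) is tolerated.
NUMBER_LINE = re.compile(r"^[0-9]+\.[0-9]+\s*$")


def extract_pair(lines):
    """Return (Pplus, Pminus) from numeric lines that directly follow an idButtonTd cell."""
    values = []
    previous_opened_cell = False
    for line in lines:
        if previous_opened_cell and NUMBER_LINE.match(line):
            values.append(float(line))
        previous_opened_cell = 'ID="idButtonTd"' in line
    if len(values) != 2:
        # The layout is not what we assumed, so refuse to guess.
        raise ValueError(f"expected 2 readings, found {len(values)}")
    return values[0], values[1]
```

Raising an error when the count is off keeps a changed page layout from feeding shifted values into whatever consumes them.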
    
by 21.06.2014 / 00:41
