Try the head command:
HEAD(1) User Commands HEAD(1)
NAME
head - output the first part of files
SYNOPSIS
head [OPTION]... [FILE]...
DESCRIPTION
Print the first 10 lines of each FILE to standard output. With more
than one FILE, precede each with a header giving the file name. With
no FILE, or when FILE is -, read standard input.
head lets you specify the number of lines. See the man page for more information.
loop.py:
#!/usr/bin/python
i = 0
while True:
    print "This is line " + str(i)
    i += 1
loop.py should run forever, but if I pipe its output to head, I get:
$ ./loop.py | head
This is line 0
This is line 1
This is line 2
This is line 3
This is line 4
This is line 5
This is line 6
This is line 7
This is line 8
This is line 9
Traceback (most recent call last):
  File "./loop.py", line 6, in <module>
    print "This is line " + str(i)
IOError: [Errno 32] Broken pipe
Note that the error part (Traceback ...) is actually written to stderr, as running ./loop.py 2> stderr.log | head demonstrates, so you don't need to worry about it ending up in head's output.
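If the traceback itself is unwanted, one common idiom (not part of the original script) is to restore the default SIGPIPE action so the producer is killed silently when the reader goes away. The sketch below assumes Python 3 and a Unix-like system; `CHILD` and `first_lines` are hypothetical names used only for illustration, and the parent process stands in for head by reading a few lines and then closing the pipe.

```python
import signal
import subprocess
import sys

# Hypothetical producer: same loop as loop.py, but with the default
# SIGPIPE action restored so a closed pipe ends the process quietly.
CHILD = """
import signal
signal.signal(signal.SIGPIPE, signal.SIG_DFL)
i = 0
while True:
    print("This is line " + str(i))
    i += 1
"""

def first_lines(n=10):
    """Read n lines from the producer, then close the pipe like head does."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    lines = [proc.stdout.readline().rstrip("\n") for _ in range(n)]
    proc.stdout.close()            # the reader goes away
    stderr = proc.stderr.read()    # returns once the producer has exited
    proc.stderr.close()
    returncode = proc.wait()
    return lines, returncode, stderr

if __name__ == "__main__":
    lines, returncode, stderr = first_lines()
    print(lines[-1], returncode, repr(stderr))
```

With the default action restored, the producer should end with a negative return code (killed by SIGPIPE) and leave stderr empty, instead of printing a traceback.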
Finally, to search:
$ ./loop.py 2> /dev/null | head | grep -n "line 6"
7:This is line 6
Here we redirect the stderr of loop.py to /dev/null, even though we are sure it would not interfere with the text processed by head and grep.
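The same idea, a lazy consumer that stops an endless producer, can be sketched inside a single Python 3 process with generators. This is only an analogy to the shell pipeline above, not how head or grep are implemented; `lines`, `head`, and `grep_n` are illustrative names.

```python
from itertools import count, islice

def lines():
    """Endless producer, like loop.py."""
    for i in count():
        yield "This is line " + str(i)

def head(iterable, n=10):
    """Take only the first n items, like head."""
    return islice(iterable, n)

def grep_n(iterable, needle):
    """Yield 'lineno:line' for matching lines, like grep -n."""
    for lineno, line in enumerate(iterable, start=1):
        if needle in line:
            yield "%d:%s" % (lineno, line)

matches = list(grep_n(head(lines()), "line 6"))
print(matches)  # ['7:This is line 6']
```

Because `islice` stops pulling after ten items, the endless generator is never asked for an eleventh line.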
EDIT
TL;DR: The CPU scheduler controls how much longer the intensive process runs after head finishes its output.
After some testing, I found that my solution, although it does cut the execution of loop.py short, is not as robust as it could be. Here is a modified loop.py, followed by what piping its output to head produces.
New loop.py:
#!/usr/bin/env python
import sys

def doSomethingIntensive():
    # actually do something intensive here
    # that doesn't print to stdout
    pass

i = 0
while True:
    # printing to stderr so output is not piped
    print >> sys.stderr, (
        "Starting some calculation that "
        "doesn't print to stdout")
    doSomethingIntensive()
    print >> sys.stderr, "About to print line " + str(i)
    print "This is line " + str(i)
    print >> sys.stderr, "Finished printing line " + str(i)
    i += 1
And the output:
$ ./loop.py | head
Starting some calculation that doesn't print to stdout
About to print line 0
Finished printing line 0
Starting some calculation that doesn't print to stdout
About to print line 1
Finished printing line 1
Starting some calculation that doesn't print to stdout
About to print line 2
Finished printing line 2
...
About to print line 247
Finished printing line 247This is line 0
This is line 1
This is line 2
This is line 3
This is line 4
This is line 5
This is line 6
This is line 7
This is line 8
This is line 9
Starting some calculation that doesn't print to stdout
About to print line 248
Finished printing line 248
...
About to print line 487
Finished printing line 487
Starting some calculation that doesn't print to stdout
About to print line 488
Traceback (most recent call last):
  File "./loop.py", line 18, in <module>
    print "This is line " + str(i)
IOError: [Errno 32] Broken pipe
I have hidden part of the output and left only the relevant parts. In essence, the output shows that the standard input/output streams of head (and, I assume, of all processes) are buffered.
According to this answer on SO, when the receiver (head) terminates, the pipe is broken, and *only when the sender (loop.py) tries to write to the now-broken pipe* is a SIGPIPE signal sent to it.
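The point that nothing happens to the sender until it writes again can be checked with a plain os.pipe() in Python 3. This is a minimal sketch, assuming a Unix-like system; `write_lines` is a hypothetical helper. CPython ignores SIGPIPE by default, so the failed write surfaces as BrokenPipeError (the Python 3 counterpart of the IOError with Errno 32 seen in the tracebacks here).

```python
import os

def write_lines(fd, count):
    """Write up to count lines to fd; return how many writes succeeded."""
    written = 0
    try:
        for i in range(count):
            os.write(fd, ("This is line %d\n" % i).encode())
            written += 1
    except BrokenPipeError:
        # The reader is gone; this is only noticed at write time.
        pass
    return written

read_fd, write_fd = os.pipe()
before = write_lines(write_fd, 3)  # reader still open: all writes succeed
os.close(read_fd)                  # reader exits; the writer is not notified yet
after = write_lines(write_fd, 3)   # the very first write now fails
os.close(write_fd)
print(before, after)  # 3 0
```

Closing the read end has no visible effect on the writer by itself; the error only appears on the next write attempt.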
In the run above, when head finally got the chance to print its output, all of it appeared at once, but only after loop.py had continued for another 247 lines. (This has to do with process scheduling.) Moreover, after head had printed its output but before it terminated, the scheduler resumed loop.py, so another ~250 lines (up to 488) were written to the pipe before the pipe was broken.
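The buffering effect can be reproduced in isolation. The sketch below, assuming Python 3 on a Unix-like system, wraps the write end of a pipe in a buffered text stream and checks what has actually reached the pipe before and after an explicit flush; `pending` and `out` are illustrative names.

```python
import os

read_fd, write_fd = os.pipe()
os.set_blocking(read_fd, False)   # so an empty pipe does not block the read
out = os.fdopen(write_fd, "w")    # block-buffered, since a pipe is not a terminal

def pending(fd):
    """Return whatever bytes are currently sitting in the pipe."""
    try:
        return os.read(fd, 4096)
    except BlockingIOError:
        return b""

print("This is line 0", file=out)              # stays in the stream's buffer
before = pending(read_fd)
print("This is line 1", file=out, flush=True)  # flush pushes both lines out
after = pending(read_fd)

out.close()
os.close(read_fd)
print(before, after)
```

The first line only reaches the pipe once the buffer is flushed, which is why a reader such as head may see nothing for a while and then receive many lines at once.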
For better results, we can use unbuffered I/O (in this case, unbuffered output from loop.py). Invoking the Python interpreter with the -u option, we get:
$ python -u loop.py | head
Starting some calculation that doesn't print to stdout
About to print line 0
Finished printing line 0This is line 0
Starting some calculation that doesn't print to stdout
About to print line 1
Finished printing line 1This is line 1
Starting some calculation that doesn't print to stdout
About to print line 2
Finished printing line 2This is line 2
Starting some calculation that doesn't print to stdout
About to print line 3
Finished printing line 3This is line 3
Starting some calculation that doesn't print to stdout
About to print line 4
Finished printing line 4This is line 4
Starting some calculation that doesn't print to stdout
About to print line 5
Finished printing line 5This is line 5
Starting some calculation that doesn't print to stdout
About to print line 6
Finished printing line 6This is line 6
Starting some calculation that doesn't print to stdout
About to print line 7
Finished printing line 7This is line 7
Starting some calculation that doesn't print to stdout
About to print line 8
Finished printing line 8This is line 8
Starting some calculation that doesn't print to stdout
About to print line 9
Finished printing line 9
This is line 9
Starting some calculation that doesn't print to stdout
About to print line 10
Traceback (most recent call last):
  File "loop.py", line 18, in <module>
    print "This is line " + str(i)
IOError: [Errno 32] Broken pipe
Of course, this is simple if your program is written in Python, since you don't need to modify the code. However, if it is written in C and you have the source for it, you can use the setvbuf() function from stdio.h to set stdout as unbuffered:
loop.c:
#include <stdio.h>
#include <stdlib.h>

#define TRUE 1

/* Overflows unsigned long for large n; the value only simulates work. */
unsigned long factorial(int n)
{
    return (n == 0) ? 1 : n * factorial(n - 1);
}

void doSomethingIntensive(int n)
{
    fprintf(stderr, "%4d: %18lu\n", n, factorial(n));
}

int main(void)
{
    int i;

    /* the important line: setvbuf() returns nonzero on failure */
    if (setvbuf(stdout, NULL, _IONBF, 0) != 0)
        fprintf(stderr, "Error setting buffer size.\n");

    for (i = 0; TRUE; i++)
    {
        doSomethingIntensive(i);
        printf("This is line %d\n", i);
    }
    return 0;
}