It's not strictly grep alone, but this does the trick:
while IFS= read -r pattern; do
    grep "$pattern" input | awk -v drug="$pattern" 'BEGIN {OFS="\t"} { print drug,$0}'
done < "patterns"
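The loop above treats each pattern as a regular expression and also matches it inside longer words. As a minimal sketch of a stricter variant, assuming whole-word, fixed-string matching is what is wanted, `-w` and `-F` can be added to the grep call. The scratch directory and file names below are hypothetical and only reproduce the sample data from the question.

```shell
# Scratch copies of the sample files (hypothetical paths, for illustration only).
tmp=$(mktemp -d)
printf '%s\n' DrugA DrugB DrugC DrugD > "$tmp/patterns"
printf '%s\n' \
  'In our study, DrugA and DrugB were found to be effective. DrugA was more effective than DrugB.' \
  'In our study, DrugC was found to be effective' \
  'In our study, DrugX was found to be effective' > "$tmp/input"

result=$(
  while IFS= read -r pattern; do
    # -w: match whole words only; -F: treat the pattern as a fixed string
    grep -wF -- "$pattern" "$tmp/input" |
      awk -v drug="$pattern" 'BEGIN { OFS = "\t" } { print drug, $0 }'
  done < "$tmp/patterns"
)

# One tab-separated line per (pattern, matching line) pair.
printf '%s\n' "$result"
rm -rf "$tmp"
```

Note that this still reads the input once per pattern, so it may be slow for a long pattern list.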
I need to parse drug names from Medline abstracts. I was hoping to do this by taking the outputs of grep -wf and grep -owf and combining them with paste, but the outputs don't line up, because grep -owf produces a separate output line for every match, even when several matches are on the same input line.
Pattern file:
DrugA
DrugB
DrugC
DrugD
File to parse:
In our study, DrugA and DrugB were found to be effective. DrugA was more effective than DrugB.
In our study, DrugC was found to be effective
In our study, DrugX was found to be effective
Desired output:
DrugA In our study, DrugA and DrugB were found to be effective. DrugA was more effective.
DrugB In our study, DrugA and DrugB were found to be effective. DrugA was more effective.
DrugC In our study, DrugC was found to be effective
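The mismatch described in the question can be reproduced with the sample data. The following is only a sketch: the scratch file names are hypothetical, and it simply counts what each grep invocation prints.

```shell
# Scratch copies of the sample files (hypothetical paths, for illustration only).
tmp=$(mktemp -d)
printf '%s\n' DrugA DrugB DrugC DrugD > "$tmp/patterns"
printf '%s\n' \
  'In our study, DrugA and DrugB were found to be effective. DrugA was more effective than DrugB.' \
  'In our study, DrugC was found to be effective' \
  'In our study, DrugX was found to be effective' > "$tmp/input"

# -w -f prints each matching line once; -c counts those lines.
lines=$(grep -c -w -f "$tmp/patterns" "$tmp/input")

# Adding -o prints every individual match on its own line.
matches=$(grep -o -w -f "$tmp/patterns" "$tmp/input" | wc -l | tr -d ' ')

echo "matching lines: $lines, individual matches: $matches"
rm -rf "$tmp"
```

Because the two counts differ whenever a line contains more than one match, pasting the two outputs side by side cannot pair each drug name with its line.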
An awk way, perhaps?
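awk '
NR == FNR {                 # first file: collect the patterns
    a[$0] = 1
    n = length($0)
    w = n > w ? n : w       # remember the longest pattern for alignment
    next
}
{
    for (i in a)
        if ($0 ~ i)         # each pattern is used as a regular expression
            printf "%-*s %s\n", w, i, $0
}
' pattern_file.txt data_file.txt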
A sed solution:
sed 's|.*|/&/{h;s/^/&\t/p;g}|' pattern_file | sed -nf - input
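To see how this works, it can help to run the first sed on its own: it turns every pattern into one line of a sed script, which the second sed then reads from standard input with -f -. The sketch below feeds two sample patterns in directly instead of reading pattern_file. How the \t escape is rendered in the generated script depends on the sed implementation, so no exact output is shown.

```shell
# Generate the per-pattern sed script for two sample patterns.
generated=$(printf '%s\n' DrugA DrugB | sed 's|.*|/&/{h;s/^/&\t/p;g}|')

# Each line has the form: /PATTERN/{h; s/^/PATTERN<tab>/p; g}
#   h  saves the original line in the hold space,
#   s  prefixes the pattern and prints the result,
#   g  restores the original line so the next pattern sees it unchanged.
printf '%s\n' "$generated"
```

Since the patterns are pasted into /.../ addresses, a pattern containing a slash or regex metacharacters would need escaping first.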
Tags grep text-processing awk sed