Try the following sed script.
Contents of infile:
odd even
one test of bigrams
Contents of script.sed:
## [[:blank:]] matches a space or a tab.
## This command deletes all of them from the line.
s/[[:blank:]]*//g
## Label 'b'.
:b
## Copy line to 'hold space'.
h
## Keep the first two bigrams (characters 1-2 and 3-4) as 'xx -> yy'.
s/\(..\)\(..\).*/\1 -> \2/
## If the last substitution succeeded, jump to label 'a'.
ta
## Otherwise the substitution failed: fewer than four characters remain,
## so no pair of bigrams can be extracted. Branch to the end of the script,
## which reads the next line.
b
## Label 'a'
:a
## Print.
p
## Copy 'hold space' into 'pattern space'.
g
## Delete first character.
s/^.//
## Go to label 'b' to repeat the loop.
tb
Run the script:
sed -nf script.sed infile
Result:
od -> de
dd -> ev
de -> ve
ev -> en
on -> et
ne -> te
et -> es
te -> st
es -> to
st -> of
to -> fb
of -> bi
fb -> ig
bi -> gr
ig -> ra
gr -> am
ra -> ms
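To try the same program on other input without creating script.sed, it could be wrapped in a small shell function. This is only a sketch: the function name bigram_pairs is hypothetical, and it assumes the bracket expression is meant to match a space or a tab, written here as [[:blank:]].

```shell
# Hypothetical helper: reads lines on stdin and prints each pair of
# adjacent bigrams as 'xx -> yy', using the same sed commands as script.sed.
bigram_pairs() {
  sed -n '
    # Assumption: remove spaces and tabs before extracting bigrams.
    s/[[:blank:]]*//g
    :b
    # Save the current line in the hold space.
    h
    # Keep characters 1-2 and 3-4, drop the rest.
    s/\(..\)\(..\).*/\1 -> \2/
    ta
    # Fewer than four characters left: go on to the next line.
    b
    :a
    p
    # Restore the saved line and drop its first character.
    g
    s/^.//
    tb
  '
}

printf 'odd even\n' | bigram_pairs
```

For the first line of infile this prints the same four pairs as the result above, from od -> de through ev -> en.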