Using a table or script in sed to replace many special characters with escape characters?

Question

There are several ways to replace special characters with sed, but my problem is that I have to replace many (100+) special characters with escape characters, across many files.

So I need: (thanks Peter)

^^ to escape a single ^, ^| to escape |, \& to escape &, and \/ to escape /
\\ to escape \

Suppose you have more than 100 example strings like these, spread over many files:

sed.exe -i "s/{\*)(//123/
sed -i "s/\/123/g;" 1.txt
sed.exe -i "s/{\*)(//123/
sed -i "s/\/123/g;" 1.txt
.....
.....

These strings contain many special characters that need escaping (there are more than 100 strings).
Escaping them by hand takes a very long time. So I need to create a table-driven script, similar to wReplace, that I can call from the command prompt to escape the special characters and then replace the strings with my own words.
How can I do that?

command-line sed

by user143822 10.07.2012 / 16:15

1 answer

score 2 · Accepted Answer

Note that ^^ for ^, ^| for |, ^& for & ... are not a sed requirement. The ^ escape character is required by the CMD shell. If your text is exposed neither to the command line nor to a command parameter in a .cmd / .bat script, you only need to consider sed's own escape character, which is a backslash \ . These are two quite separate scopes (which can overlap, so it is always best to keep everything within sed's scope, as is done below).
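To illustrate the two scopes, here is a minimal sketch for a POSIX shell. The file names demo.sed and demo.txt and the sample text are made up for the example. Because the substitution command lives in a script file that sed reads with -f, no shell parses it, so only sed's backslash escaping is involved.

```shell
#!/bin/sh
# Sketch: keep the sed command in a script file so that no shell
# (CMD or sh) gets a chance to interpret its special characters.
# demo.sed and demo.txt are hypothetical names used only here.
workdir=$(mktemp -d)

# The quoted heredoc delimiter ('EOF') stops the shell from touching
# the contents. Only the "/" delimiter needs a sed escape; "|" and "&"
# are ordinary characters in the find part of a basic regex.
cat > "$workdir/demo.sed" <<'EOF'
s/a\/b|c&d/X/g
EOF

printf '%s\n' 'see a/b|c&d here' > "$workdir/demo.txt"

# sed reads the command from the file, so only its own "\" escaping applies.
result=$(sed -f "$workdir/demo.sed" "$workdir/demo.txt")
printf '%s\n' "$result"

rm -rf "$workdir"
```

The same script file should be usable from a .cmd / .bat file via sed -f without adding any ^ escapes, since CMD never sees the characters inside it.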

Here is a sed script that will replace any number of search strings you specify with their corresponding replacement strings. The general format of the strings is a cross between a sed substitution command ( s/abc/xyz/p ) and a tabular format. You can "stretch" the middle delimiter to align things.
You can use a FIXED string pattern ( F/.../.../ ) or a normal sed-style regular expression pattern ( s/.../.../ ), and you can adjust sed -n and each /p (in table.txt) as needed.
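As a rough sketch of what treating a string as FIXED involves, the two helper functions below escape a literal find string and a literal replacement string for use in s/find/repl/ with "/" as the delimiter. The names escape_find and escape_repl are hypothetical, the character classes are an assumption of this example rather than a copy of the script's, and the sketch only handles single-line strings in basic regular expressions.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the answer's script.

# Escape a literal string for the FIND part of s/find/repl/ .
# "|" is used as the delimiter of this helper command so that "/" and "\"
# can be listed in the bracket expression without extra escaping issues.
escape_find() {
    printf '%s\n' "$1" | sed 's|[][\\/.*^$]|\\&|g'
}

# Escape a literal string for the REPLACEMENT part:
# only "\", "/" (the delimiter) and "&" are special there.
escape_repl() {
    printf '%s\n' "$1" | sed 's|[\\/&]|\\&|g'
}

find='a/b*c.d'
repl='1&2/3'

# Build the substitution from the escaped pieces and apply it.
result=$(printf '%s\n' 'see a/b*c.d here' |
    sed "s/$(escape_find "$find")/$(escape_repl "$repl")/")
printf '%s\n' "$result"
```

An 'F' line in the table asks the script to do this kind of escaping for you, while an 's' line assumes you have already written a valid sed regular expression.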

You need 3 files for a minimal run (plus a 4th, which is generated dynamically from table.txt):

the main script table-to-regex.sed

the table file table.txt

the target file file-to-change.txt

the derived script table-derrived.sed

To run a table against a target file:

sed -nf table-to-regex.sed table.txt > table-derrived.sed
# Here, check 'table-derrived.sed' for errors as described in the example *table.txt*.
sed -nf table-derrived.sed file-to-change.txt
# Redirect *sed's* output via '>' or '>>' as need be, or use 'sed -i -nf'

If you want to run table.txt against many files, just put the snippet above in a simple loop to suit your needs. I can do this trivially in bash, but someone more familiar with the Windows CMD shell would be better placed than I am to set that up.
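For a POSIX shell such as bash, such a loop might look like the sketch below. The function name apply_derived and the FILE.out naming are assumptions made for this example. Output goes to a separate file on purpose: with sed -n and /p flags only the substituted lines are printed, so overwriting the originals would drop every non-matching line.

```shell
#!/bin/sh
# apply_derived SCRIPT FILE...
# Hypothetical helper: runs an already generated sed script against each
# file and writes the result next to it as FILE.out, leaving FILE untouched.
apply_derived() {
    script=$1
    shift
    for f in "$@"; do
        # Same invocation as the single-file run, repeated per file.
        sed -nf "$script" "$f" > "$f.out" || return 1
    done
}
```

A typical call, once the derived script has been generated and checked by hand, could be apply_derived table-derrived.sed *.txt .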

Here is the script: table-to-regex.sed

s/[[:space:]]*$//  # remove trailing whitespace
/^$\|^[[:space:]]*#/{p; b}  # empty and sed-style comment lines: print and branch
                            # printing keeps line numbers; for referencing errors
/^\([Fs]\)\(.\)\(.*\2\)\{4\}/{  # too many delims ERROR
     s/^/# error + # /p    # print a flagged/commented error
     b }                   # branch
/^\([Fs]\)\(.\)\(.*\2\)\{3\}/{  # this may be a long-form 2nd delimiter
     /^\([Fs]\)\(.\)\(.*\2[[:space:]]*\2.*\2\)/{  # is long-form 2nd delimiter OK?
         s/^\([Fs]\)\(.\)\(.*\)\2[[:space:]]*\2\(.*\)\2\(.*\)/\1\2\n\3\n\4\n\5/
         t OK              # branch on true to :OK
     }; s/^/# error L # /p # print a flagged/commented error
     b }                   # branch: long-form 2nd delimiter ERROR
/^\([Fs]\)\(.\)\(.*\2\)\{2\}/{  # this may be short-form delimiters
     /^\([Fs]\)\(.\)\(.*\2.*\2\)/{  # is short-form delimiters OK?
         s/^\([Fs]\)\(.\)\(.*\)\2\(.*\)\2\(.*\)/\1\2\n\3\n\4\n\5/
         t OK              # branch on true to :OK
     }; s/^/# error S # /p # print a flagged/commented error
     b }                   # branch: short-form delimiters ERROR
{    s/^/# error - # /p    # print a flagged/commented error
     b }                   # branch: too few delimiters ERROR
:OK                        # delimiters are okay
#============================
h   # copy the pattern-space to the hold space
# NOTE: /^s/ lines are considered to contain regex patterns, not FIXED strings.
/^s/{    s/^s\(.\)\n/s\1/  # shrink long-form delimiter to short-form
     :s; s/^s\(.\)\([^\n]*\)\n/s\1\2\1/; t s  # branch on true to :s
     p; b }                # print and branch
# The following code handles FIXED-string /^F/ lines
s/^F.\n\([^\n]*\)\n.*/\1/  # isolate the literal find-string in the pattern-space
s/[]\/$*.^|[]/\\&/g        # convert the literal find-string into a regex of itself
H                          # append \n + find-regex to the hold-space
g   # Copy the modified hold-space back into the pattern-space
s/^F.\n[^\n]*\n\([^\n]*\)\n.*/\1/  # isolate the literal repl-string in the pattern-space
s/[\/&]/\\&/g              # convert the literal repl-string into a regex of itself
H                          # append \n + repl-regex to the hold-space
g   # Copy the modified hold-space back into the pattern-space
# Rearrange pattern-space into a / delimited command: s/find/repl/...
s/^\(F.\)\n\([^\n]*\)\n\([^\n]*\)\n\([^\n]*\)\n\([^\n]*\)\n\([^\n]*\)$/s\/\5\/\6\/\4/
p   # Print the modified find-and-replace regular expression line

Here is an example table file, with a description of how it works: table.txt

# The script expects an input table file, which can contain
#   comment, blank, and substitution lines. The text you are
#   now reading is part of an input table file.
# Comment lines begin with optional whitespace followed by #
# Each substitution line must start with: 's' or 'F'
#  's' lines are treated as a normal 'sed' substitution regular expressions
#  'F' lines are considered to contain 'FIXED' (literal) string expressions
# The 's' or 'F' must be followed by the 1st of 3 delimiters
#   which must not appear elsewhere on the same line.
# A pre-test is performed to ensure conformity. Lines with
#   too many or too few delimiters, or no 's' or 'F', are flagged
#   with the text '# error ? #', which effectively comments them out.
#   '?' can be: '-' too few, '+' too many, 'L' long-form, 'S' short-form
#   Here is an example of a long-form error, as it appears in the output.
# error L # s/example/(7+3)/2=5/
# 1st delimiter, eg '/' must be a single character.
# 2nd (middle) delimiter has two possible forms:
#   Either it is exactly the same as the 1st delimiter: '/' (short-form)
#   or it has a double-form for column alignment: '/      /' (long-form)
#   The long-form can have any amount of whitespace between the 2 '/'s
# 3rd delimiter must be the same as the 1st delimiter,
# After the 3rd delimiter, you can put any of sed's
#   substitution commands, eg. 'g'
# With one condition, a trailing '#' comment to 's' and 'F' lines is
#   valid. The condition is that no delimiter character can be in the
#   comment (delimiters must not appear elsewhere on the same line)
# For 's' type lines, it is implied that *you* have included all the
#   necessary sed-escape characters! The script does not add any
#   sed-escape characters for 's' type lines. It will, however,
#   convert a long-form middle-delimiter into a short-form delimiter.
# For 'F' type lines, it is implied that both strings (find and replace)
#   are FIXED/literal-strings. The script does add the necessary
#   sed-escape characters for 'F' type lines. It will also
#   convert a long-form middle-delimiter into a short-form delimiter.
# The result is a sed-script which contains one sed-substitution
#   statement per line; it is just a modified version of your
#   's' and 'F' strings "table" file.
# Note that the 1st delimiter is *always* in column 2.
# Here are some sample 's' and 'F' lines, with comments:
#
F/abc/ABC/gp              #-> These 3 are the same for 's' and 'F',
s/abc/ABC/gp              #-> as no characters need to be escaped,
s/abc/    /ABC/gp         #-> and the 2nd delimiter shrinks to one
F/^F=Fixed/    /okay/p    # is okay here, It is a FIXED literal
s|^s=sed regex||FAIL|p    # will FAIL: back-reference not defined!
F|\\|////|                # this line == next line
F|\\|    |////|p          # this line == previous line
s|\\|    |////|p          # this line is different; 's' vs 'F'
F_Hello! ^.&'//\*$/['{'$";"'_    _Ciao!_    # literal find / replace

Here is an example input file whose text you want to change: file-to-change.txt

abc
abc
^F=Fixed
s=sed regex
\\
\\
\\
\\
Hello! ^.&'//\*$/['{'$";"'
some non-matching text