Complex text alignment in bash

6

Given this input:

# Lines starting with # stay the same
# Empty lines stay the same
# only lines with comments should change

ls  # show all major directories
              # and other things

cd      # The cd command - change directory  
            # will allow the user to change between file directories

touch             # The touch command, the make file command 
                # allows users to make files using the Linux CLI #  example, cd ~

bar foo baz # foo foo foo

I need to keep the lines that start with # and the lines that contain no comments unchanged, but align all the other comments in the same column.

Desired output:

# Lines starting with # stay the same
# Empty lines stay the same
# Only lines with # in middle should change and be aligned

ls              # show all major directories
                # and other things

cd              # The cd command - change directory  
                # will allow the user to change between file directories

touch           # The touch command, the make file command 
                # allows users to make files using the Linux CLI #  example, cd ~

bar foo baz     # foo foo foo

Here is what I have so far:

# Building an array out of input
 while IFS=$'\n' read -r; do 
    lines+=("$REPLY") 
 done 

# Looping through array and selecting elements that need change
for i in "${lines[@]}"
  do
    if  [[ ${i:0:1} == ';' || $i != *";"* ]];
      then
        echo "DOESNT CHANGE: #### $i"
    else 
        echo "HAS TO CHANGE: #### $i"
        array+=( "${i%%";"*}" );
        array2+=("${i##";"}")
    fi
done

# Trying to find the longest line to decide how much space I need to add for each element
max=0

for n in "${array[@]}" ; do
    ((${#n} > max)) && max=${#n}
    echo  "Length:" ${#n} ${n}
done

#Longest line
echo $max

# Loop for populating array 
for j in "${!array2[@]}" ; do
    echo "${array2[j]} " | sed -e "s/;/$(echo "-%20s ;") /g" 
done

I feel like I am doing too much. I think there should be an easier way to solve this problem.

    
asked by Caleb 14.05.2018 / 06:53

3 answers

14

If none of your commands and arguments contain # or one other character (say, the ASCII character given by byte 1), you can insert that other character as an extra separator and use column to align the comments (see this answer). So, something like:

$ sed $'s/#/\001#/' input-file | column -ets $'\001'
# Lines starting with # stay the same
# Empty lines stay the same
# only lines with comments should change

ls                                        # show all major directories
                                          # and other things

cd                                        # The cd command - change directory
                                          # will allow the user to change between file directories

touch                                     # The touch command, the make file command
                                          # allows users to make files using the Linux CLI #  example, cd ~

bar foo baz                               # foo foo foo

If your column does not support -e to avoid eliminating empty lines, you can add something to the empty lines (for example, a space or the separator character used above):

$ sed $'s/#/\001#/;s/^$/\001/' input-file | column -ts $'\001'
answered 14.05.2018 / 07:10
5

Text processing with the shell alone is a bit awkward and can be error-prone (see "Why is using a shell loop to process text considered bad practice?"). It is generally better to use other programming languages for tasks like these.

perl -ne 'if (/^([^#]+?)\s*#(.*)$/) { printf("%-16s#%s\n", $1, $2) } else { print }' file

This uses Perl to capture the part in front of the # (discarding any whitespace between the last word and the #) and the part after it. If the match succeeds, it allots 16 character positions to the text and prints the formatted text followed by the comment. If the match fails (because the line was blank or started with a #), the line is printed unmodified.

# Lines starting with # stay the same
# Empty lines stay the same
# only lines with comments should change

ls              # show all major directories
                # and other things

cd              # The cd command - change directory
                # will allow the user to change between file directories

touch           # The touch command, the make file command
                # allows users to make files using the Linux CLI #  example, cd ~

bar foo baz     # foo foo foo
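
The 16 in %-16s is a fixed width that happens to fit this input. As a hedged sketch, not part of the answer above, the same pattern can be applied in two passes so that the width comes from the data instead. It is written in Python; align_comments and its gap parameter are hypothetical names, and the regular expression is copied from the Perl one-liner.

```python
import re

# Same pattern as the Perl one-liner: text before the first '#',
# optional whitespace, then the comment body.
COMMENT_RE = re.compile(r'^([^#]+?)\s*#(.*)$')


def align_comments(lines, gap=1):
    """Return lines with trailing comments padded to one shared column.

    gap is a hypothetical knob: the number of blanks between the longest
    code fragment and the comment column.
    """
    lines = [line.rstrip('\n') for line in lines]
    matches = [COMMENT_RE.match(line) for line in lines]
    # Pass 1: the longest code fragment (trailing blanks ignored) sets the width.
    width = max((len(m.group(1).rstrip()) for m in matches if m), default=0) + gap
    # Pass 2: pad matching lines; everything else passes through unchanged.
    return [
        f"{m.group(1).rstrip():<{width}}#{m.group(2)}" if m else line
        for line, m in zip(lines, matches)
    ]


if __name__ == "__main__":
    sample = ["# header", "", "ls  # list", "      # more", "bar foo baz # foo"]
    print("\n".join(align_comments(sample)))
```

With the question's input the longest fragment is bar foo baz (11 characters), so gap=5 would reproduce the 16-column layout shown above.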
    
answered 14.05.2018 / 07:31
3

Here is a Python script that should do what you want:

#!/usr/bin/env python
# -*- encoding: ascii -*-
"""align.py"""

import re
import sys

# Read the data from the file into a list
lines = []
with open(sys.argv[1], 'r') as textfile:
    lines = textfile.readlines()

# Iterate through the data once to get the maximum indentation
max_indentation = 0
comment_block = False
for line in lines:

    # Check for the end of a comment block
    if comment_block:
        if not re.match(r'^\s*#.*$', line):
            comment_block = False

    # Check for the beginning of a comment block
    else:
        if re.match(r'^[^#]*[^ #].*#.*$', line):
            comment_block = True
            indentation = line.index('#')
            max_indentation = max(max_indentation, indentation)

# Iterate through the data a second time and output the reformatted text
comment_block = False
for line in lines:
    if comment_block:
        if re.match(r'^\s*#.*$', line):
            line = ' ' * max_indentation + line.lstrip()
        else:
            comment_block = False
    else:
        if re.match(r'^[^#]*[^ #].*#.*$', line):
            pre, sep, suf = line.partition('#')
            line = pre.ljust(max_indentation) + sep + suf
            comment_block = True

    sys.stdout.write(line)

Run it like this:

python align.py input.txt

It produces the following output:

# Lines starting with # stay the same
# Empty lines stay the same
# only lines with comments should change

ls                # show all major directories
                  # and other things

cd                # The cd command - change directory  
                  # will allow the user to change between file directories

touch             # The touch command, the make file command 
                  # allows users to make files using the Linux CLI #  example, cd ~

bar foo baz       # foo foo foo
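
The script takes its column from where # already sits in the input (line.index('#')), which is why the comments above end up further right than in the desired output. A minimal sketch of the difference, assuming one blank of padding; both helper names and the gap parameter are made up for illustration and are not part of the script:

```python
def column_as_found(line):
    """Offset of the first '#', i.e. what line.index('#') measures."""
    return line.index('#')


def column_tight(line, gap=1):
    """Length of the code before '#' without trailing blanks, plus a gap."""
    return len(line.partition('#')[0].rstrip()) + gap


sample = [
    "ls  # show all major directories",
    "touch" + " " * 13 + "# The touch command",
    "bar foo baz # foo foo foo",
]

found = max(column_as_found(line) for line in sample)
tight = max(column_tight(line) for line in sample)
print(found, tight)  # prints: 18 12
```

Measuring the stripped prefix instead of the existing offset would make the result independent of how the input happened to be indented.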
    
answered 14.05.2018 / 07:40