Longest file name in a large directory

Question

#1 answer by Ravexina (4 votes)
#2 answer by steeldriver (3 votes)

4

In a large directory on my Ubuntu system (> 140000 files and > 200 subdirectories), I know that somewhere there are two files whose names are too long to be copied to a Windows (NTFS) folder. I tried the copy and got two error messages, but I did not note which subfolders the files were in.

How can I find the two files with the longest names?

command-line bash

by Jos 30.06.2017 / 22:34

2 answers

3

Based on the comments, what you really need in this case is a list of all files whose names are longer than some maximum number of characters. Fortunately, that is relatively easy with a find regex:

find "$PWD" -regextype posix-extended -regex '.*[^/]{255,}$'
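The threshold in that regex can be made a parameter. The following is only a sketch, assuming GNU find (for -regextype) and bash; the function name names_at_least and its arguments are hypothetical, not part of the original answer. Because -regex has to match the whole path, the trailing run of non-slash characters is the base name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list every path under DIR whose last component
# (the base name) has at least LIMIT characters. Assumes GNU find.
names_at_least() {
  local dir=${1:-.} limit=${2:-255}
  # -regex must match the whole path, so the final [^/]{LIMIT,} run
  # can only be the base name.
  find "$dir" -regextype posix-extended -regex ".*[^/]{${limit},}"
}
```

For example, names_at_least /some/dir 200 would list names of 200 characters or more, which may be a more useful cut-off when the destination imposes its own limits.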

For such a large number of files and directories, you will probably want to avoid sorting. Instead, just keep a running record of the longest and second-longest file names and their full paths:

find "$PWD" -printf '%p\0' | awk -v RS='\0' '
  {
    # get the length of the basename of the current filepath
    n = split($0,a,"/");
    currlen = length(a[n]);

    if (currlen > l[1]) {
      # bump the current longest to 2nd place
      l[2] = l[1]; p[2] = p[1];
      # store the new 1st place length and pathname
      l[1] = currlen; p[1] = $0;
    }
    else if (currlen > l[2]) {
      # store the new 2nd place length and pathname
      l[2] = currlen; p[2] = $0;
    }
  }

  END {
      for (i in l) printf "(%d) %d : %s\n", i, l[i], p[i];
  }'

or with GNU awk (which supports 2D arrays)

find "$PWD" -printf '%p\0' | gawk -v RS='\0' '
  {
    # get the length of the basename of the current filepath
    n = split($0,a,"/");
    currlen = length(a[n]);

    if (currlen > p[1][1]) {
      # bump the current longest to 2nd place
      p[2][1] = p[1][1]; p[2][2] = p[1][2];
      # store the new 1st place length and pathname
      p[1][1] = currlen; p[1][2] = $0;
    }
    else if (currlen > p[2][1]) {
      # store the new 2nd place length and pathname
      p[2][1] = currlen; p[2][2] = $0;
    }
  }

  END {
      # p[i][1] is the length, p[i][2] the pathname
      for (i in p) printf "(%d) %d : %s\n", i, p[i][1], p[i][2];
  }'
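The same running-record idea (one pass, no sorting) can also be sketched in plain bash. This is an illustration only: the function name two_longest is hypothetical, and a shell loop will likely be noticeably slower than awk on 140000 files.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the two longest base names under DIR,
# one per line, as "<length><TAB><path>", longest first.
two_longest() {
  local dir=${1:-.} path name
  local len1=0 len2=0 path1='' path2=''
  # NUL-delimited input keeps paths with spaces or newlines intact
  while IFS= read -r -d '' path; do
    name=${path##*/}              # strip everything up to the last slash
    if (( ${#name} > len1 )); then
      len2=$len1 path2=$path1     # bump the current longest to 2nd place
      len1=${#name} path1=$path   # store the new 1st place
    elif (( ${#name} > len2 )); then
      len2=${#name} path2=$path   # store the new 2nd place
    fi
  done < <(find "$dir" -print0)
  printf '%d\t%s\n' "$len1" "$path1" "$len2" "$path2"
}
```

Calling two_longest "$PWD" scans the current directory tree; only two lengths and two paths are held in memory at any time.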

by steeldriver 01.07.2017 / 14:22

score 4 · Accepted Answer

I think @steeldriver's solution is the better choice, but here is my alternative: you can combine a few commands to find exactly the two (or more) longest file names.

find . | awk 'function base(f){sub(".*/", "", f); return f;} \
{print length(base($0)), $0}'| sort -nr | head -2

The output looks like:

length ./path/to/file

Here is a real example:

42 ./path/to/this-file-got-42-character-right-here.txt
31 ./path/to/this-file-got-31-character.txt

Notes

find gives us a list of all files within that directory, like:

./path/to/this-file-got-31-character.txt

Using awk, we prepend the length of the file name to each line (this is the length of the base name only, not of the whole path):

31 ./path/to/this-file-got-31-character.txt

Finally, we sort numerically by that length, in descending order, and take the first two lines using head.
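The steps in these notes can be wrapped in a small function so that the number of results is adjustable. This is a sketch under assumptions: the name longest_names and its arguments are hypothetical, and because the pipeline is line-based, file names that contain newlines would be split across lines.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the N longest base names under DIR,
# longest first, as "<length> <path>".
longest_names() {
  local dir=${1:-.} n=${2:-2}
  find "$dir" |
    # prefix each path with the length of its base name
    awk '{ name = $0; sub(/.*\//, "", name); print length(name), $0 }' |
    sort -nr |        # numeric sort, largest length first
    head -n "$n"      # keep only the top N lines
}
```

For instance, longest_names . 5 would show the five longest names below the current directory.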