PowerShell equivalent for wget switches “-nc” and “-i” [duplicate]

1

What is the PowerShell equivalent of this wget command?

wget -nc -i downloadList.txt

Where

-i downloadList.txt
downloads the list of URLs in the specified file.

-nc
skips files that have already been downloaded.

    
by Renuka 04.10.2015 / 09:10

2 answers

1

You can use not only PowerShell cmdlets but also .NET classes.

For the -nc part, read the contents of the file and keep only the unique strings with cat (alias for Get-Content) and sort (alias for Sort-Object). Then run wget (alias for Invoke-WebRequest) on that list of strings, taking the output file name from each URL with GetFileName. This only removes duplicate entries from the list; it does not check whether a file is already on disk.

cat downloadList.txt | sort -Unique | foreach {wget $_ -OutFile ([System.IO.Path]::GetFileName($_))}
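
To also skip files that are already on disk, the list can be filtered with Test-Path before anything is requested. The following is a minimal sketch, not a built-in feature: Get-PendingUrl and its parameters are hypothetical names, and it assumes the last segment of each URL is a usable file name.

```powershell
# Hypothetical helper (not a built-in cmdlet): returns the URLs from a list file
# whose target file does not exist yet in the destination folder.
function Get-PendingUrl {
    param(
        [Parameter(Mandatory = $true)] [string] $ListPath,
        [string] $Destination = '.'
    )
    Get-Content -LiteralPath $ListPath |
        Where-Object { $_.Trim() } |      # ignore blank lines
        Sort-Object -Unique |             # drop duplicate URLs
        Where-Object {
            # Same naming rule as the one-liner: last segment of the URL
            $name = [System.IO.Path]::GetFileName($_)
            $target = [System.IO.Path]::Combine($Destination, $name)
            -not (Test-Path -LiteralPath $target)
        }
}

# Usage (performs real downloads, so it is left commented out):
# Get-PendingUrl -ListPath downloadList.txt | ForEach-Object {
#     Invoke-WebRequest -Uri $_ -OutFile ([System.IO.Path]::GetFileName($_))
# }
```

Keeping the filter separate from the download makes it possible to preview what would be fetched before any request is sent.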
    
answered 04.10.2015 / 11:05
1

There is no PowerShell switch that behaves exactly like -nc.

It does not just prevent a file from being overwritten. It also checks whether the destination file already exists and, if so, does not start a second download. The whole point of -nc is to prevent the actual download.

-nc

--no-clobber

If a file is downloaded more than once in the same directory, Wget's behavior depends on a few options, including -nc. In certain cases, the local file will be clobbered, or overwritten, upon repeated download. In other cases it will be preserved.

When running Wget without -N, -nc, or -r, downloading the same file in the same directory will result in the original copy of file being preserved and the second copy being named file.1. If that file is downloaded yet again, the third copy will be named file.2, and so on. When -nc is specified, this behavior is suppressed, and Wget will refuse to download newer copies of file. Therefore, "no-clobber" is actually a misnomer in this mode. It's not clobbering that's prevented (as the numeric suffixes were already preventing clobbering), but rather the multiple version saving that's prevented.

When running Wget with -r, but without -N or -nc, re-downloading a file will result in the new copy simply overwriting the old. Adding -nc will prevent this behavior, instead causing the original version to be preserved and any newer copies on the server to be ignored.

When running Wget with -N, with or without -r, the decision as to whether or not to download a newer copy of a file depends on the local and remote timestamp and size of the file. -nc may not be specified at the same time as -N.

Note that when -nc is specified, files with the suffixes .html or (yuck) .htm will be loaded from the local disk and parsed as if they had been retrieved from the Web.
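
To make the two naming rules in the excerpt concrete, here is a small sketch of the decision for a plain download without -r or -N. Resolve-DownloadTarget is a hypothetical helper written for illustration; it is not part of wget or PowerShell.

```powershell
# Hypothetical helper illustrating the two naming rules from the manual excerpt.
# Returns the path a download should be written to, or $null to skip it.
function Resolve-DownloadTarget {
    param(
        [Parameter(Mandatory = $true)] [string] $Path,
        [switch] $NoClobber
    )
    # Nothing to clash with: use the name as-is
    if (-not (Test-Path -LiteralPath $Path)) { return $Path }
    # -nc: keep the local copy and skip the download
    if ($NoClobber) { return $null }
    # Default rule: file.1, file.2, ... until an unused name is found
    $n = 1
    while (Test-Path -LiteralPath "$Path.$n") { $n++ }
    return "$Path.$n"
}
```

A caller would skip the request entirely whenever the function returns $null, which is the part a plain overwrite check does not give you.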

So in PowerShell V3 you have to mimic this behavior yourself. In a nutshell:

  • Get the base names (without extension) of all files in a given folder (the download destination)
  • Get all URLs from a given text file (downloadList.txt)
  • Compare the two lists, using the last path segment of each URL as its name, and pick out the missing URLs
  • Pass only the missing URLs to Invoke-WebRequest and append html as the extension

$folder = "D:\my\folder"
gc "D:\downloadList.txt" |
    where { (dir $folder).BaseName -notcontains [IO.Path]::GetFileNameWithoutExtension($_) } |
    foreach { wget $_ -OutFile "$folder\$([IO.Path]::GetFileNameWithoutExtension($_)).html" }

And ungolfed:

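$folder = "D:\my\folder"
# Base names (no extension) of the files already in the download folder
$exists = @(Get-ChildItem $folder | foreach { $_.BaseName })
$urls = Get-Content "D:\downloadList.txt"
# Keep only the URLs whose last path segment has no matching file yet
$missing = $urls | where { $exists -notcontains [System.IO.Path]::GetFileNameWithoutExtension($_) }
# Save each missing URL under its base name, with html appended as the extension
$missing | foreach { Invoke-WebRequest -Uri $_ -OutFile "$folder\$([System.IO.Path]::GetFileNameWithoutExtension($_)).html" }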
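
One possible refinement, sketched under assumptions: if the same URL appears twice in the list, tracking the claimed names in a HashSet makes sure it is requested only once. Select-NewUrl is a hypothetical function name, and the sketch keeps the naming rule from the steps above (base name of the last URL segment, saved with an html extension).

```powershell
# Hypothetical helper: yields each URL once, skipping names that already exist
# in $Folder or that an earlier URL in the same list has already claimed.
function Select-NewUrl {
    param(
        [Parameter(Mandatory = $true)] [AllowEmptyString()] [string[]] $Urls,
        [Parameter(Mandatory = $true)] [string] $Folder
    )
    # Case-insensitive set of base names that are already taken
    $taken = New-Object 'System.Collections.Generic.HashSet[string]' ([System.StringComparer]::OrdinalIgnoreCase)
    Get-ChildItem -LiteralPath $Folder | ForEach-Object { [void] $taken.Add($_.BaseName) }
    foreach ($url in $Urls) {
        $name = [System.IO.Path]::GetFileNameWithoutExtension($url)
        # HashSet.Add returns $false when the name is already present
        if ($name -and $taken.Add($name)) { $url }
    }
}

# Usage (performs real downloads, so it is left commented out):
# $folder = "D:\my\folder"
# Select-NewUrl -Urls (Get-Content "D:\downloadList.txt") -Folder $folder | ForEach-Object {
#     $name = [System.IO.Path]::GetFileNameWithoutExtension($_)
#     Invoke-WebRequest -Uri $_ -OutFile "$folder\$name.html"
# }
```

Because the set is filled from the folder first and then updated while the list is walked, existing files and duplicate list entries are handled by the same check.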
    
answered 04.10.2015 / 11:12