Save a web page with all related content

1

I'm trying to figure out how to save a web page together with all of its related files, for example: link

I want to save all the files in the directory, like a crawler but more limited, and in Firefox if possible.

    
by maazza 30.11.2015 / 10:36

1 answer

0

Strangely, the answer was somehow deleted.

Here is the answer:

wget -r -l2 http://docs.oasis-open.org/ubl/os-UBL-2.0/xsd

or

wget -r -np http://docs.oasis-open.org/ubl/os-UBL-2.0/xsd
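
The two variants can also be combined. The following is only a sketch: `mirror_dir` and the `WGET` override are hypothetical names introduced here, not part of Wget. It assumes the URL points at a directory on the server, and it appends a trailing slash because the manual text quoted further down says that matters for `-np`. The example call is a dry run that prints the arguments instead of downloading.

```shell
# Hypothetical helper: recursively fetch one directory without
# climbing to its parent. Not part of Wget itself.
mirror_dir() {
  url=$1
  depth=${2:-2}            # recursion depth, defaults to 2 as in "-l2"

  # Ensure a trailing slash so the last path component is treated
  # as a directory rather than a file name.
  case "$url" in
    */) ;;
    *)  url="$url/" ;;
  esac

  # WGET may be overridden (for example WGET=echo) to print the
  # arguments instead of running the download.
  ${WGET:-wget} -r -np -l "$depth" "$url"
}

# Dry run: prints the arguments that would be passed to wget.
WGET=echo mirror_dir http://docs.oasis-open.org/ubl/os-UBL-2.0/xsd
```

Dropping `WGET=echo` from the last line would run the actual download.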

See the link

‘-np’ ‘--no-parent’ ‘no_parent = on’

The simplest, and often very useful way of limiting directories is disallowing retrieval of the links that refer to the hierarchy above than the beginning directory, i.e. disallowing ascent to the parent directory/directories.

The ‘--no-parent’ option (short ‘-np’) is useful in this case. Using it guarantees that you will never leave the existing hierarchy.

Supposing you issue Wget with:

wget -r --no-parent http://somehost/~luzer/my-archive/

You may rest assured that none of the references to /~his-girls-homepage/ or /~luzer/all-my-mpegs/ will be followed. Only the archive you are interested in will be downloaded. Essentially, ‘--no-parent’ is similar to ‘-I/~luzer/my-archive’, only it handles redirections in a more intelligent fashion.

Note that, for HTTP (and HTTPS), the trailing slash is very important to ‘--no-parent’. HTTP has no concept of a “directory”; Wget relies on you to indicate what’s a directory and what isn’t. In ‘http://foo/bar/’, Wget will consider ‘bar’ to be a directory, while in ‘http://foo/bar’ (no trailing slash), ‘bar’ will be considered a filename (so ‘--no-parent’ would be meaningless, as its parent is ‘/’).
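
To make the trailing-slash rule concrete, here is a small sketch that mimics it with plain string handling. `no_parent_scope` is a hypothetical name, not a Wget feature, and the logic is a simplification that assumes the URL has a path component and no query string. It prints the prefix that `--no-parent` would treat as the top of the hierarchy.

```shell
# Hypothetical illustration of the rule quoted above; Wget's real
# URL handling is more involved than this.
no_parent_scope() {
  url=$1
  case "$url" in
    # Trailing slash: the last component is a directory, so the
    # URL itself is the top of the hierarchy.
    */) printf '%s\n' "$url" ;;
    # No trailing slash: the last component is a file name, so the
    # hierarchy starts at its parent directory.
    *)  printf '%s\n' "${url%/*}/" ;;
  esac
}

no_parent_scope 'http://foo/bar/'   # prints http://foo/bar/
no_parent_scope 'http://foo/bar'    # prints http://foo/
```

Under the same rule, the `xsd` URL used in the answer has no trailing slash, so its scope would be `http://docs.oasis-open.org/ubl/os-UBL-2.0/` rather than the `xsd` directory itself.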

    
by 01.12.2015 / 10:19