Downloading files from a site with JavaScript links

1

Sometimes I come across sites that publish content (files) as JavaScript links. When links are published with the traditional <a href="..."> construct, it is easy to parse the HTML, find the link, and download the content. Even applications like Acrobat can handle this and generate a PDF of the relevant area of a site.

Not so with JavaScript links.

Here is an example of a site that has content (public access, no login or password required) but uses JavaScript links.

How does one download the PDF files here programmatically?

link

There are tabs for each year; take this one for 2013.

link

There are several hundred links here, but short of clicking each one, I cannot find any way to discover the target and download them.

    
by amrith 04.02.2014 / 13:02

3 answers

1

Two options come to mind (neither of them Java):

  1. Write a JavaScript bookmarklet that you can click in your browser to scrape the DOM elements after the page you want to copy has loaded and its JS has run. This will work, but it will not scale to a large number of pages.

  2. Use a headless browser such as link, link or link

by 04.02.2014 / 13:18
0

You can find it with the browser's developer console, by watching the network traffic.

The URL is http://www.oml.ago.state.ma.us/default.aspx, with some POST parameters:

Host: www.oml.ago.state.ma.us
User-Agent: [...]
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: fr,fr-fr;q=0.8,en-us;q=0.5,en;q=0.3
Accept-Encoding: gzip, deflate
DNT: 1
Referer: http://www.oml.ago.state.ma.us/
Cookie: [...]
Connection: keep-alive
Content-Type: application/x-www-form-urlencoded
Content-Length: 5713

__EVENTTARGET=ctl00%24ContentPlaceHolder1%24grdOML%24ctl02%24lnkOpenFile&__EVENTARGUMENT=&__VIEWSTATE=%2FwEPDwUKLTI3MjY2NDEzNg9kFgJmD2QWAgIDD2QWAgIBD2QWCAIBD2QWAmYPFgIeBFRleHQFgQM8dGFibGUgd2lkdGg9JzcwJScgY2VsbHBhZGRpbmc9JzInIGNlbGxzcGFjaW5nPScyJyBib3JkZXI9JzAnPjx0cj48dGQgYmdjb2xvcj0nI2RjZGNkMCdhbGlnbj0nbGVmdCcgdmFsaWduPSdtaWRkbGUnY2xhc3M9J25hdmlnYXRpb25UZXh0J3dpZHRoPSc0MCUnPjxiPjxhIGhyZWY9J0RlZmF1bHQuYXNweD9zZWN0aW9uPTAnPkJyb3dzZSBPTUwgRGV0ZXJtaW5hdGlvbnM8L2E%2BPC9iPjwvdGQ%2BPHRkIGJnY29sb3I9JyNmMGYwZTgnYWxpZ249J2xlZnQnIHZhbGlnbj0nbWlkZGxlJ2NsYXNzPSduYXZpZ2F0aW9uVGV4dCd3aWR0aD0nMzUlJz48YSBocmVmPSdTZWFyY2guYXNweD9zZWN0aW9uPTEnPlNlYXJjaCBPTUwgRGV0ZXJtaW5hdGlvbnM8L2E%2BPC90ZD48L3RyPjwvdGFibGU%2BZAIDD2QWAmYPFgIfAAWbBjx0YWJsZSB3aWR0aD0nMTAwJScgY2VsbHBhZGRpbmc9JzInIGNlbGxzcGFjaW5nPScyJyBib3JkZXI9JzAnPjx0cj48dGQgYmdjb2xvcj0nI2RjZGNkMCdhbGlnbj0nbGVmdCcgdmFsaWduPSd0b3AnY2xhc3M9J25hdmlnYXRpb25UZXh0J3dpZHRoPScyMiUnPjxiPjxhIGhyZWY9J0RlZmF1bHQuYXNweD9zZWN0aW9uWWVhcj0wJnllYXI9MjAxNCc%2BMjAxNDwvYT48L2I%2BPC90ZD48dGQgYmdjb2xvcj0nI2YwZjBlOCdhbGlnbj0nbGVmdCcgdmFsaWduPSd0b3AnY2xhc3M9J25hdmlnYXRpb25UZXh0J3dpZHRoPScxOS41JSc%2BPGEgaHJlZj0nRGVmYXVsdC5hc3B4P3NlY3Rpb25ZZWFyPTEmeWVhcj0yMDEzJz4yMDEzPC9hPjwvdGQ%2BPHRkIGJnY29sb3I9JyNmMGYwZTgnYWxpZ249J2xlZnQnIHZhbGlnbj0ndG9wJ2NsYXNzPSduYXZpZ2F0aW9uVGV4dCd3aWR0aD0nMTkuNSUnPjxhIGhyZWY9J0RlZmF1bHQuYXNweD9zZWN0aW9uWWVhcj0yJnllYXI9MjAxMic%2BMjAxMjwvYT48L3RkPjx0ZCBiZ2NvbG9yPScjZjBmMGU4J2FsaWduPSdsZWZ0JyB2YWxpZ249J3RvcCdjbGFzcz0nbmF2aWdhdGlvblRleHQnd2lkdGg9JzE5LjUlJz48YSBocmVmPSdEZWZhdWx0LmFzcHg%2Fc2VjdGlvblllYXI9MyZ5ZWFyPTIwMTEnPjIwMTE8L2E%2BPC90ZD48dGQgYmdjb2xvcj0nI2YwZjBlOCdhbGlnbj0nbGVmdCcgdmFsaWduPSd0b3AnY2xhc3M9J25hdmlnYXRpb25UZXh0J3dpZHRoPScxOS41JSc%2BPGEgaHJlZj0nRGVmYXVsdC5hc3B4P3NlY3Rpb25ZZWFyPTQmeWVhcj0yMDEwJz4yMDEwPC9hPjwvdGQ%2BPC90cj48L3RhYmxlPmQCBQ8QZA8WAWYWARAFDy0tUHJpb3IgWWVhcnMtLQUPLS1QcmlvciBZZWFycy0tZxYBZmQCBw88KwANAQAPFgQeC18hRGF0YUJvdW5kZx4LXyFJdGVtQ291bnQCDWQWAmYPZBYcAgEPZBYKZg9kFgICAQ8PFgQfAAUKMDEvMzEvMjAxNB4PQ29tbWFuZEFyZ3VtZW50BV
5PTUwtMjAxNC03LVNlZWtvbmstQW5pbWFsLVNoZWx0ZXItQnVpbGRpbmctQ29tbWl0dGVlLWFuZC1TZWVrb25rLUJvYXJkLW9mLVNlbGVjdG1lbi5wZGY7Mzg0NzM3ZGQCAQ8PFgIfAAUKT01MIDIwMTQtN2RkAgIPDxYCHwAFKVNlZWtvbmsgQW5pbWFsIFNoZWx0ZXIgQnVpbGRpbmcgQ29tbWl0dGVlZGQCAw8PFgIfAAUFTG9jYWxkZAIEDw8WAh8ABQYmbmJzcDtkZAICD2QWCmYPZBYCAgEPDxYEHwAFCjAxLzI3LzIwMTQfAwUwT01MLTIwMTQtNi1Ib2xsYW5kLUJvYXJkLW9mLVNlbGVjdG1lbi5wZGY7Mzg2Njg2ZGQCAQ8PFgIfAAUKT01MIDIwMTQtNmRkAgIPDxYCHwAFGkhvbGxhbmQgQm9hcmQgb2YgU2VsZWN0bWVuZGQCAw8PFgIfAAUFTG9jYWxkZAIEDw8WAh8ABQdIb2xsYW5kZGQCAw9kFgpmD2QWAgIBDw8WBB8ABQowMS8yNy8yMDE0HwMFLU9NTC0yMDE0LTUtTG9uZ21lYWRvdy1TZWxlY3QtQm9hcmQucGRmOzM4MDc4OGRkAgEPDxYCHwAFCk9NTCAyMDE0LTVkZAICDw8WAh8ABRdMb25nbWVhZG93IFNlbGVjdCBCb2FyZGRkAgMPDxYCHwAFBUxvY2FsZGQCBA8PFgIfAAUKTG9uZ21lYWRvd2RkAgQPZBYKZg9kFgICAQ8PFgQfAAUKMDEvMjcvMjAxNB8DBTQxLTI3LTE0LUVzc2V4LUJvYXJkLW9mLVNlbGVjdG1lbl9SZWRhY3RlZC5wZGY7MzkxMTg4ZGQCAQ8PFgIfAAUHMS0yNy0xNGRkAgIPDxYCHwAFGEVzc2V4IEJvYXJkIG9mIFNlbGVjdG1lbmRkAgMPDxYCHwAFBUxvY2FsZGQCBA8PFgIfAAUFRXNzZXhkZAIFD2QWCmYPZBYCAgEPDxYEHwAFCjAxLzI3LzIwMTQfAwU%2BMS0yNy0xNC1TdHVyYnJpZGdlLUNvbnNlcnZhdGlvbi1Db21taXNzaW9uX1JlZGFjdGVkLnBkZjszODk1MzdkZAIBDw8WAh8ABQcxLTI3LTE0ZGQCAg8PFgIfAAUiU3R1cmJyaWRnZSBDb25zZXJ2YXRpb24gQ29tbWlzc2lvbmRkAgMPDxYCHwAFBUxvY2FsZGQCBA8PFgIfAAUKU3R1cmJyaWRnZWRkAgYPZBYKZg9kFgICAQ8PFgQfAAUKMDEvMjEvMjAxNB8DBTlPTUwtMjAxNC00LU1hc3NhY2h1c2V0dHMtQm9hcmQtb2YtQm9pbGVyLVJ1bGVzLnBkZjszODA4MTVkZAIBDw8WAh8ABQpPTUwgMjAxNC00ZGQCAg8PFgIfAAUVQm9hcmQgb2YgQm9pbGVyIFJ1bGVzZGQCAw8PFgIfAAUFU3RhdGVkZAIEDw8WAh8ABQZCb3N0b25kZAIHD2QWCmYPZBYCAgEPDxYEHwAFCjAxLzIxLzIwMTQfAwUyMS0yMS0xNC1DYW1icmlkZ2UtQ2l0eS1Db3VuY2lsX1JlZGFjdGVkLnBkZjszODY4MjhkZAIBDw8WAh8ABQcxLTIxLTE0ZGQCAg8PFgIfAAUWQ2FtYnJpZGdlIENpdHkgQ291bmNpbGRkAgMPDxYCHwAFBUxvY2FsZGQCBA8PFgIfAAUJQ2FtYnJpZGdlZGQCCA9kFgpmD2QWAgIBDw8WBB8ABQowMS8yMS8yMDE0HwMFOTEtMjEtMTQtU3R1cmJyaWRnZS1Cb2FyZC1vZi1TZWxlY3RtZW5fUmVkYWN0ZWQucGRmOzM5NDIzOWRkAgEPDxYCHwAFBzEtMjEtMTRkZAICDw8WAh8ABR1TdHVyYnJpZGdlIEJvYXJkIG9mIFNlbGVjdG1lbmRkAgMPDxYCHwAFBUxvY2FsZGQCBA8PFgIfAAUKU3R1cmJyaWRn
ZWRkAgkPZBYKZg9kFgICAQ8PFgQfAAUKMDEvMjEvMjAxNB8DBUcxLTIxLTE0LVByb3ZpbmNldG93bi1IaXN0b3JpY2FsLURpc3RyaWN0LUNvbW1pc3Npb25fUmVkYWN0ZWQucGRmOzM3NTgxNGRkAgEPDxYCHwAFBzEtMjEtMTRkZAICDw8WAh8ABSlQcm92aW5jZXRvd24gSGlzdG9yaWMgRGlzdHJpY3QgQ29tbWlzc2lvbmRkAgMPDxYCHwAFBUxvY2FsZGQCBA8PFgIfAAUMUHJvdmluY2V0b3duZGQCCg9kFgpmD2QWAgIBDw8WBB8ABQowMS8xMy8yMDE0HwMFMU9NTC0yMDE0LTMtRWdyZW1vbnQtQm9hcmQtb2YtU2VsZWN0bWVuLnBkZjszNzgyMTdkZAIBDw8WAh8ABQpPTUwgMjAxNC0zZGQCAg8PFgIfAAUbRWdyZW1vbnQgQm9hcmQgb2YgU2VsZWN0bWVuZGQCAw8PFgIfAAUFTG9jYWxkZAIEDw8WAh8ABQhFZ3JlbW9udGRkAgsPZBYKZg9kFgICAQ8PFgQfAAUKMDEvMTMvMjAxNB8DBUxPTUwtMjAxNC0yLU1pbnV0ZW1hbi1SZWdpb25hbC1UZWNobmljYWwtU2Nob29sLURpc3RyaWN0LUNvbW1pdHRlZS5wZGY7MzcwMzcxZGQCAQ8PFgIfAAUKT01MIDIwMTQtMmRkAgIPDxYCHwAFOE1pbnV0ZW1hbiBSZWdpb25hbCBWb2NhdGlvbmFsIFRlY2huaWNhbCBTY2hvb2wgQ29tbWl0dGVlZGQCAw8PFgIfAAURUmVnaW9uYWwvRGlzdHJpY3RkZAIEDw8WAh8ABQYmbmJzcDtkZAIMD2QWCmYPZBYCAgEPDxYEHwAFCjAxLzEzLzIwMTQfAwU3MS0xMy0xNC1Bc2hmaWVsZC1Cb2FyZC1vZi1TZWxlY3RtZW5fUmVkYWN0ZWQucGRmOzM3MDI2NmRkAgEPDxYCHwAFBzEtMTMtMTRkZAICDw8WAh8ABRVBc2hmaWVsZCBTZWxlY3QgQm9hcmRkZAIDDw8WAh8ABQVMb2NhbGRkAgQPDxYCHwAFCEFzaGZpZWxkZGQCDQ9kFgpmD2QWAgIBDw8WBB8ABQowMS8wMi8yMDE0HwMFNU9NTC0yMDE0LTEtQm94Zm9yZC1ab25pbmctQm9hcmQtb2YtQXBwZWFscy5wZGY7MzY1NTAzZGQCAQ8PFgIfAAUKT01MIDIwMTQtMWRkAgIPDxYCHwAFH0JveGZvcmQgWm9uaW5nIEJvYXJkIG9mIEFwcGVhbHNkZAIDDw8WAh8ABQVMb2NhbGRkAgQPDxYCHwAFB0JveGZvcmRkZAIODw8WAh4HVmlzaWJsZWhkZBgBBSBjdGwwMCRDb250ZW50UGxhY2VIb2xkZXIxJGdyZE9NTA88KwAKAQgCAWQqRlzk94heDgb756WGG3iXbo2UvA%3D%3D&__EVENTVALIDATION=%2FwEWFAKH5NrcBAKbtOHzBQKO7pzyCQKY2J3zAwLlxIrvAwK9oZrLDQKN6YqwCgLFgqFvAsSCpY4JAq2B6YkBAsSC7agLAsuCkcwDAsqClesJArOBmZ0MAq6BncQIAuXatqUOAoDbuswKAoDbnm8C59qijgkCxNrmiQFU8mZCmbVka60Kj%2BqgzpL%2Fbfuz8A%3D%3D
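
The body above follows the ASP.NET WebForms postback pattern: the hidden form fields (__VIEWSTATE, __EVENTVALIDATION) are sent back together with an __EVENTTARGET naming the clicked link. Here is a minimal Python sketch of how such a body could be rebuilt for every link on a page. It assumes the links look like href="javascript:__doPostBack('target','argument')", which is not shown in the capture; the names PostBackPageParser and build_post_bodies are made up for this illustration.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlencode


class PostBackPageParser(HTMLParser):
    """Collect hidden form fields and __doPostBack link targets from a page."""

    # Assumed link shape: href="javascript:__doPostBack('target','argument')"
    POSTBACK_RE = re.compile(r"__doPostBack\('([^']*)','([^']*)'\)")

    def __init__(self):
        super().__init__()
        self.hidden_fields = {}  # e.g. __VIEWSTATE, __EVENTVALIDATION
        self.postbacks = []      # list of (event_target, event_argument)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and attrs.get("type") == "hidden" and attrs.get("name"):
            self.hidden_fields[attrs["name"]] = attrs.get("value") or ""
        elif tag == "a":
            match = self.POSTBACK_RE.search(attrs.get("href") or "")
            if match:
                self.postbacks.append(match.groups())


def build_post_bodies(page_html):
    """Return one URL-encoded POST body per postback link found in the page."""
    parser = PostBackPageParser()
    parser.feed(page_html)
    bodies = []
    for target, argument in parser.postbacks:
        fields = dict(parser.hidden_fields)   # copy the page's hidden state
        fields["__EVENTTARGET"] = target      # which link was "clicked"
        fields["__EVENTARGUMENT"] = argument
        bodies.append(urlencode(fields))
    return bodies
```

Each returned string has the same shape as the captured body (for example, `$` in the target becomes `%24`). The hidden field values have to come from a fresh copy of the page, since they change with the page state.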

Trying to hide a URL in a public document is always stupid and pointless. It also breaks navigation (for example, you cannot simply open it in a new tab...).

    
by 04.02.2014 / 13:11
0

Thanks to @sebcap26 for pointing me in the right direction.

I think the solution is:

wget http://www.oml.ago.state.ma.us/default.aspx --post-data="parameters"
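
For scripting this over several hundred links, the same POST can be sent from Python's standard library. This is a rough sketch, not a tested downloader for this site: the function names and the output path are made up, `form_body` stands for the already URL-encoded "parameters" string, and whether the server also needs the session cookie from the captured request is not established here.

```python
import urllib.request

URL = "http://www.oml.ago.state.ma.us/default.aspx"  # URL from the answer above


def make_postback_request(form_body):
    """Build a POST request carrying an already URL-encoded form body."""
    return urllib.request.Request(
        URL,
        data=form_body.encode("ascii"),
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Referer": "http://www.oml.ago.state.ma.us/",
        },
        method="POST",
    )


def download(form_body, out_path):
    """Send the POST and write the response bytes (expected: a PDF) to out_path."""
    with urllib.request.urlopen(make_postback_request(form_body)) as resp:
        with open(out_path, "wb") as out:
            out.write(resp.read())
```

Calling `download(body, "document.pdf")` once per link, with a different __EVENTTARGET in each body, would then fetch the files one by one.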
    
by 04.02.2014 / 13:30