Lantern would only broadcast a specific subset of sites. It's essentially UUCP reinvented, so checking page sizes probably won't work, or even be necessary. It's a way for you to not have to worry about data size: it's a broadcast medium like TV or old-school radio, rather than a bidirectional medium like the internet. So the main reason you're asking about this ... well, it won't be a problem. Wikipedia sums it up pretty well.
It's an interesting problem, so I'll give it a shot.
The only way I can think of that might work with some web pages is to use wget (with --spider so that you don't download the page, and --server-response to get the reported file size). This approach was heavily inspired by this SO question.
This works with, say, superuser.com:
[geek@phoebe os store]$ wget -v4 --spider --server-response superuser.com
Spider mode enabled. Check if remote file exists.
--2014-11-28 17:26:35-- http://superuser.com/
Resolving superuser.com (superuser.com)... 198.252.206.16
Connecting to superuser.com (superuser.com)|198.252.206.16|:80... connected.
HTTP request sent, awaiting response...
HTTP/1.1 200 OK
Cache-Control: public, no-cache="Set-Cookie", max-age=60
Content-Length: 71913
Content-Type: text/html; charset=utf-8
Expires: Fri, 28 Nov 2014 09:27:35 GMT
Last-Modified: Fri, 28 Nov 2014 09:26:35 GMT
Vary: *
X-Frame-Options: SAMEORIGIN
Set-Cookie: prov=85f6f157-7e84-43bf-b762-003cf7d8ff71; domain=.superuser.com; expires=Fri, 01-Jan-2055 00:00:00 GMT; path=/; HttpOnly
Date: Fri, 28 Nov 2014 09:26:34 GMT
Length: 71913 (70K) [text/html]
Remote file exists and could contain further links,
but recursion is disabled -- not retrieving.
[geek@phoebe os store]$ wget -v4 --spider --server-response http://superuser.com/questions/845893/is-it-possible-to-determine-through-the-internet-or-google-how-large-a-website/845895#845895
Spider mode enabled. Check if remote file exists.
--2014-11-28 17:26:43-- http://superuser.com/questions/845893/is-it-possible-to-determine-through-the-internet-or-google-how-large-a-website/845895
Resolving superuser.com (superuser.com)... 198.252.206.16
Connecting to superuser.com (superuser.com)|198.252.206.16|:80... connected.
HTTP request sent, awaiting response...
HTTP/1.1 200 OK
Cache-Control: public, no-cache="Set-Cookie", max-age=60
Content-Length: 69163
Content-Type: text/html; charset=utf-8
Expires: Fri, 28 Nov 2014 09:27:43 GMT
Last-Modified: Fri, 28 Nov 2014 09:26:43 GMT
Vary: *
X-Frame-Options: SAMEORIGIN
Set-Cookie: prov=7d270174-a377-4758-bbff-f4c87054de67; domain=.superuser.com; expires=Fri, 01-Jan-2055 00:00:00 GMT; path=/; HttpOnly
Date: Fri, 28 Nov 2014 09:26:42 GMT
Length: 69163 (68K) [text/html]
Remote file exists and could contain further links,
but recursion is disabled -- not retrieving.
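If you only want the number rather than the whole header dump, something like the following might do. It's a minimal sketch, not part of wget: content_length_from_headers is a name I made up, and it assumes the output has the shape shown above, with one Content-Length header per response. Keep in mind that the value covers the HTML document alone, not the images, scripts or stylesheets it links to.

```shell
# Hypothetical helper (not part of wget): read wget --server-response
# output on stdin and print the first Content-Length value it contains.
# Prints nothing if no such header is present.
content_length_from_headers() {
    awk 'tolower($1) == "content-length:" {
             len = $2
             sub(/\r$/, "", len)   # drop a stray carriage return, if any
             print len
             exit
         }'
}

# Usage (wget writes the headers to stderr, hence 2>&1):
#   wget --spider --server-response http://superuser.com/ 2>&1 | content_length_from_headers
```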
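But not with, say, google.com, where the final response uses chunked transfer encoding and wget reports the length as unspecified: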
[geek@phoebe os store]$ wget -v4 --spider --server-response google.com
Spider mode enabled. Check if remote file exists.
--2014-11-28 17:29:06-- http://google.com/
Resolving google.com (google.com)... 74.125.68.113, 74.125.68.138, 74.125.68.100, ...
Connecting to google.com (google.com)|74.125.68.113|:80... connected.
HTTP request sent, awaiting response...
HTTP/1.1 302 Found
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Location: http://www.google.com.sg/?gfe_rd=cr&ei=YkB4VMT6F9iDoAO2tIH4Dw
Content-Length: 262
Date: Fri, 28 Nov 2014 09:29:06 GMT
Server: GFE/2.0
Alternate-Protocol: 80:quic,p=0.02
Location: http://www.google.com.sg/?gfe_rd=cr&ei=YkB4VMT6F9iDoAO2tIH4Dw [following]
Spider mode enabled. Check if remote file exists.
--2014-11-28 17:29:06-- http://www.google.com.sg/?gfe_rd=cr&ei=YkB4VMT6F9iDoAO2tIH4Dw
Resolving www.google.com.sg (www.google.com.sg)... 74.125.68.94
Connecting to www.google.com.sg (www.google.com.sg)|74.125.68.94|:80... connected.
HTTP request sent, awaiting response...
HTTP/1.1 200 OK
Date: Fri, 28 Nov 2014 09:29:06 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
Set-Cookie: PREF=ID=a1dfee7d97d41db1:FF=0:TM=1417166946:LM=1417166946:S=Uzy6MmaLU-UegGZU; expires=Sun, 27-Nov-2016 09:29:06 GMT; path=/; domain=.google.com.sg
Set-Cookie: NID=67=C_dkB1z4qdwwPkNMS80Ek1km-G4y716Evvh2BCEjYpdkpIJSAfXpjpTnSF496UlahPirO0Go-VhVxQjHlsEI_Hf4AxB9IfTyrGFzduyMB4rdTI-nX-kh0hlKhKQCrFg7; expires=Sat, 30-May-2015 09:29:06 GMT; path=/; domain=.google.com.sg; HttpOnly
P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Alternate-Protocol: 80:quic,p=0.02
Transfer-Encoding: chunked
Length: unspecified [text/html]
Remote file exists and could contain further links,
but recursion is disabled -- not retrieving.
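The google.com run also shows a trap for anyone scripting this: the 302 redirect carries its own Content-Length (262), which is the size of the redirect body, not of the page. A rough sketch that only trusts the last response in the output could look like this. final_length is a hypothetical name, and it assumes status lines start with HTTP/ as in the sessions above.

```shell
# Hypothetical helper (not part of wget): read wget --server-response
# output on stdin and print the Content-Length of the LAST response,
# or "unspecified" when that response has none (e.g. chunked encoding).
final_length() {
    awk '
        # A status line such as "HTTP/1.1 302 Found" starts a new response,
        # so forget any length seen in an earlier one (e.g. a redirect).
        $1 ~ /^HTTP\// { len = "" }
        tolower($1) == "content-length:" { len = $2; sub(/\r$/, "", len) }
        END { print (len == "" ? "unspecified" : len) }
    '
}

# Usage (wget writes the headers to stderr, hence 2>&1):
#   wget --spider --server-response google.com 2>&1 | final_length
```

For the two superuser.com runs this would give the same numbers wget prints on its Length: line; for the google.com run it would give "unspecified" instead of the misleading 262.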