There is an example showing that crawling a quarter of a billion web pages is possible in under two days, given the following setup:
"More precisely, I crawled 250,113,669 pages for just under 580 dollars in 39 hours and 25 minutes, using 20 Amazon EC2 machine instances."
Instance | vCPU | Memory (GiB) | Storage | Network performance (Gbit/s)
a1.xlarge | 4 | 8 | EBS only | Up to 10
Across the 20 instances this gives 80 vCPUs and 160 GiB of RAM in total. The crawl used 500 gigabytes of outgoing bandwidth for the HTTP requests, downloaded 1.69 terabytes of content, and ran 2,800 agents.
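To make these figures easier to compare, the following is a rough back-of-the-envelope sketch that derives throughput, cost, and average page size from the numbers quoted above. It assumes that the 580 dollars is an upper bound ("just under"), that "terabytes" means decimal units (10^12 bytes), and that the load was spread evenly across instances and agents. The function name `crawl_summary` and the dictionary keys are made up for illustration.

```python
# Figures taken from the quote and the notes above.
PAGES = 250_113_669
COST_USD = 580                      # "just under 580 dollars": treated as an upper bound
DURATION_S = 39 * 3600 + 25 * 60    # 39 hours and 25 minutes
INSTANCES = 20
AGENTS = 2800
DOWNLOADED_BYTES = 1.69e12          # assumption: decimal terabytes

# Per-instance values from the a1.xlarge row of the table.
VCPU_PER_INSTANCE = 4
RAM_GIB_PER_INSTANCE = 8


def crawl_summary():
    """Derive aggregate and per-unit figures, assuming an even load."""
    pages_per_second = PAGES / DURATION_S
    return {
        "total_vcpu": INSTANCES * VCPU_PER_INSTANCE,
        "total_ram_gib": INSTANCES * RAM_GIB_PER_INSTANCE,
        "pages_per_second": pages_per_second,
        "pages_per_second_per_instance": pages_per_second / INSTANCES,
        "pages_per_second_per_agent": pages_per_second / AGENTS,
        "usd_per_million_pages": COST_USD / (PAGES / 1_000_000),
        "avg_bytes_per_page": DOWNLOADED_BYTES / PAGES,
    }


if __name__ == "__main__":
    for name, value in crawl_summary().items():
        print(f"{name}: {value:,.2f}")
```

Under these assumptions the crawl averages roughly 1,760 pages per second overall, a little over 2 dollars per million pages, and slightly under 7 kB of downloaded content per page.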
"According to this presentation by Googler Jeff Dean, as of November 2010 Google was indexing “tens of billions of pages”."