From Jim Gilliam's blog archives
Crawling Inefficiencies
April 18, 2003 2:07 AM
Thanks to Microdoc News for pointing out an error in my post yesterday about Grub's distributed crawling requiring twice as much bandwidth. The local client can tell whether a page has changed, and it doesn't send unchanged data back to Grub. That is definitely better bandwidth utilization than I expected. I assume the client compares a checksum of the file against one supplied by Grub; otherwise Grub would have to send its own copy of the page to the client for comparison.
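Here is a rough sketch of the checksum idea. It is not Grub's actual protocol, which I haven't seen. The function names, the report format, and the choice of SHA-256 are all assumptions made up for illustration. The point is only that the client needs a short digest of the previous crawl, not the previous page itself, to decide whether anything is worth sending back.

```python
import hashlib


def checksum(content: bytes) -> str:
    """Digest of a page body. SHA-256 is an assumption; any stable hash would do."""
    return hashlib.sha256(content).hexdigest()


def crawl_report(url: str, fetched: bytes, known_checksums: dict) -> dict:
    """Build the (hypothetical) report a client would send back to the server.

    known_checksums maps URL -> digest from the previous crawl, as supplied
    by the server. If the digest matches, only a small "unchanged" notice is
    returned; the page body is included only when the content differs.
    """
    digest = checksum(fetched)
    if known_checksums.get(url) == digest:
        return {"url": url, "changed": False}
    return {"url": url, "changed": True, "checksum": digest, "content": fetched}


if __name__ == "__main__":
    known = {"http://example.com/": checksum(b"old page body")}
    print(crawl_report("http://example.com/", b"old page body", known)["changed"])
    print(crawl_report("http://example.com/", b"new page body", known)["changed"])
```

With a scheme like this, an unchanged page costs the client one download plus a tiny notice upstream, instead of a download followed by a full re-upload.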