Patent US7801881 - Sitemap generating client for web crawler
http://www.google.co.uk/patents/US7801881?utm_source=gb-gplus-share

Methods and systems for a sitemap-generating client for web crawlers are described. The client accesses one or more sources of information about the documents available on a website, such as the file system, access logs, or pre-made URL lists. Document information is extracted from the sources...
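
The flow the abstract describes (read several sources, extract document information, combine it) can be illustrated with a minimal Python sketch. This is not the patented implementation. The abstract is truncated before it says how the extracted information is used, so emitting XML in the sitemaps.org format is an assumption here. All function names (`urls_from_filesystem`, `urls_from_access_log`, `urls_from_list`, `build_sitemap`) are hypothetical, and the access log is assumed to be in Common Log Format.

```python
import os
import re
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from urllib.parse import quote

# Namespace defined by the sitemaps.org protocol (assumed output format).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def urls_from_filesystem(doc_root, base_url):
    """Source 1: map each file under doc_root to a URL and its last-modified date."""
    found = {}
    for dirpath, _dirs, files in os.walk(doc_root):
        for name in files:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, doc_root).replace(os.sep, "/")
            mtime = datetime.fromtimestamp(os.path.getmtime(full), tz=timezone.utc)
            found[base_url.rstrip("/") + "/" + quote(rel)] = mtime.strftime("%Y-%m-%d")
    return found


# Request and status fields of a Common Log Format line,
# e.g. "GET /a.html HTTP/1.1" 200
_REQUEST = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[\d.]+" (\d{3})')


def urls_from_access_log(lines, base_url):
    """Source 2: collect paths that were served successfully (status 200)."""
    found = {}
    for line in lines:
        match = _REQUEST.search(line)
        if match and match.group(2) == "200":
            found.setdefault(base_url.rstrip("/") + match.group(1), None)
    return found


def urls_from_list(lines):
    """Source 3: a pre-made list with one absolute URL per line."""
    return {line.strip(): None for line in lines if line.strip()}


def build_sitemap(*sources):
    """Merge the per-source {url: lastmod or None} dicts into one sitemap XML string."""
    merged = {}
    for source in sources:
        for url, lastmod in source.items():
            # Keep a date that is already known; fill it in if it was missing.
            if merged.get(url) is None:
                merged[url] = lastmod
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in sorted(merged):
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        if merged[url]:
            ET.SubElement(entry, "lastmod").text = merged[url]
    return ET.tostring(urlset, encoding="unicode")
```

Each source returns the same shape, a dict from URL to an optional last-modified date, so `build_sitemap` can merge any combination of them. A call such as `build_sitemap(urls_from_filesystem("/var/www", "http://example.com"), urls_from_list(open("urls.txt")))` would combine two sources; the paths and host in that call are placeholders.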