A method (100) of crawling the Web (620) is disclosed. The method (100) crawls (120) Web pages on the Web starting from a given (110) set of seed Universal Resource Locators (URLs). Crawled Web pages are partitioned (140) into sets of relevant and irrelevant pages. A set of exclusion and/or inclusion...

Patent US7379932 - System and a method for focused re-crawling of Web sites
http://www.google.com/patents/US7379932?utm_source=gb-gplus-share
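
The first two steps named in the abstract, crawling from a set of seed URLs and partitioning the crawled pages into relevant and irrelevant sets, can be illustrated with a minimal sketch. This is not the patented method: the names `crawl`, `partition`, `toy_fetch`, `TOY_WEB`, and `max_pages` are hypothetical, the keyword test stands in for a relevance criterion that the abstract does not specify, and the exclusion/inclusion step is left out because the abstract is truncated before describing it. The sketch uses an in-memory link graph instead of real HTTP requests.

```python
from collections import deque
from typing import Callable, Dict, Iterable, List, Set, Tuple

# Hypothetical page representation: (page text, outgoing links).
Page = Tuple[str, List[str]]


def crawl(seeds: Iterable[str],
          fetch: Callable[[str], Page],
          max_pages: int = 100) -> Dict[str, str]:
    """Breadth-first crawl from the seed URLs; returns {url: page text}."""
    queue = deque(dict.fromkeys(seeds))  # de-duplicate seeds, keep order
    seen: Set[str] = set(queue)
    pages: Dict[str, str] = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        try:
            text, links = fetch(url)
        except KeyError:
            continue  # page missing from the toy Web; skip it
        pages[url] = text
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages


def partition(pages: Dict[str, str],
              is_relevant: Callable[[str], bool]) -> Tuple[Set[str], Set[str]]:
    """Split crawled URLs into (relevant, irrelevant) sets."""
    relevant = {url for url, text in pages.items() if is_relevant(text)}
    irrelevant = set(pages) - relevant
    return relevant, irrelevant


# Toy in-memory "Web" (an assumption for illustration only).
TOY_WEB: Dict[str, Page] = {
    "http://example.com/": ("home page about crawlers",
                            ["http://example.com/a", "http://example.com/b"]),
    "http://example.com/a": ("focused crawler design notes",
                             ["http://example.com/"]),
    "http://example.com/b": ("cooking recipes",
                             ["http://example.com/missing"]),
}


def toy_fetch(url: str) -> Page:
    """Look the page up in TOY_WEB; raises KeyError for unknown URLs."""
    return TOY_WEB[url]


pages = crawl(["http://example.com/"], toy_fetch)
# Assumed relevance test: the page text mentions "crawler".
relevant, irrelevant = partition(pages, lambda text: "crawler" in text)
```

Passing `fetch` and `is_relevant` as callables keeps the crawl loop independent of how pages are retrieved or judged, so a real HTTP fetcher or a trained classifier could be substituted without changing the loop.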