Claims
1. An automated machine-implemented method for crawling dynamic content, the method comprising:
2. The method of claim 1, further comprising:
3. The method of claim 2, wherein the state information is a cookie.
4. A volatile or non-volatile machine-readable storage medium storing one or more sequences of instructions which, when executed by one or more processors, causes the one or more processors to perform the method recited in claim 2.
5. The method of claim 2, wherein the state information is a session identifier.
6. A volatile or non-volatile machine-readable storage medium storing one or more sequences of instructions which, when executed by one or more processors, causes the one or more processors to perform the method recited in claim 5.
7. A volatile or non-volatile machine-readable storage medium storing one or more sequences of instructions which, when executed by one or more processors, causes the one or more processors to perform the method recited in claim 2.
8. The method of claim 1, wherein the first page includes dynamically generated content.
9. A volatile or non-volatile machine-readable storage medium storing one or more sequences of instructions which, when executed by one or more processors, causes the one or more processors to perform the method recited in claim 8.
10. The method of claim 1, wherein:
11. The method of claim 10, wherein the first link includes form data responsive to the form.
12. A volatile or non-volatile machine-readable storage medium storing one or more sequences of instructions which, when executed by one or more processors, causes the one or more processors to perform the method recited in claim 11.
13. The method of claim 10, further comprising:
14. The method of claim 13, wherein sending the encoded form data to the server at which the first page is stored comprises sending the encoded form data set as part of a POST transaction.
15. A volatile or non-volatile machine-readable storage medium storing one or more sequences of instructions which, when executed by one or more processors, causes the one or more processors to perform the method recited in claim 14.
16. A volatile or non-volatile machine-readable storage medium storing one or more sequences of instructions which, when executed by one or more processors, causes the one or more processors to perform the method recited in claim 13.
17. A volatile or non-volatile machine-readable storage medium storing one or more sequences of instructions which, when executed by one or more processors, causes the one or more processors to perform the method recited in claim 10.
18. The method of claim 1, further comprising maintaining the first data and the second data in a database at a web crawler or a search engine.
19. A volatile or non-volatile machine-readable storage medium storing one or more sequences of instructions which, when executed by one or more processors, causes the one or more processors to perform the method recited in claim 18.
20. The method of claim 1, further comprising:
21. A volatile or non-volatile machine-readable storage medium storing one or more sequences of instructions which, when executed by one or more processors, causes the one or more processors to perform the method recited in claim 20.
22. The method of claim 1, wherein determining that the first prerequisite page must be visited prior to visiting the first page occurs in response to, during a first attempt to visit the first page, determining that the first link is a dead link.
23. The method of claim 22, wherein determining that the first link is a dead link occurs in response to receiving an error page during the first attempt to visit the first page.
24. A volatile or non-volatile machine-readable storage medium storing one or more sequences of instructions which, when executed by one or more processors, causes the one or more processors to perform the method recited in claim 23.
25. The method of claim 22, wherein determining that the first link is a dead link occurs in response to being redirected to a general page during the first attempt to visit the first page.
26. A volatile or non-volatile machine-readable storage medium storing one or more sequences of instructions which, when executed by one or more processors, causes the one or more processors to perform the method recited in claim 25.
27. A volatile or non-volatile machine-readable storage medium storing one or more sequences of instructions which, when executed by one or more processors, causes the one or more processors to perform the method recited in claim 22.
28. The method of claim 1, wherein each of the steps of claim 1 are performed by a web crawler.
29. A volatile or non-volatile machine-readable storage medium storing one or more sequences of instructions which, when executed by one or more processors, causes the one or more processors to perform the method recited in claim 28.
30. A volatile or non-volatile machine-readable storage medium storing one or more sequences of instructions which, when executed by one or more processors, causes the one or more processors to perform the method recited in claim 1.
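
For illustration only, the dead-link determination of claims 22, 23 and 25 and the form encoding of claim 14 might be sketched as follows. This is a minimal sketch, not the claimed implementation: the names `FetchResult`, `is_dead_link`, `encode_form_data` and `general_pages` are hypothetical, an "error page" is approximated as an HTTP status of 400 or above, and a "general page" is approximated as a redirect target whose path appears in a caller-supplied set.

```python
from dataclasses import dataclass
from urllib.parse import urlencode, urlsplit


@dataclass
class FetchResult:
    """Outcome of one attempt to visit a link (hypothetical structure)."""
    requested_url: str  # the link the crawler tried to visit
    final_url: str      # URL actually served, after any redirects
    status: int         # HTTP status code of the final response


def is_dead_link(result, general_pages):
    """Return True if the first attempt suggests the link is dead.

    Two signals are checked (both are assumptions about how the
    claims could be realized):
    - an error page was received, approximated as status >= 400;
    - the request was redirected to a general page, approximated as
      a final URL whose path is listed in `general_pages`.
    """
    if result.status >= 400:
        return True
    redirected = result.final_url != result.requested_url
    return redirected and urlsplit(result.final_url).path in general_pages


def encode_form_data(fields):
    """Encode form fields as a body suitable for a POST transaction."""
    return urlencode(fields).encode("ascii")
```

A crawler following this sketch would treat a `True` result from `is_dead_link` as the trigger for visiting the prerequisite page first (for example, to obtain a cookie or session identifier) before retrying the original link.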