HTML: Spider website
Karol Janyst
This class can be used to crawl a site and retrieve the URLs of all its links.
It can retrieve a page of a site and follow all links recursively to retrieve all the site URLs.
The class can restrict the crawling to URLs with a given extension. It avoids accessing pages disallowed in the site's robots.txt file, or pages set with the noindex or nofollow meta tags.
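The crawling behavior described above can be illustrated with a short sketch. This is written in Python and is not the class's own code or API: the names `crawl`, `LinkParser`, and `fetch` are hypothetical, and staying on the start URL's host is an assumption of the sketch. It shows one plausible way to combine link following, an extension filter, robots.txt rules, and the noindex/nofollow meta tags.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.robotparser import RobotFileParser


class LinkParser(HTMLParser):
    """Collects link targets and robots meta directives from one page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            self.directives.update(p.strip().lower() for p in content.split(","))


def crawl(start_url, fetch, robots_txt="", extensions=None):
    """Return the sorted URLs reachable from start_url that may be listed.

    fetch is a hypothetical callable mapping a URL to its HTML text,
    or None when the page cannot be retrieved. The start URL itself is
    not subject to the extension filter.
    """
    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())
    host = urlparse(start_url).netloc  # assumption: stay on the same host
    seen, found, queue = {start_url}, [], [start_url]
    while queue:
        url = queue.pop()
        if not robots.can_fetch("*", url):
            continue  # disallowed by robots.txt: never accessed
        html = fetch(url)
        if html is None:
            continue
        parser = LinkParser()
        parser.feed(html)
        if "noindex" not in parser.directives:
            found.append(url)
        if "nofollow" in parser.directives:
            continue  # do not follow links found on this page
        for href in parser.links:
            link = urldefrag(urljoin(url, href)).url  # absolute, no #fragment
            parts = urlparse(link)
            if parts.netloc != host or link in seen:
                continue
            if extensions and not parts.path.endswith(tuple(extensions)):
                continue
            seen.add(link)
            queue.append(link)
    return sorted(found)
```

Passing `fetch` as a parameter keeps the sketch independent of any particular HTTP client; in practice it could wrap a real download function, and `robots_txt` would hold the text of the site's robots.txt file.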
| Utility | Consistency | Docs | Examples | Tests | Videos | Overall | Rank |
|---|---|---|---|---|---|---|---|
| Good (83.3%) | Perfect (100.0%) | - | Good (91.7%) | - | - | Sufficient (60.8%) | 600 |
Detailed information about this class is available on phpclasses.org.