Webmirror 2.04
Webmirror is a Perl script that mirrors a whole web site. An earlier version
of this program was published on these pages, but it did not function properly,
so it was completely rewritten. The new version can be driven from the command
line, compatibly with the old version, or by a retrieval definition file (RDF).
Creating an RDF is recommended, as many of the new features are available only
this way.
The major features of the program (detailed):
- Recursive retrieval of web pages that fit a domain.
- Follow relative and absolute URLs.
- Frame handling.
- Picture retrieval can be switched off.
- Retrieval of pictures from outside of the defined domain can be switched on.
- Retrieval domain can be defined to include and exclude patterns.
- Define maximum depth of web pages to follow.
- Define maximum total size of retrieval.
- Limit the size of a single file.
- Basic authentication support.
- Multiple proxy definitions.
- Multiple start pages.
- Configuration files can include each other.
- Automatic or manual configuration of local network card usage.
- Detailed log file generation.
- Generation of redirecting HTML pages for pages not retrieved.
- Define user agent reported to the server.
- Automatic default file name creation (usually index.html).
- Cookie support.
- Object-oriented development.
- Supports Windows NT, Windows 95 and UNIX operating systems.
- Licensed under the GNU GPL.
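
To illustrate how include and exclude patterns and a depth limit can together
define a retrieval domain, here is a minimal sketch in Python. It is not taken
from Webmirror (which is written in Perl) and does not use RDF syntax. The
function names, the in-memory link graph and the example URLs are all
hypothetical, and the sketch assumes that start pages count as depth 0.

```python
import re
from collections import deque


def in_domain(url, include, exclude):
    """Return True if url matches an include pattern and no exclude pattern."""
    if not any(re.search(p, url) for p in include):
        return False
    return not any(re.search(p, url) for p in exclude)


def plan_retrieval(start_urls, links, include, exclude, max_depth):
    """Walk a link graph breadth-first and return the URLs that would be fetched.

    `links` maps a URL to the URLs it links to; it stands in for the links
    a real mirroring tool would extract from downloaded pages.
    """
    seen = set()
    order = []
    queue = deque((url, 0) for url in start_urls)  # assumption: start pages are depth 0
    while queue:
        url, depth = queue.popleft()
        if url in seen or depth > max_depth:
            continue
        if not in_domain(url, include, exclude):
            continue
        seen.add(url)
        order.append(url)
        for target in links.get(url, []):
            queue.append((target, depth + 1))
    return order


# Hypothetical link graph used only for this illustration.
LINKS = {
    "http://example.com/": [
        "http://example.com/a.html",
        "http://other.example/x.html",
    ],
    "http://example.com/a.html": [
        "http://example.com/private/b.html",
        "http://example.com/c.html",
    ],
    "http://example.com/c.html": ["http://example.com/d.html"],
}

if __name__ == "__main__":
    for url in plan_retrieval(
        ["http://example.com/"],
        LINKS,
        include=[r"^http://example\.com/"],
        exclude=[r"/private/"],
        max_depth=2,
    ):
        print(url)
```

In this sketch the page on the other host is skipped because it matches no
include pattern, the page under /private/ is skipped by the exclude pattern,
and d.html is skipped because it lies deeper than the limit.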
The major features NOT implemented in the current version
but planned for future versions:
- JavaScript support.
- Support for robot exclusion standard.
- Multithreaded downloading.
- Domain update (retrieve only files that are newer than those from the last download).
- Limited support for POST operations.
- Limited support for the FTP protocol.
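
As a rough idea of what support for the robot exclusion standard involves, the
sketch below checks URLs against robots.txt rules using Python's standard
urllib.robotparser module. It says nothing about how Webmirror will implement
the feature. The robots.txt content, the user agent string and the URLs are
made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real tool would download it from the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_text):
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def allowed(parser, user_agent, url):
    """Return True if the rules permit user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    rules = build_parser(ROBOTS_TXT)
    for url in ("http://example.com/index.html",
                "http://example.com/private/a.html"):
        print(url, allowed(rules, "webmirror-sketch", url))
```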
What you can find in this package: