From: [email protected] (Steffen Beyer)
Newsgroups: comp.lang.perl.announce,comp.lang.perl.misc
Subject: P.S.: ANNOUNCE gen_tree vers. 2.1
Followup-To: comp.lang.perl.misc
Date: 25 Oct 1996 00:57:05 GMT
Organization: sd&m GmbH & Co. KG Munich, Germany
Reply-To: [email protected] (Steffen Beyer)
Dear Perl and WWW hackers,
some people complained that my recent announcement ("ANNOUNCE: gen_tree
version 2.1") forgot to say what the script actually does.
Of course they are right, and I apologize for that!
Here comes the missing part:
What does it do:
----------------
This script scans the tree (better: the directed graph) of HTML pages
of a web site. (It's not always a tree because cycles and loops are
possible!)
It starts at the home page of that site (called the "root page" here)
and follows all hyperlinks in a recursive descent (breadth first, in
order to produce the representation one would expect).
(You can also scan just a subtree of your web site if you want.)
Cycles and loops are recognized by identifying each page uniquely
through the device and inode numbers of its corresponding file.
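To illustrate the scanning step described above, here is a minimal sketch.
It is not code from gen_tree; it is written in Python purely for
illustration, and the names scan_site, file_id and HREF_RE as well as the
simple regular expression for finding links are assumptions of mine. It
walks local HTML files breadth first and uses the (device, inode) pair of
each file to notice pages it has already seen:

```python
import os
import re
from collections import deque

# Hypothetical, deliberately simple link pattern (not gen_tree's parser).
HREF_RE = re.compile(r'href\s*=\s*["\']([^"\'#?]+)', re.IGNORECASE)

def file_id(path):
    """Identify a file by (device, inode), so that different paths to the
    same file (e.g. through a symbolic link) compare equal."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)

def scan_site(root_page):
    """Breadth-first scan starting at root_page.
    Returns (depth, path) pairs in the order pages were first reached."""
    seen = {file_id(root_page)}
    queue = deque([(0, root_page)])
    found = []
    while queue:
        depth, page = queue.popleft()
        found.append((depth, page))
        with open(page, encoding="latin-1") as fh:
            text = fh.read()
        for href in HREF_RE.findall(text):
            if "://" in href or href.lower().startswith("mailto:"):
                continue  # external link, not part of the local site
            target = os.path.normpath(
                os.path.join(os.path.dirname(page), href))
            if not os.path.isfile(target):
                continue  # dangling link; a real tool would log this
            ident = file_id(target)
            if ident in seen:
                continue  # already visited: this is how cycles are cut
            seen.add(ident)
            queue.append((depth + 1, target))
    return found
```

Because the queue is processed first in, first out, every page is recorded
at the shallowest depth at which it can be reached from the root page.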
When scanning of the web site is complete, an HTML page is generated
which lists all the pages found, in the form of one hyperlink to each
of them.
The tree structure of the web site is reflected in this page by the
indentation of these hyperlinks.
The text which is displayed in these hyperlinks is extracted from the
<TITLE> ... </TITLE> tags inside the corresponding page.
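The output step can be sketched roughly as follows. Again this is only an
illustration in Python, not gen_tree's own code: the function names
page_title and render_index, the use of nested <UL> lists for the
indentation, and the dictionaries passed in are all assumptions of mine.

```python
import html
import re

TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>",
                      re.IGNORECASE | re.DOTALL)

def page_title(html_text, fallback):
    """Return the text between <TITLE> and </TITLE>, with whitespace
    collapsed, or fallback if the page has no usable title."""
    match = TITLE_RE.search(html_text)
    if not match:
        return fallback
    title = " ".join(html.unescape(match.group(1)).split())
    return title or fallback

def render_index(tree, titles, root, indent="  "):
    """tree maps a page to the list of pages first reached from it,
    titles maps a page to its title. The tree must be free of cycles
    (the scan is assumed to have removed them already)."""
    lines = ["<UL>"]

    def walk(page, depth):
        pad = indent * depth
        lines.append('%s<LI><A HREF="%s">%s</A>' % (
            pad,
            html.escape(page, quote=True),
            html.escape(titles.get(page, page))))
        children = tree.get(page, [])
        if children:
            lines.append(pad + "<UL>")
            for child in children:
                walk(child, depth + 1)
            lines.append(pad + "</UL>")

    walk(root, 1)
    lines.append("</UL>")
    return "\n".join(lines)
```

Each level of the site becomes one nested list, so the indentation of a
hyperlink mirrors the depth of its page below the root page.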
Supported features:
-------------------
This script is capable of executing server side includes and of analyzing
server side image maps.
This way, no important hyperlinks are missed. (Many home pages consist of
an image map and nothing else!)
It is also able to analyze CGI scripts simply by calling them and analyzing
their output. (Therefore, no HTTP server is needed!)
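For readers wondering how a CGI script can be analyzed without an HTTP
server, here is a rough sketch of the idea, written in Python and not taken
from gen_tree. The function name run_cgi and its parameters are
hypothetical; the environment variables are the usual CGI ones. The script
is simply started as a child process and its output is split into the
header block and the HTML body, which can then be searched for hyperlinks
like any other page:

```python
import os
import subprocess

def run_cgi(command, query_string="", timeout=30):
    """Run a CGI program directly and return (headers, body).
    command is an argument list, e.g. ["/path/to/script.cgi"]."""
    env = dict(os.environ,
               GATEWAY_INTERFACE="CGI/1.1",
               REQUEST_METHOD="GET",
               QUERY_STRING=query_string)
    proc = subprocess.run(command, env=env,
                          stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE,
                          timeout=timeout, check=True)
    # Normalize line ends, then split at the first blank line.
    raw = proc.stdout.replace(b"\r\n", b"\n")
    head, sep, body = raw.partition(b"\n\n")
    if not sep:
        # No header block found: treat everything as body.
        return {}, raw.decode("latin-1")
    headers = {}
    for line in head.decode("latin-1").splitlines():
        name, colon, value = line.partition(":")
        if colon:
            headers[name.strip().lower()] = value.strip()
    return headers, body.decode("latin-1")
```

A script that depends on form input or on a particular request will of
course need a suitable query_string to produce meaningful output.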
While the web site is being scanned, a detailed log file is written. Most
of the time, it's a good idea to read it because it lets you discover
flaws in your web site that often go unnoticed otherwise!
I hope this explains it!
Now for those who missed the first posting, here comes the other part again:
Where to get it:
----------------
You can download this script from the following address:
http://www.sdm.de/e/www/hilfe/gen_tree-2.1.tar.gz
Or download it from any CPAN (= "Comprehensive Perl Archive Network") ftp
server near you (see "The Perl 5 Module List" in news:comp.lang.perl.modules
for a list of CPAN ftp servers):
ftp://..../..../CPAN/authors/id/STBEY/gen_tree-2.1.tar.gz
What's new in version 2.1:
--------------------------
You now have finer control over which pages are shown and where (the
latter matters when several links lead to the same page). To this end you
specify the root page of any given subtree together with the number of
(topmost) levels of that subtree to be shown: "0" hides the subtree
completely, "1" shows only its root (topmost) page, "2" shows the root
page and the pages it contains hyperlinks to, and so on.
(The number limits the maximum depth of hyperlinks to follow.)
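The meaning of these level numbers can be sketched like this. The code is
a Python illustration only, not gen_tree's implementation; the function
name visible_pages and the limits dictionary are hypothetical. I also
assume here that when limits are nested, the stricter one wins, which the
description above does not spell out:

```python
def visible_pages(tree, root, limits):
    """Return (depth, page) pairs for the pages that remain visible.
    tree maps a page to its child pages; limits maps the root page of
    a subtree to the number of topmost levels of it to show."""
    shown = []

    def walk(page, depth, budget):
        own = limits.get(page)
        if own is not None:
            # Assumption: a nested limit can only tighten, never widen.
            budget = own if budget is None else min(budget, own)
        if budget is not None and budget <= 0:
            return  # "0" levels left: hide this page and all below it
        shown.append((depth, page))
        next_budget = None if budget is None else budget - 1
        for child in tree.get(page, []):
            walk(child, depth + 1, next_budget)

    walk(root, 0, None)
    return shown
```

With limits = {"docs": 1} only the page "docs" itself is listed from that
subtree; with {"docs": 2} the pages it links to appear as well.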
Thanks:
-------
Special thanks to Michael Bruns <[email protected]> for suggesting
the finer tuning capability for excluding pages and subtrees implemented in
version 2.1!
Yours,
--
Steffen Beyer ________________________ C:\ONGRATLN.W95 _______________________
mailto:[email protected] |s |d &|m | software design & management GmbH&Co.KG
phone: +49 89 63812-244 | | | | Thomas-Dehler-Str. 27
fax: +49 89 63812-150 | | | | 81737 Munich, Germany.