XWebcomber Documentation
XWebcomber is a search utility for the World Wide Web. It is not
designed to be a general-purpose search utility for the entire web;
that is better done with the available search engines such as Lycos
(http://lycos.cs.cmu.edu). XWebcomber is designed to search a
limited tree for a specific item. As an example, the webcomber will not
find every occurrence of "Pentium" on the net, but will allow you to
locate the Pentium-specific pages on the Intel Corp. web server.
It is a "personal" web agent and tries to be a good web citizen.
Usage
- Enter the starting point of the search in the "URL to start search:"
text box. This must be a complete URL.
- Enter the search items in the "Words to search for:" text box. The
webcomber will match any of the items in the list.
- Choose a depth for the search.
- Click on Search.
The webcomber will then begin a breadth-first search
of the tree rooted at the starting page provided. The depth of the
search will be the number of levels specified, with the root of the tree
being the first level.
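The traversal described above can be sketched as follows. This is a minimal illustration, not XWebcomber's actual code: the function name breadth_first_pages, the get_links callable, and the example.com URLs are all assumptions made for the example, and a toy in-memory link table stands in for real page fetches.

```python
from collections import deque


def breadth_first_pages(start, get_links, depth):
    """Return (url, level) pairs in breadth-first order.

    The starting page counts as level 1, so depth=1 visits only the root.
    get_links is a hypothetical callable: URL -> list of linked URLs.
    """
    if depth < 1:
        return []
    visited = {start}              # never queue the same page twice
    order = []
    queue = deque([(start, 1)])
    while queue:
        url, level = queue.popleft()
        order.append((url, level))
        if level == depth:
            continue               # deepest level reached: do not follow links
        for link in get_links(url):
            if link not in visited:
                visited.add(link)
                queue.append((link, level + 1))
    return order


# Toy link table (hypothetical URLs) used instead of fetching real pages.
LINKS = {
    "http://example.com/": ["http://example.com/a", "http://example.com/b"],
    "http://example.com/a": ["http://example.com/c", "http://example.com/"],
    "http://example.com/b": ["http://example.com/c"],
    "http://example.com/c": ["http://example.com/d"],
}


def toy_links(url):
    return LINKS.get(url, [])


if __name__ == "__main__":
    for url, level in breadth_first_pages("http://example.com/", toy_links, 2):
        print(level, url)
```

With a depth of 2, only the root and the pages it links to directly are visited. The link from /a back to the root is skipped because the root is already in the visited set.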
Once done, the webcomber will present a short
report of all the pages that were matched. The webcomber will also write
an HTML version of the search report, and will update an index to past
searches. These files can be found under the user's home directory, in the
webcomber subdirectory.
With any web browser you can load the webcomber-index.html file,
which lists every webcomber search, latest search first, giving its
starting point and a pointer to its list of matches. You should load
the index page with the Open File... command in your browser, and then
save the location to your bookmarks or hotlist.
Clicking on the search term in this page loads a second HTML page
with all the matches for this search, as HTML links. Next to each link is
the number of matches found on that page.
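A per-page match count like the one shown in the report could be computed roughly as below. This is only a sketch: the original does not say whether matching is case-sensitive or word-based, so case-insensitive substring matching is an assumption, and count_matches, build_report, and the sample pages are hypothetical names and data.

```python
def count_matches(text, words):
    """Count occurrences of any search word in a page's text.

    Assumption: case-insensitive substring matching.
    """
    lowered = text.lower()
    return sum(lowered.count(word.lower()) for word in words if word)


def build_report(pages, words):
    """Return (url, count) pairs for pages that match at least one word.

    pages is a hypothetical dict mapping URL -> page text.
    """
    report = []
    for url, text in pages.items():
        count = count_matches(text, words)
        if count > 0:
            report.append((url, count))
    return report


if __name__ == "__main__":
    sample = {
        "http://example.com/cpu": "Pentium and pentium Pro",
        "http://example.com/about": "Company history",
    }
    for url, count in build_report(sample, ["Pentium", "Pro"]):
        print(url, count)
```

A page is reported if any one of the search words occurs on it, which mirrors the "match any of the items" behavior described under Usage.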
A list of past starting points is maintained in the webcomber
window. Clicking the left button on a page name selects its URL as the
starting search point. Clicking the right button once a page name is
selected lets you delete that URL from the list. A dialog box will
ask for confirmation before the URL is deleted. The list of starting
points is maintained in the .history file located in the webcomber
directory.
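Maintaining such a list of starting points might look like the sketch below. The actual layout of the .history file is not documented here, so the one-URL-per-line format is purely an assumption, and the function names load_history, save_history, and delete_entry are hypothetical.

```python
import os


def load_history(path):
    """Read starting URLs from the history file (assumed: one URL per line)."""
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def save_history(path, urls):
    """Write the list of starting URLs back, one per line."""
    with open(path, "w", encoding="utf-8") as handle:
        for url in urls:
            handle.write(url + "\n")


def delete_entry(path, url):
    """Remove url from the history file; return True if it was present."""
    urls = load_history(path)
    remaining = [entry for entry in urls if entry != url]
    save_history(path, remaining)
    return len(remaining) != len(urls)
```

In the real program the deletion happens only after the confirmation dialog. Here the caller is expected to ask for confirmation before calling delete_entry.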
Being a Good Web Citizen
There is some debate on the automated searching of the web. Automated
searchers retrieve pages from servers faster than people do, thus eating
network bandwidth and server resources.
XWebcomber tries to be a good network citizen and minimize its impact
on net resources. This is done in several ways:
- XWebcomber only retrieves HTML
pages. It does not load images, video, sound, or other binary
data.
- For a given search XWebcomber will not load the same web page
more than once. Circular references are no problem.
- XWebcomber limits the depth of the search. The program will only
look a limited number of links away from the starting URL.
- Finally, XWebcomber sends the User-Agent and From header fields to the
web servers. If any web site is burdened by XWebcomber, the webmaster
can identify it and restrict its access.
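Two of the points above, identifying the client and skipping non-HTML data, can be illustrated with Python's standard urllib. This is a sketch under stated assumptions, not the program's own implementation: the agent string "XWebcomber-sketch/0.1", the contact address user@example.com, and the helper names build_request and is_html are all invented for the example.

```python
import urllib.request


def build_request(url, agent="XWebcomber-sketch/0.1", contact="user@example.com"):
    """Build a request that identifies the client and its operator.

    The agent string and contact address are placeholders (assumptions).
    """
    headers = {"User-Agent": agent, "From": contact}
    return urllib.request.Request(url, headers=headers)


def is_html(content_type):
    """Return True if a Content-Type header value denotes an HTML page."""
    # "text/html; charset=ISO-8859-1" -> "text/html"
    return content_type.split(";")[0].strip().lower() == "text/html"


def fetch_html(url):
    """Fetch a page, returning its body as bytes only if the server says it is HTML."""
    with urllib.request.urlopen(build_request(url)) as response:
        if not is_html(response.headers.get("Content-Type", "")):
            return None            # skip images, video, sound, other binary data
        return response.read()
```

Checking the Content-Type before reading the body avoids downloading binary data, and the From field gives a webmaster someone to contact before resorting to blocking the client.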
If you like Webcomber...
Webcomber is shareware. Right now, the price is some email telling us
that you like it.
There is a plan to do a Windows 95 version; we'll see.
XWebcomber was written by Aaron Michael Cohen
(rgut@aware.com).