Signpost  
 
How Signpost works

Signpost is a complex system comprising several integrated software components. Each component has an overall function, such as retrieving web page data, querying search resources, or, in the case of the main submission engine, submitting to the resources' various forms.

Each component bases its actions on data retrieved from Signpost's database, and in turn updates the database, thus modifying the actions of the other components.

Here are the main stages involved in the submission process:

Spidering the web page

One of the first things Signpost does is spider the web page of the submitted URL. There are several reasons for this. First, Signpost needs to determine whether the page has been updated since the previous spidering. It also needs to ensure that the page is still valid and returns no HTTP errors. Finally, as Signpost is also a search engine, it needs to index all the information it retrieves from the page.
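The first two checks can be sketched as follows. This is only an illustration, not Signpost's actual code: the names check_page and SpiderResult are hypothetical, and comparing a SHA-256 fingerprint of the page body is an assumed way of detecting an update (the page does not say how Signpost does it).

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpiderResult:
    valid: bool             # False when the fetch returned an HTTP error
    changed: bool           # True when the content differs from the previous spidering
    digest: Optional[str]   # fingerprint to store for the next comparison


def check_page(status: int, body: bytes, previous_digest: Optional[str]) -> SpiderResult:
    """Classify one fetched page (hypothetical helper, assumed logic)."""
    if status >= 400:
        # HTTP error: the page is not valid, so keep the old fingerprint.
        return SpiderResult(valid=False, changed=False, digest=previous_digest)
    digest = hashlib.sha256(body).hexdigest()
    return SpiderResult(valid=True, changed=digest != previous_digest, digest=digest)
```

A page seen for the first time has no previous fingerprint, so it is reported as changed and would go on to be indexed.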

Searching for the URL

If this is the initial submission run for a site, Signpost's search component searches for it at the target resource. If the site is found, Signpost does not submit it to that resource at this point. If, however, Signpost's spider subsequently discovers that the web page has been updated since its initial spidering, the page is submitted to the search engine spiders. Non-indexed submissions are searched for once per week, and indexed submissions once every two weeks.
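The decision and the search schedule can be put in a few lines. This is a simplified reading of the rules above, with hypothetical function names; it treats every resource alike, whereas the text says updated pages go specifically to the search engine spiders.

```python
from datetime import date, timedelta


def should_submit(found_at_resource: bool, page_updated: bool) -> bool:
    """Submit when the site is not listed yet, or when the page has changed."""
    return (not found_at_resource) or page_updated


def next_search_date(last_search: date, indexed: bool) -> date:
    """Indexed submissions are searched for every two weeks, others weekly."""
    return last_search + timedelta(weeks=2 if indexed else 1)
```

For example, a site that was found on the first run and has not changed since is left alone, and the next search for it is scheduled two weeks later.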

Making the submissions

When Signpost makes a submission, it is in effect mimicking the action of the resource's submission form. That is, it generates exactly the same data that the form generates when the 'Submit' button is pressed, even down to the HTTP headers. If you check out the submission forms at the resources Signpost submits to, you'll see how widely they differ: some require just a URL and e-mail address, while others ask for keywords and descriptions, as well as many hidden fields. Signpost is able to mimic them all, and can even detect and store any edit passwords which may be returned.
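As a rough illustration of what mimicking a form involves, the sketch below builds the POST request that a simple form would produce, using Python's standard urllib. The function name, field names and URL are made up for the example, and the request is only constructed, never sent. A real form's field order and encoding would also have to be matched, which this sketch does not attempt.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_submission(action_url: str, visible: dict, hidden: dict, headers: dict) -> Request:
    """Build (but do not send) the request a form's Submit button would produce."""
    # Hidden fields travel with the visible ones in the same request body.
    fields = {**hidden, **visible}
    body = urlencode(fields).encode("ascii")
    all_headers = {"Content-Type": "application/x-www-form-urlencoded", **headers}
    return Request(action_url, data=body, headers=all_headers, method="POST")


# Hypothetical resource and field names, for illustration only.
request = build_submission(
    "http://search.example.com/add-url",
    visible={"url": "http://www.example.com/", "email": "webmaster@example.com"},
    hidden={"form_id": "add-url"},
    headers={"Referer": "http://search.example.com/add-url.html"},
)
```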

Handling the responses

To determine whether a submission has succeeded, Signpost has to do what you yourself would do had you submitted your site manually: it reads the response. It can detect successful submissions, duplicate submissions, and of course failed submissions (in which case it stores the response for administration purposes).
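One simple way to read a response is to look for telltale phrases in it. The sketch below assumes that approach; the marker phrases and function names are invented for the example, and in practice each resource would need its own set.

```python
from enum import Enum


class Outcome(Enum):
    SUCCESS = "success"
    DUPLICATE = "duplicate"
    FAILED = "failed"


# Hypothetical phrases; each resource would need its own markers.
DUPLICATE_MARKERS = ("already been submitted", "already in our index")
SUCCESS_MARKERS = ("thank you", "has been added")

failed_responses = []  # failed responses are kept for the administrators


def classify_response(status: int, text: str) -> Outcome:
    lowered = text.lower()
    if status >= 400:
        return Outcome.FAILED
    # Check duplicates first: a duplicate notice may also say "thank you".
    if any(marker in lowered for marker in DUPLICATE_MARKERS):
        return Outcome.DUPLICATE
    if any(marker in lowered for marker in SUCCESS_MARKERS):
        return Outcome.SUCCESS
    return Outcome.FAILED


def handle_response(status: int, text: str) -> Outcome:
    outcome = classify_response(status, text)
    if outcome is Outcome.FAILED:
        failed_responses.append(text)
    return outcome
```

Anything that matches neither set of markers is treated as a failure, so an unfamiliar response ends up in front of an administrator rather than being counted as a success.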

Maintenance

Any software system which interacts closely with other independent systems will always require a high level of maintenance, and Signpost is no exception. Search engines can change their submission forms and CGI scripts at any time, and Signpost's database needs to be updated as soon as possible to reflect those changes.

When any of Signpost's components is active, it creates comprehensive log files recording its actions (or those of the search resources) in easy-to-read HTML format. This allows us to make any required amendments quickly and easily. Signpost also does some maintenance itself by checking the submission forms of the various resources before submitting to them. If it detects changes or problems, it disables the resource for that submission run and sends an e-mail to the administrators alerting them to the fact.
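The form check can be pictured as comparing the fields a form has now with the fields recorded for it earlier. The sketch below uses Python's standard html.parser; the helper names and the idea of comparing field-name sets are assumptions made for the example, not a description of Signpost's internals.

```python
from html.parser import HTMLParser


class FieldCollector(HTMLParser):
    """Collect the names of the form fields found in a page."""

    def __init__(self):
        super().__init__()
        self.fields = set()

    def handle_starttag(self, tag, attrs):
        if tag in ("input", "textarea", "select"):
            name = dict(attrs).get("name")
            if name:  # unnamed controls (e.g. a plain Submit button) send no data
                self.fields.add(name)


def form_fields(html: str) -> set:
    collector = FieldCollector()
    collector.feed(html)
    return collector.fields


def form_changed(html: str, expected_fields: set) -> bool:
    """True when the live form no longer matches the stored field names."""
    return form_fields(html) != set(expected_fields)
```

When form_changed returns True, the resource would be skipped for that run and the administrators notified.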

   