Why design for spiders?
To understand why you need to design your pages to work with spiders, you
should understand a little about how search engines work. When you run a search, the search engine software has to
determine which of its indexed pages are relevant to your query. It also has to list them in order of
relevance, so that those at the beginning of the list are more relevant than those at the end. To do this it
needs data from your web page (the informational content, not the HTML markup), and it is the spider that provides
this data. In general a spider retrieves words from the title tags, the body, image alt text, and meta-tags, and
indexes them, recording exactly where in the page they were found. This positional information (together with other
data) determines the results of searches.
If the spider cannot find appropriate information on your web page, this directly
affects whether your page is returned as the result of a search.
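As a rough illustration of the kind of data a spider gathers, here is a minimal Python sketch built on the standard html.parser module. It is not how any particular search engine works: the class name SpiderSketch and the (word, location, position) record layout are invented for this example.

```python
from html.parser import HTMLParser


class SpiderSketch(HTMLParser):
    """Hypothetical spider: collects (word, location, position) records."""

    def __init__(self):
        super().__init__()
        self.index = []        # list of (word, location, position)
        self._location = None  # "title" or "body" while inside that tag
        self._position = 0     # running word count across the whole page

    def _add_words(self, text, location):
        for raw in text.lower().split():
            word = raw.strip(".,:;!?")
            if word:
                self.index.append((word, location, self._position))
                self._position += 1

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("title", "body"):
            self._location = tag
        elif tag == "meta":
            # Only the description and keywords meta-tags are indexed here.
            if (attrs.get("name") or "").lower() in ("description", "keywords"):
                self._add_words(attrs.get("content") or "", "meta")
        elif tag == "img" and attrs.get("alt"):
            self._add_words(attrs["alt"], "alt")

    def handle_endtag(self, tag):
        if tag in ("title", "body"):
            self._location = None

    def handle_data(self, data):
        # Text is indexed only while inside the title or body tags.
        if self._location:
            self._add_words(data, self._location)


spider = SpiderSketch()
spider.feed("<title>Stamp Collecting</title><body>British stamps</body>")
spider.close()
print(spider.index)
```

Each record keeps the word together with the part of the page it came from, which is the positional information the ranking step relies on.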
What the search engine does
The search engine software uses a ranking method to decide how relevant a web page
is to a given search. This is usually based on both the location of a word in the page and its
relative frequency. However, each search engine uses its own variant of the basic 'location/frequency' method,
often incorporating other factors which can influence the results. This is one of the reasons why the same search term
can return different results at different search engines.
Simply put, the location/frequency method assumes that if
your web page is about stamp collecting, the words 'stamp' and 'collecting' will appear near the top of the page.
It also assumes that those words will appear several times in the document.
This is what the method attempts to measure. It assigns each keyword a value or 'weighting' that reflects its
relevance to the search term, taking into account both where the keyword appears and how often it appears (relative
to the total number of words). The resulting total is used to order the returned list of URLs. Pages with the highest
weighting appear first.
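Each engine's real formula is its own, so the following Python sketch only illustrates the idea under assumed numbers. The function names, the 'first 10 words' cutoff, and the double weighting for early words are all invented for the example.

```python
def location_frequency_score(words, keyword, top_n=10, top_bonus=2.0):
    """Score a page for one keyword.

    words: the page's words in order of appearance.
    A hit among the first top_n words counts top_bonus times (assumed values).
    The weighted hit count is divided by the page length, so the result
    reflects relative frequency rather than a raw count.
    """
    if not words:
        return 0.0
    keyword = keyword.lower()
    weighted_hits = sum(
        top_bonus if position < top_n else 1.0
        for position, word in enumerate(words)
        if word.lower() == keyword
    )
    return weighted_hits / len(words)


def rank_pages(pages, keyword):
    """pages: dict mapping URL to word list. Highest weighting first."""
    return sorted(
        pages,
        key=lambda url: location_frequency_score(pages[url], keyword),
        reverse=True,
    )
```

A short page that opens with the keyword will outrank a longer page that mentions it once near the end, which matches the behaviour described above.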
It is vital to understand that it is the relative frequency of the keyword which is
important. A keyword appearing once in a web page which has 20 words in total represents 5% of those words, whereas
if the page contained 200 words, the keyword would represent just 0.5%. It follows, then, that adding more words
to a page simply 'dilutes' the relevancy of the existing ones. Most search engines now also check for 'keyword spamming'.
This is where a keyword is repeated many times in a web page in the belief that it will boost its relevancy ranking.
It won't. In fact the overall ranking will be reduced as a penalty for attempting to cheat.
See the keywords section below for more information.
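The arithmetic behind the 5% and 0.5% figures can be checked with a few lines of Python. The function name and the 'filler' words are made up for the illustration.

```python
def relative_frequency_percent(words, keyword):
    """Return the keyword's share of all words on the page, as a percentage."""
    if not words:
        return 0.0
    hits = sum(1 for word in words if word.lower() == keyword.lower())
    return 100.0 * hits / len(words)


short_page = ["stamp"] + ["filler"] * 19   # 20 words, keyword appears once
long_page = ["stamp"] + ["filler"] * 199   # 200 words, keyword appears once

print(relative_frequency_percent(short_page, "stamp"))  # 5.0
print(relative_frequency_percent(long_page, "stamp"))   # 0.5
```

The keyword count is the same in both pages; only the total number of words has changed.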
You can assess the results of the location/frequency algorithm by using Signpost's own
search engine.
There are various options to limit the search to particular sections of a web page. For instance, typing in the following:
title: business
would perform the location/frequency method only on words within the title tags of each document.
General Tips
If you use frames...
You need to bear in mind that most search engine spiders will only 'see' your master frames document
(the one with the frameset tags), so you need to ensure that spiders will find what they are looking
for in it.
This includes a good title, meta-tags, and a descriptive paragraph (this should be placed
within the 'body' tags of the 'noframes' section of your master frames document).
All the points below
also apply to master frames documents.
Use title tags to full effect
We talked about the location of a word on a web page as being very important. As the title tags are
situated at the top of a web page, you must use them effectively. Here is an example of a poor title:
<title>Home Page</title>
and an effective one:
<title>The Movie Site: The Latest Movie News, Previews, Merchandise, and Rumours</title>
Notice that the first uses two of the commonest words found on web pages, 'Home' and 'Page'. They are so common,
in fact, that most search engines will ignore them as single search words (stop words), and searching for the phrase
"home page" will return thousands or even millions of irrelevant hits.
The second example is not only descriptive, it also contains keywords relevant to the theme of the web page.
Don't try to 'stuff' hundreds of keywords between the title tags
in the hope of increasing the number of search hits. This will simply decrease the value of all of your
keywords. Even worse is an empty title tag. If a page with an empty title tag is returned as the result
of a search, its link will usually be displayed as 'No Title', which is unlikely to induce anyone to follow it.
Ensure the first paragraph on the page is descriptive and contains appropriate keywords
Some spiders will only index the first few hundred words on a page. Coupled with the keyword
location factor, this makes it vital that you make good use of this paragraph.
Choose your keywords wisely
See the keywords section, or use the help facility (Design guide) for more information.
Try to keep the content of your page strongly themed
If your page contains many diverse topics it will consequently contain more words. In effect this
means it will be competing against many more web pages than if its content were centred on a single theme.
For example, if the theme of a web page is 'stamp collecting', it will be competing against
all other stamp collecting pages. If, however, the theme is 'collecting British stamps', then the
competition will be limited to pages about British stamps. This of course assumes that the keywords
chosen for the page reflect the theme. See the keywords section for more information.
Always use meta-tags
Meta-tags give spiders descriptive information about an HTML page.
See the meta-tags section.
Always use alt tags
Most spiders will index alt text (the alt attribute of an image tag), so make sure each image on your page has a descriptive one. The format is:
<img src="penny_black.jpg" width="150" height="200" alt="Penny Black photograph">
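If you want to check a page for images without alt text, a small script can do it. The following Python sketch uses the standard html.parser module. The names MissingAltFinder and images_missing_alt are invented here, and treating an empty alt="" as missing is an assumption made for this example.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Records the src of every img tag that has no alt text."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attrs = dict(attrs)
        # Assumption: an absent, empty, or whitespace-only alt counts as missing.
        if not (attrs.get("alt") or "").strip():
            self.missing.append(attrs.get("src"))


def images_missing_alt(html):
    """Return the src values of images in html that lack alt text."""
    finder = MissingAltFinder()
    finder.feed(html)
    finder.close()
    return finder.missing


page = (
    '<img src="penny_black.jpg" width="150" height="200" '
    'alt="Penny Black photograph">'
    '<img src="logo.gif">'
)
print(images_missing_alt(page))  # ['logo.gif']
```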
Meta-tags And Their Use
Meta-tags provide a mechanism for supplying information about a page without affecting its appearance.
There are many types of meta-tag but it's the 'description' and 'keywords' tags which are important for search engines.
These are the two tags used to summarize what your site is about (description) and what keywords are relevant to search
queries. Most of the major search engines index meta-tags, although what weight they give them varies from engine to
engine.
Meta-tags should be placed within the 'head' tags of your web page and should have the following format:
<META NAME="description" CONTENT="Everything you wanted to know about subatomic physics.">
<META NAME="keywords" CONTENT="subatomic, physics, lepton, hadron, boson, quark, particle, accelerator">
These tags are important because they offer a means of supplying search engine spiders with relevant keywords when, for one
reason or another, those keywords don't appear elsewhere on a page. For example, if the home page of your web site contains
only a graphical link and no plain text, then meta-tags will be the only place where the spider will find any meaningful
text.
They are also useful because they give you some control over how your page will be displayed in search engine
listings. The content of the description meta-tag will be displayed as the description of the page. Without the
description meta-tag the search engine will display the first few lines of text that the spider found on your page,
and this can often be inappropriate.
Some search engines may include meta-tags in their relevancy algorithms, giving
a boost to pages which contain them. However, as useful as meta-tags are, don't assume they guarantee a top-ten listing.
That still depends mainly on the overall design of your web page, and how well you've chosen your keywords.
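The way a listing description is chosen can be sketched in Python as follows. This is an illustration only: the names ListingParser and listing_description and the 25-word fallback length are assumptions, and real engines differ in how much page text they show.

```python
from html.parser import HTMLParser


class ListingParser(HTMLParser):
    """Collects the description meta-tag and the words of the page body."""

    def __init__(self):
        super().__init__()
        self.description = None
        self.body_words = []
        self._in_body = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "body":
            self._in_body = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content")

    def handle_endtag(self, tag):
        if tag == "body":
            self._in_body = False

    def handle_data(self, data):
        if self._in_body:
            self.body_words.extend(data.split())


def listing_description(html, fallback_words=25):
    """Use the description meta-tag if present, else the first body words."""
    parser = ListingParser()
    parser.feed(html)
    parser.close()
    if parser.description:
        return parser.description
    return " ".join(parser.body_words[:fallback_words])
```

With the description meta-tag in place the function returns your own wording. Without it, the listing falls back to whatever text happens to come first on the page.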
Effective Use of Keywords
The keywords you choose to appear on your web page will determine how easy it is to find in search engines, and
consequently how successful your page will be. You need to ask the following question:
When someone searches for a site like mine, what would they type into a search engine?
The answer obviously depends on the theme of the web page and requires some imagination. If your page is about
stamp collecting then the phrase "stamp collecting" might be a suitable choice for your 'strategic keywords'.
However, if your page is about collecting British stamps in particular then "british stamp collecting"
would be a better choice.
Then if someone typed in either "stamp collecting" or "british stamp collecting", your page
would appear in both sets of search results. The difference is that the first term would return a lot more hits than
the second one, and therefore a lot more competition. Running these actual searches at
AltaVista returned 22,778 pages for the search term "stamp collecting", and just 5 for the term
"british stamp collecting".
It follows that your strategic keywords should comprise at least two words to be most effective. You should try to
place them in the title tags, the first paragraph on your page, the meta-tags (see the meta-tags
section), and elsewhere in the rest of the page.
Don't overuse them, though, or the search engines might consider it spamming
and reduce your page's relevancy ranking.
In general, if your text reads sensibly and doesn't seem stilted or contrived, then it should be OK.
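As a rough self-check of keyword placement, you could run something like the following Python sketch over your own page text. The function names and the 10% density threshold are invented for the example. No search engine's actual spam limit is stated in this guide, so treat the threshold as a placeholder.

```python
def _words(text):
    """Lower-case the text and split it into words, dropping edge punctuation."""
    stripped = (word.strip(".,:;!?\"'()") for word in text.lower().split())
    return [word for word in stripped if word]


def count_phrase(words, phrase_words):
    """Count how many times phrase_words occurs as a run inside words."""
    n = len(phrase_words)
    if n == 0:
        return 0
    return sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == phrase_words)


def keyword_report(phrase, title, first_paragraph, meta_keywords, body_text,
                   max_density=0.1):
    """Report where the phrase appears and how much of the body it takes up.

    max_density is an arbitrary placeholder, not a documented limit.
    """
    phrase_words = _words(phrase)
    body_words = _words(body_text)
    hits = count_phrase(body_words, phrase_words)
    density = hits * len(phrase_words) / len(body_words) if body_words else 0.0
    return {
        "in_title": count_phrase(_words(title), phrase_words) > 0,
        "in_first_paragraph": count_phrase(_words(first_paragraph), phrase_words) > 0,
        "in_meta_keywords": count_phrase(_words(meta_keywords), phrase_words) > 0,
        "density": density,
        "possible_spam": density > max_density,
    }
```

A report with every placement present and a low density is no guarantee of a good ranking, but a missing placement or a very high density is worth a second look.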