When
you're building a new website or completely renovating an
old one, it's important to create your design in a search
engine friendly way. The choices you make are going to be
with you for a long time and errors will be very time-consuming
to repair at a later stage.
In other parts of this site, we've looked at how to make
individual pages rank well. Now, let's focus on website
optimization and examine your site as a whole. We'll go
over the design techniques and principles that the search
engines like, but we'll also take a brief glimpse at some
potential pitfalls. Welcome aboard, I hope you enjoy the
trip!
Use as much text as possible
When the World Wide Web was born in the early 1990s, it
was mainly a text-based medium. Sounds, images and complex
animations were either very rare or completely unheard of.
Not surprisingly, the first major search engines that came
around a couple of years later were built to classify and rank
WWW pages largely based on textual content. After all, the
WWW consisted of text and would continue to do so for the
foreseeable future, right?
Towards the late 1990s, the web had started to change.
Although the role of text was still very important, it was
now common for web pages to contain large images, Flash
animations and other bells and whistles. However, due to
numerous technical difficulties, the search engines were
unable to widen their reach beyond the world of text. While
search engines that specifically search for images have
been created, general-purpose engines still mostly ignore
everything that is not in text.
The moral of the story is, unless your pages are built to
contain a lot of text, they're unlikely to do well in most
search engines. This doesn't mean that you should drop all
the images from your website, but keep in mind that as far
as the search engines are concerned, images, Flash animation
and sounds do not exist.
Keep non-HTML code in external files
Many of today's sites use JavaScript, CSS, or both in their
designs. Some of them have quite a lot of code in these
languages on each of their pages and have placed it above
the HTML containing the text used on the page. In terms
of website optimization, this is a bad idea.
First of all, it forces the spider to wade through something
that it is not at all interested in before being able to
read the text. While modern spiders are probably quite well-accustomed
to such unfriendly pages, it's safe to say that filling
your pages with non-HTML code is more likely to hurt than
to help you.
Second, the less the search engine knows what kind of CSS
and JavaScript you use, the better. If your code is attached
to the HTML, search engine spiders can freely read and analyze
it if they want to. On the other hand, if you place your
code in external files and use a robots.txt file to forbid
search engines from downloading them, your code is fairly
secure. Of course the search engines could still get it
if they wanted to, but then they would have to both disobey
your robots.txt and grab the .css or .js file, both things
that they're unlikely to do.
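As a rough sketch of the robots.txt idea above, the helper below builds the rules that ask spiders to skip the directories holding your external files. The directory names /css/ and /js/ and both function names are my own assumptions for illustration, so adjust them to match your site.

```javascript
// Hypothetical directory names; use wherever your site keeps its external files.
const blockedDirs = ["/css/", "/js/"];

// Build the text of a robots.txt file that asks all spiders
// to stay out of the given directories.
function buildRobotsTxt(dirs) {
  const lines = ["User-agent: *"];
  for (const dir of dirs) {
    lines.push("Disallow: " + dir);
  }
  return lines.join("\n") + "\n";
}

// A well-behaved spider compares each URL path against the Disallow prefixes.
function isDisallowed(path, dirs) {
  return dirs.some((dir) => path.startsWith(dir));
}

console.log(buildRobotsTxt(blockedDirs));
```

Save the printed text as robots.txt in the root directory of your site. Keep in mind that it is only a request: as noted above, nothing physically stops a spider from ignoring it.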
But why would you want to keep your CSS and JavaScript away
from the eyes of the search engines if you're not doing
anything wrong? Well, the problem is that search engines
define what is acceptable and what is not, and it often
seems like they have a lot of trouble making up their minds.
For example, using a JavaScript redirect is occasionally
"OK, if you have a legitimate reason for doing it"
and occasionally "spamming, and we'll skin you from
head to toe if we catch you". The point is that it's
better to be safe than sorry, because the rules change all
the time.
Frames or tables - or CSS?
The layout of your website and the way it is created is
another factor that can either boost or reduce your search
engine success. Here at the APG site, I've decided to use
a table-based layout, which is usually considered something
both human visitors and search engines can appreciate. However,
it is not the only method available and all of them have
their pros and cons.
Tables
Search engines generally don't have any trouble reading
a table-based page, provided that the layout is not overly
complex or incorrectly designed. The only serious problem
arises if you wish to have a navigation menu on the left
side of the screen, just like I do. Placing the menu on
the left causes its contents to be displayed above the rest
of the content on the page in your source code. Humans won't
mind that, but because search engines read your source
code rather than what you see on the screen, this kind of
arrangement may damage your ranking in them.
You see, most search engines consider the text at the very
top of the page to be more important than the text in the
middle. This sounds a bit odd, but it's actually a very
reasonable assumption. Take a look at some of the pages
on this site for example; if you begin reading from the
top, it won't take long before you've got a general idea
about the contents of the page. But if you start from the
middle, it will take on average substantially longer to
determine what subject is being discussed.
So, if your menu pushes the actual content of your page
downwards in your source code, the search engine will have
difficulty determining what your page is about, which might
cause your ranking to drop. Fortunately, there is
a solution to this problem that allows you to use tables,
keep your menu on the left and please the search engines
at the same time.
Frames
Some like them, some hate them. Think of them what you will,
but generally frames are not as search engine friendly as
tables. That is not to say that it's impossible to build
a site that uses frames and does well in the engines; it
is just harder to do than with tables.
If you already have a site that uses frames, or if you are
simply determined to use them, it would be a good idea to implement
a few website optimization tricks to prevent some of the
most common problems.
To begin with, use a <NOFRAMES> tag on your frameset
page. In it, include a simplified version (fewer graphics, no
Flash, no JavaScript, etc.) of the content page your frameset
points to, along with links to all of your other content pages.
By having a good NOFRAMES tag, you'll make it easier for
the search engines that can't read framesets to index your
pages. As an added bonus, the NOFRAMES tag enables those
who are using browsers that can't read frames to access
your site.
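To give an idea of what such a NOFRAMES block might contain, here is a minimal sketch that generates one from a list of pages. The page list, the summary text and the function names are all hypothetical, and the markup it produces is only one reasonable shape for the fallback content.

```javascript
// Hypothetical page list; replace with your own content pages.
const pages = [
  { href: "home.html", title: "Home" },
  { href: "products.html", title: "Products" },
];

// Escape characters that have a special meaning in HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build a NOFRAMES block with a short text summary
// and a plain text link to every content page.
function buildNoframes(summary, pageList) {
  const links = pageList
    .map((p) => '<li><a href="' + escapeHtml(p.href) + '">' +
      escapeHtml(p.title) + "</a></li>")
    .join("\n");
  return "<noframes>\n<body>\n<p>" + escapeHtml(summary) + "</p>\n<ul>\n" +
    links + "\n</ul>\n</body>\n</noframes>";
}

console.log(buildNoframes("Welcome to our site.", pages));
```

Paste the printed markup into your frameset page just before the closing </FRAMESET> tag. In practice, the summary paragraph should be replaced with a real, text-rich version of your main content page.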
However, there's another serious problem caused by frames
that can't be solved with the NOFRAMES tag. Usually, a typical
design that uses frames has the site navigation in one frame
and the content in another. After submitting your content
pages to the search engines, they will eventually be indexed
and hopefully start receiving visitors. The trouble is that
when someone arrives directly at one of the content pages,
the navigation frame will not load. This can deter visitors
from venturing further into your site and thus reduce the
usefulness of the traffic sent to you by the search engines.
While this is a difficult situation, there are things you
can do to correct it. The simplest of them is to add
the following JavaScript to all of your content pages:
<script type="text/javascript">
<!--
// If this page was opened outside the frameset, load the frameset instead.
if (top == self) location.replace("FILENAME OF YOUR FRAMESET PAGE");
// -->
</script>
As long as you remember to place the name of your frameset
page into the script, you can get it to work simply by copying
and pasting it between the <HEAD> and </HEAD>
tags in your HTML. However, as mentioned above, it would
be best to spend some extra time and place the script in
an external file instead.
So, what will the script do? Quite simply, it'll check whether
the frameset is loaded and if not, it will load it. This
will give the visitors who arrive directly at your content
pages the opportunity to see your navigation menu and thus
browse your site. Sounds great, right?
Unfortunately, the script is not as good as it seems. If
you point it to your entry frameset page, you'll notice
that while it loads the navigation, it will also load your
homepage. You've given the visitor a way to navigate
your site, but in exchange, you're redirecting them to a page
that might be completely different from the one they found
in the search engine. In my opinion, this is better than
doing nothing, but it is still a very unsatisfactory solution.
Luckily, there are some more refined ways of handling
the issue with JavaScript. They'll require a bit more
effort and skill, but can deliver both the navigation menu
and the correct page to the user at the same time. While
these scripts have their own problems, such as not being
100% valid HTML code, they're far superior to any other
solutions I've seen. So, if you're using frames and want
to offer a satisfying experience to those of your users
who arrive through the search engines, using them instead
of that simple script I showed you is really the way to
go.
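One common shape for such a refined script, sketched here under my own assumptions, is to pass the name of the requested page to the frameset in the query string and let the frameset page load it into the content frame. The file name index.html, the parameter name "page", the frame name "content" and both function names are hypothetical, and the sketch assumes all pages sit in the same directory.

```javascript
// Hypothetical names: "index.html" is the frameset page
// and "page" is the query string parameter.
const FRAMESET_PAGE = "index.html";
const PARAM = "page";

// Used on a content page: build the frameset URL that
// remembers which content page was requested.
function framesetUrlFor(contentPage) {
  return FRAMESET_PAGE + "?" + PARAM + "=" + encodeURIComponent(contentPage);
}

// Used on the frameset page: pick the content page from the query string.
// Only plain local .html file names are accepted, so the parameter
// cannot be abused to load another site into your frameset.
function contentPageFromQuery(search, defaultPage) {
  const match = new RegExp("[?&]" + PARAM + "=([^&]*)").exec(search);
  if (!match) return defaultPage;
  let requested;
  try {
    requested = decodeURIComponent(match[1]);
  } catch (e) {
    return defaultPage; // malformed escape sequence in the URL
  }
  return /^[A-Za-z0-9_-]+\.html$/.test(requested) ? requested : defaultPage;
}

// In a browser, the two halves would be wired up roughly like this:
//   content page:  if (top == self) location.replace(framesetUrlFor("products.html"));
//   frameset page: once loaded, frames["content"].location.replace(
//                    contentPageFromQuery(location.search, "home.html"));
```

The check on the file name is the important design choice here: without it, anyone could craft a link that shows a foreign page inside your navigation.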
To sum it up, by implementing the above suggestions, you
can create frame-based sites that get along with search
engines a lot better than they would normally do. They won't
be perfect, but what in this world really is?
Cascading Style Sheets
Search engine-wise, using CSS to create your layout is probably
the best possible solution. In addition to being more flexible
than frames and tables, CSS also gives you the possibility
to easily arrange your source code. This is a helpful ability,
because you can use it to ensure that the spiders always
read the most important and well-optimized content on the
page first without having to make changes to the layout
itself.
Even though it has many excellent properties, it feels like
a CSS layout is a bit ahead of its time at the moment. While
it is completely possible to implement, it will cause problems
with older browsers, for example with Netscape Navigator
4. CSS is likely to ultimately become the layout method
of choice, but for now it is still better to stick with
tables.
Avoid non-HTML filetypes
Due to the great success of Adobe's Acrobat and Microsoft's
Word and Excel, many sites now make parts of their content
available in files created with these programs. While this
may be the fastest and easiest way to post content on the
Web, it can make getting your information listed on the
search engines very difficult.
Although the search engines are continuously becoming better
at finding and indexing information, most
of them can't read .PDF (Acrobat), .DOC (Word) or .XLS (Excel)
files. Google is ahead of the rest in this area, as it supports
all of these filetypes. Another major player, FAST, is able
to index PDFs, but not Word or Excel documents. If you
want your file to be found on the rest of the engines, you're
going to have to stick with HTML.
However, it must also be noted that even plain old HTML
pages may cause trouble with search engines if they are
generated dynamically, for example with a CGI script. There
are several good ways of taking care of these problems without
having to sacrifice the flexibility of generating HTML dynamically,
but it's important to be aware that they do exist.
Conclusion
In order to get your pages listed in the search engines
and get them to rank well, you'll have to do more than just
add META tags and get a couple of links to point to your
site. By designing and constructing your site correctly,
you're building a solid foundation on which it is possible
to apply various optimization techniques in the future.
Changing an existing site structure to one that works better
with the search engines can feel like a large task, and
it often is one. However, if you're planning to make improvements,
it's better to start your website optimization project as
quickly as possible. Sites tend to become larger and more
complex with age, so the job is unlikely to get any smaller
as time passes.