If your business has been around for a while and has a Web site, your site is overwhelmingly likely to be indexed by all the leading search engines. It is rare for corporate sites to be completely missing from search indexes, although it is quite common for individual pages from the site to be missing. (We show you how to figure that out later in this chapter.)
One way to tell whether your site is indexed is to search for it and see whether it is found. (Yeah, we figured you thought of that one.) If your company has a common name (AAA Plumbing), you might want to search for more than just the name ("aaa plumbing syracuse"). It's common for site owners to panic when their sites are not shown by the search engine for these navigational queries for company names. It's easy to jump to the conclusion that the entire site is not indexed, but that is rarely the case.
You can also use a search toolbar in your browser to check whether your pages are indexed. If you use the Google toolbar (or one for another search engine), you can navigate to your home page and take a look at the toolbar; most toolbars indicate in some way that the page is indexed. Figure 10-1 shows how Google's toolbar does it. You can see whether your site is indexed by that toolbar's search engine, although it does not help you figure out whether your page is indexed by other search engines. (We show you how to do that later in the chapter.)
Nearly all corporate sites have their home pages (and at least a few other pages) in the leading search indexes, but if somehow you do not, read on. It is unlikely but possible that your site is missing in action; it has happened, even to relatively large companies. Or perhaps your search marketing program's scope does not cover your whole company, but all the pages within your scope are missing from search engines. In that case, you need to ask a few questions:
Is your site banned by the search engines? Search engines have very specific rules for being included in their index. Sites that violate these rules might find all of their pages removed from the search index.
Is the spider visiting your site? Your pages cannot be indexed if the spider never comes. Check to make sure.
Are other sites linking to yours? Spiders find your site by following links from other sites, so you must verify that your site is linked into the larger Web.
If you can find at least a few pages of your site when you perform searches in the major search engines, you can skip ahead to the next section to determine how many pages you have indexed. If your site is not found at all, however, you can explore these questions to solve the mystery of the missing site.
The most difficult situation occurs when one or more search engines have banned or penalized your site. If your site is well represented in some search engines, but completely missing in others, your site might be banned. Sites are banned when search engines detect that those sites are trying to "fool" the search engine to rank that site's pages more highly than they deserve.
You might have used spam techniques that try to fool the search engine, or you might have unwittingly violated a search engine's guidelines. (You can look at Google's guidelines at [...] Chapter 15, "Make Search Marketing Operational"), and you should investigate further if you see the following:
[...] In Chapter 13, "Attract Links to Your Site," we show you easy ways to check the number of links to your site that are stored by each search engine.
Your home page can be found only by a direct search on the URL; informational queries for words on the page do not seem to work anymore.
If you suspect a problem, you first need to diagnose the cause. In the next section, we discuss a spam technique called cloaking. We cover doorway pages and other stupid content tricks in Chapter 12, "Optimize Your Content," and link farms in Chapter 13, "Attract Links to Your Site." These are the most common spam techniques. If your site has been banned or penalized for using these techniques, you can clean up your site and request reinstatement, which is usually granted (although reinstatement sometimes requires an extended period of explanation and begging).
Make Sure the Spider Is Visiting
If the spiders are not coming to your site, your pages cannot be indexed and your site will not be found by organic searchers. Your Webmaster can help you check your Web servers' log files to see which search spiders have been visiting your site. (Most Web servers are configured to log search spider visits, but some servers might need to be adjusted by your Webmaster to capture this important information.)
[...] user agent that accessed the page noted, which indicates the software that was used to see the page.
Figure 10-2. Spotting spider activity. Carefully examining your log files can prove that spiders are visiting your site.
When spammers use cloaking or IP delivery, they use a high-tech version of the old "bait-and-switch" scam. Here's how it works. The spammer sets up a URL served dynamically by a program and waits for someone to request it. When the request is received, the user agent name and its IP address are checked. If a browser is making the request (with Mozilla in the name, for example), the program returns the page that human visitors should see. If it is a search engine's spider, however, the program sends back a page full of keywords designed to attain a high search ranking. Later in this chapter, we discuss situations where you can legitimately use IP delivery techniques, but using this technique to fool a search engine about what visitors see on the page is clearly spam and the search engines will deal with it harshly. Unless you know that what you are doing is acceptable, cloaking is a very dangerous game that can get your site banned. Cloaking can bring quick rankings, but when competitors see what you are doing, they will complain to the search engines and get you shut down.
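If you want to see the effect described above on your own pages, one rough check is to request the same URL under a browser user agent name and under a spider user agent name, and then compare what comes back. The sketch below is illustrative only: `fetch_as`, `looks_cloaked`, and the 0.9 threshold are hypothetical names and values chosen for this example, not part of any search engine's tooling. Changing the user agent name also cannot reveal differences that are keyed to the requester's IP address.

```python
from difflib import SequenceMatcher
from urllib.request import Request, urlopen


def fetch_as(url, user_agent):
    """Fetch a URL while presenting the given user agent name (hypothetical helper)."""
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def looks_cloaked(browser_html, spider_html, threshold=0.9):
    """Return True when the two responses are less similar than the assumed threshold."""
    matcher = SequenceMatcher(None, browser_html, spider_html, autojunk=False)
    return matcher.ratio() < threshold


# Offline demonstration with made-up page bodies (no network access needed).
visitor_page = "<html><body><h1>AAA Plumbing</h1><p>Repairs in Syracuse.</p></body></html>"
keyword_page = "plumbing plumber plumbers syracuse plumbing cheap plumbing best plumbing"

print(looks_cloaked(visitor_page, visitor_page))  # same page for both requesters
print(looks_cloaked(visitor_page, keyword_page))  # keyword-stuffed page for the spider
```

Against a live site, you would pass the two results of `fetch_as` to `looks_cloaked`. Pages with rotating content can differ between requests for innocent reasons, so a low similarity is a prompt to look closer, not proof of cloaking.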
Each search engine spider has its own frequency for returning to your Web site. Spiders return to a typical Web site at least once a month, but popular corporate Web sites might be revisited weekly, or even daily. By analyzing how frequently spiders crawl your pages, and which pages they check the most, you will know how quickly content changes on your site will be reflected in search indexes.
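A minimal sketch of that kind of log analysis follows. It assumes your server writes the widely used "combined" access log format, which records the user agent at the end of each line; your server's format may differ. The marker list, the function name `spider_visits`, and the sample log lines are all invented for illustration, and a real list of spider names would need to be maintained by your Webmaster.

```python
import re
from collections import Counter

# Assumed substrings that identify spiders in the user agent field.
SPIDER_MARKERS = ("Googlebot", "Slurp", "msnbot")

# Assumed "combined" format:
# host ident user [time] "method path protocol" status bytes "referer" "user agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def spider_visits(lines):
    """Count spider requests per (spider, path) and per (spider, day)."""
    by_page = Counter()
    by_day = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        agent = match.group("agent").lower()
        for marker in SPIDER_MARKERS:
            if marker.lower() in agent:
                day = match.group("time").split(":")[0]  # e.g. 10/Oct/2005
                by_page[(marker, match.group("path"))] += 1
                by_day[(marker, day)] += 1
                break
    return by_page, by_day


# Made-up log lines; the IP addresses come from a range reserved for documentation.
sample_log = [
    '192.0.2.10 - - [10/Oct/2005:13:55:36 -0700] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.10 - - [10/Oct/2005:13:55:41 -0700] "GET /products.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.10 - - [17/Oct/2005:09:12:03 -0700] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.20 - - [12/Oct/2005:02:30:00 -0700] "GET /index.html HTTP/1.0" 200 2326 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"',
    '192.0.2.30 - - [12/Oct/2005:08:00:00 -0700] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.12) Gecko/20050919 Firefox/1.0.7"',
]

visits_by_page, visits_by_day = spider_visits(sample_log)
for (spider, path), count in sorted(visits_by_page.items()):
    print(spider, path, count)
```

The per-day counts in `visits_by_day` give a rough picture of how often each spider returns, and the per-page counts show which pages it checks the most. Because a user agent name can be faked, this kind of tally is a first look and not proof that a given request came from a search engine.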
Although it is rare to find a site that the spiders do not visit at all, it can happen for a few reasons:
Your site is not linked. If you have a brand new site that is not linked to by any other site in the spider's path, you will not get any spider visits. If the search engine spider does not know your site exists, it obviously cannot visit.
Links to your site are not effective. Some links cannot be followed by spiders, for many reasons that we get into later in this chapter. Or the links to your site are from sites that themselves are not crawled by spiders, perhaps because they are also new, or possibly because they are banned for using unethical techniques. You can use the Google toolbar to check the PageRank of sites that link to yours; that shows you how valuable a link you are getting. If your linker has a zero PageRank, it is not doing you any good in Google because that site is not indexed.
The spider has given up. Perhaps the spider was visiting your site at one time, but your site was blocked so that the spider could not add any of your pages to the search index. After a few months of fruitless visits, spiders sometimes permanently stop visiting.
If your site truly is not being visited by spiders, the remedy depends on which of the above reasons is the cause. If your site is not linked or links to your site are not effective, the best way to get the spider to visit is to make sure other well-respected sites link to yours (explained in more detail later). If the spider has given up, first remove the spider trap (also explained later), and then manually submit your site to the search engines.
Search engines vastly prefer to find new sites by following links because analyzing link patterns is one of the ways engines judge relevance; if your site is linked and spiders are not visiting, however, you should manually submit your home page's URL. (If you create a new site but are too impatient to wait until someone links to you, you can also submit, but waiting to be found from a link will help your pages rank higher.) [...] Chapter 12.
Before you get excited about submitting all your Web pages to the search engines, a word of caution is in order. Many "experts" will advise you to submit your site early and often, and to submit many pages from your site. Don't. It is more complicated than that.
As discussed in Chapter 3, "How Search Marketing Works," paid inclusion not only guarantees to keep your pages in the index, it also promises that they will be revisited by the spider regularly. Only Yahoo! (of the major worldwide engines) is offering paid inclusion, but several search engines offered it a few years ago, so the trend might change yet again. Fees are typically charged for each page included and for every time a searcher clicks your page. Remember that paid inclusion does not guarantee your page will be shown by the search engine, only that it is in the index to be found. Later in this chapter, we look at paid inclusion in detail.
As we have emphasized, the best way to get indexed is through a link from another site (one that is already indexed itself). If you have a well-established site, you have probably already attracted many links, but a new site obviously does not have any. [...] campaign to attract links. Chapter 13 is devoted to attracting more links to your site, an important subject whether it is new or well known.