Doorblocks


Many pages on e-commerce and other functional sites are generated dynamically and have "?" or "&" signs in their URLs. These signs separate the CGI variables. While Google will crawl these pages, many other engines will not. One inconvenient solution is to develop static equivalents of the dynamic pages and host them on your site. Another way to avoid such dynamic URLs is to rewrite them using a syntax that the crawler accepts and that the application server also understands as equivalent to the dynamic URL. The Amazon site displays its dynamic URLs in such a syntax. If you are using the Apache web server, you can use Apache rewrite rules to perform this conversion.
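For illustration, a minimal Apache mod_rewrite sketch (the product.php script and URL pattern here are hypothetical placeholders, not taken from any particular site) might look like this:

# In .htaccess, with mod_rewrite enabled
RewriteEngine On
# Serve /product/123.html from /product.php?id=123, so that
# crawlers only ever see the static-looking URL
RewriteRule ^product/([0-9]+)\.html$ /product.php?id=$1 [L]

The application server still receives the familiar dynamic URL; only the address exposed to crawlers changes.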

Submit a “site map” page or a page with links to all inner pages


One good tip is to prepare a crawler page (or pages) and submit it to the search engines. This page should have no text or content except links to all the important pages you wish to be crawled. When the spider reaches this page, it will follow all the links and pull the desired pages into its index. You can also break the main crawler page into several smaller pages if it grows too large. The crawler will not reject smaller pages, whereas larger pages may get bypassed if the crawler finds them too slow to spider.
You need not worry that this "site-map" page will show up in search results and disappoint the visitor. This will not happen: since the site map has no searchable content, it will not be included in the results, while all the other pages will. We found that wired.com had published hierarchical sets of crawler pages. The first crawler page lists all the category headlines; these links lead to sets of story headlines, which in turn lead to the news stories.
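A minimal sketch of such a crawler page (the file names here are placeholders) could be as simple as:

<html>
<head><title>Site Map</title></head>
<body>
<!-- Nothing but plain links for the spider to follow -->
<a href="products.html">Products</a><br>
<a href="services.html">Services</a><br>
<a href="contact.html">Contact Us</a>
</body>
</html>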

Size of submitted page


We noted above that spiders may bypass long and "difficult" pages. They have their own time-out settings and other controls that help them come unstuck from such pages, so you do not want such a page to become your "gateway" page. One tip is to keep the page size below 100 KB.

Submit only key pages


You do not have to submit all the pages of your site. As stated earlier, many engines restrict the number of pages you can submit. A key page, one that links to many inner pages, is ideal, but you should submit some inner pages too. This ensures that even if the first page is missed, the crawler still reaches the other important pages through them. Submit at least your key three or four pages. Choose the ones with the most relevant content and keywords for your target search string, and verify that they link properly to the other pages.

Incoming mails bouncing


Hi,

I am not able to understand why my incoming mails are bouncing.

Regards,

I think you have turned on the Discard All option on your mailbox or another mailbox.
Turn Discard All Incoming Mail ON only if you are absolutely sure you do not need ALL your incoming mail. You can also turn it on when you are going on vacation. Senders won't receive "undelivered mail" notices.

Working with Frames


Many websites make use of frames on their web pages; in some cases, more than two frames appear on a single page. The reason most websites use frames is that each frame's content comes from a different source. A master page known as the "frameset" controls the combining of content from the different sources into a single web page, which makes it easy for webmasters to club multiple sources together. This, however, has a huge disadvantage when it comes to search engines.
Some of the older search engines cannot read content from frames. They crawl only the frameset instead of all the web pages, so pages inside frames are ignored by the spider. There are tags known as NOFRAMES (whose content is ignored by frames-capable browsers) that can be inserted into the HTML of the frameset page, and spiders are able to read the information within the NOFRAMES block. Without such a block, search engines see only the frameset; and if the NOFRAMES block contains no links to other web pages, the engines will not crawl past the frameset, thus ignoring all the content-rich pages it controls.
Hence, it is always advisable to build web pages without frames, as frames can easily make your website invisible to search engines.
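For example, a frameset page with a NOFRAMES block might look like this sketch (the file names are placeholders):

<html>
<head><title>Example Frameset</title></head>
<frameset cols="25%,75%">
  <frame src="menu.html">
  <frame src="content.html">
  <noframes>
    <body>
      <!-- Read by spiders and by browsers that cannot render frames -->
      <p>Browse our pages:
         <a href="menu.html">Menu</a> |
         <a href="content.html">Content</a></p>
    </body>
  </noframes>
</frameset>
</html>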

Activate Catch All on Mailbox


Hi,

I want to activate catch all option on my email mailboxes.

Hi,

Log into your control panel and, from the Quick Access menu, click the e-mail icon. A new form will appear; click the MAIL BOX icon next to the mailbox where you want to enable the Catch All option. On the right-hand side of the form you will see the mailbox properties. Click the round OFF button; it is a toggle, so clicking it switches it ON.

Hi,

How can I delete catch all mailbox?

To delete a Catch All mailbox, first switch Catch All OFF.

But what is this Catch All mailbox option?

Catch All: if it’s on, any email messages sent to a nonexistent account on your domain will go to this address.
Example: your mailbox webmaster@1800ssl.com is marked as Catch All. If someone sends an email to support@1800ssl.com, which does not exist, that message will arrive at webmaster@1800ssl.com. If no account were marked as Catch All, the message would bounce back to the sender with an error notification.

Making frames visible to Search Engines


Many amateur web designers do not understand the drastic effect frames can have on search engine visibility. This ignorance is compounded by the fact that some search engines, such as AltaVista, actually are frames-capable: AltaVista's spiders can crawl through frames and index all the web pages of a site. However, this is true for only a few search engines.
The best solution, as stated above, is to avoid frames altogether. If you still decide to use frames, another remedy is JavaScript. JavaScript can be added anywhere on a page and can give spiders and visitors a path to your other web pages even when frames are not recognized.
With a little trial and error, you can make your framed site accessible to both types of search engines.
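One widely used pattern for framed sites, sketched below with a placeholder frameset name, puts a small script on each content page so that a visitor who arrives directly from a search result is taken back to the full frameset:

<script type="text/javascript">
// If this content page was loaded outside its frameset
// (for example, straight from a search listing),
// reload the frameset page instead.
// "index.html" is a placeholder for your frameset file.
if (window == window.top) {
    window.location.href = "index.html";
}
</script>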

Using Robots.txt to your advantage


Sometimes we rank well on one engine for a particular keyphrase and assume that all search engines will like our pages, and hence that we will rank well for that keyphrase across a number of engines. Unfortunately, this is rarely the case. All the major search engines differ somewhat, so what gets you ranked high on one engine may actually lower your ranking on another.
It is for this reason that some people like to optimize pages for each particular search engine. Usually these pages are only slightly different, but that slight difference can make all the difference when it comes to ranking high.
However, because a search engine spider crawls through a site indexing every page it can find, it might come across your engine-specific optimized pages. Because those pages are very similar, the spider may think you are spamming it and will do one of two things: ban your site altogether or severely punish you in the form of lower rankings.
The solution in this case is to stop specific search engine spiders from indexing some of your web pages. This is done using a robots.txt file, which resides on your webspace.

A robots.txt file is a vital part of any webmaster's battle against getting banned or penalized by the search engines when he or she designs different pages for different search engines.
The robots.txt file is just a simple text file, as the file extension suggests. It is created with a plain text editor such as Notepad or WordPad; complicated word processors such as Microsoft Word will only corrupt the file.
You can insert simple directives in this text file to make it work. This is how it is done:
User-Agent: (Spider Name)
Disallow: (File Name)
The User-Agent is the name of the search engine's spider, and Disallow is the name of the file that you do not want that spider to index.
You have to start a new batch of directives for each engine, but if you want to list multiple disallowed files, you can place them one under another. For example:
# Slurp is Inktomi's spider
User-Agent: Slurp
Disallow: /xyz-gg.html
Disallow: /xyz-al.html
Disallow: /xxyyzz-gg.html
Disallow: /xxyyzz-al.html
The above code disallows Inktomi from spidering two pages optimized for Google (gg) and two pages optimized for AltaVista (al). If Inktomi were allowed to spider those pages as well as the pages made specifically for it, you would run the risk of being banned or penalized. Hence, it is always a good idea to use a robots.txt file.
The robots.txt file resides on your webspace, but where on your webspace? The root directory! If you upload the file to a sub-directory, it will not work. If you want to disallow all engines from indexing a file, simply use the * character where the engine's name would usually go. Beware, however, that the * character will not work on the Disallow line.
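For example, this sketch (private.html is a placeholder file name) blocks a single file for every engine:

User-Agent: *
Disallow: /private.html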
Here are the spider names of a few of the big engines:
Excite – ArchitextSpider
AltaVista – Scooter
Lycos – Lycos_Spider_(T-Rex)
Google – Googlebot
AllTheWeb – FAST-WebCrawler
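Putting it together, a robots.txt with one batch per spider might look like this sketch (the file names are placeholders):

# One record per spider, separated by a blank line
User-Agent: Googlebot
Disallow: /page-for-altavista.html

User-Agent: Scooter
Disallow: /page-for-google.html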

Be sure to check the file over before uploading it; a simple mistake could mean your pages get indexed by engines you did not want indexing them or, even worse, that none of your pages gets indexed at all.
Another advantage of the robots.txt file is that by checking your server logs for requests to it, you can see which spiders or agents have accessed your web pages. This gives you a list of the host names and agent names of the spiders; even visits from very small search engines get recorded. Thus, you know which search engines are likely to list your website.


Full Body Text


Most search engines scan and index all of the text on a web page. However, some engines ignore certain words known as stop words, which are explained below. Apart from this, almost all search engines ignore spam.