
How to create and add sitemaps to Google and Bing

Sitemaps are lists of the URLs on your website. Google will eventually discover every page on your site that other pages link to, but a sitemap lets you suggest to Google and Bing pages that they might not find by crawling links alone. Sitemaps also speed up crawling, even when a site has a very large number of pages. I had a site with over a million URLs, and Google crawled all of them within a month of my submitting sitemaps.

A sitemap can be a text file that contains URLs from your site or it could be an XML file that contains URLs in some structured markup. I will talk about both formats in this tutorial.

The text sitemap is just a plain list of URLs, one per line. A sitemap can contain a maximum of 50,000 URLs, so if your site has more than 50,000 pages, split them across more than one sitemap.
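Splitting a long URL list into text sitemaps can be sketched as follows. This is a minimal illustration, not a definitive implementation: the function names `chunk_urls` and `render_text_sitemaps` are made up for this example, and only the 50,000 limit comes from the text above.

```python
from typing import Iterable, Iterator, List

MAX_URLS_PER_SITEMAP = 50_000  # per-sitemap limit described in the text


def chunk_urls(urls: Iterable[str], size: int = MAX_URLS_PER_SITEMAP) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` URLs."""
    batch: List[str] = []
    for url in urls:
        batch.append(url)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # leftover URLs that did not fill a whole sitemap
        yield batch


def render_text_sitemaps(urls: Iterable[str], size: int = MAX_URLS_PER_SITEMAP) -> List[str]:
    """Return one text-sitemap body per chunk, with one URL per line."""
    return ["\n".join(batch) + "\n" for batch in chunk_urls(urls, size)]
```

Each returned string could then be written out as sitemap1.txt, sitemap2.txt and so on, saved with UTF-8 encoding.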

There is no practical limit to the number of sitemaps that you can submit. However, if your site has several million URLs, it is advisable to list only the most important ones, which increases the chance that every URL in your sitemaps gets crawled. Google and Bing do not guarantee that they will crawl every page in a sitemap.

The XML sitemap looks like the following:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" 
  xmlns:image="http://www.google.com/schemas/sitemap-image/1.1" 
  xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
  <url> 
    <loc>http://www.example.com/foo.html</loc> 
    <image:image>
       <image:loc>...</image:loc> 
    </image:image>
    <video:video>     
      <video:content_loc>...</video:content_loc>
    </video:video>
  </url>
</urlset>

The most important tags in the sitemap are <url> and <loc>. The <image:image> and <video:video> tags are optional: once Google or Bing finds a page, it is very likely to crawl the images and videos on that page anyway.
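If you would rather generate the XML than write it by hand, here is a minimal sketch using Python's standard xml.etree.ElementTree module. It emits only the <url> and <loc> tags and leaves out the optional image and video extensions. The function name `build_xml_sitemap` is an assumption made for this example.

```python
import xml.etree.ElementTree as ET
from typing import Iterable

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_xml_sitemap(urls: Iterable[str]) -> bytes:
    """Serialize URLs as a <urlset> document with one <url>/<loc> per page."""
    # Register the sitemap namespace as the default so tags carry no prefix.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url in urls:
        url_el = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        loc = ET.SubElement(url_el, f"{{{SITEMAP_NS}}}loc")
        loc.text = url  # ElementTree escapes &, < and > when serializing
    # Passing an encoding makes tostring return bytes with an XML declaration.
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)
```

Building the tree with a library rather than string concatenation means characters such as & in a URL are escaped for you.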

Here is an article by Google that explains in detail how to create sitemaps.

Again, for both the text and XML formats, you can list a maximum of 50,000 pages per sitemap, and a sitemap cannot be larger than 50MB uncompressed. Larger sites should use multiple sitemaps.

A sitemap should not contain session IDs, and it should refer to pages consistently. If your site's URLs begin with www, make sure that every URL in the sitemap begins with www. You do not need to list both the www and non-www versions of a page. Just go to Google Webmaster Tools and ask Google to prefer whichever version you want.

A sitemap should be UTF-8 encoded. In most cases, you don't have to worry about the encoding.
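These rules are easy to check before you upload anything. The following is a rough sketch under a few assumptions: the function name `check_sitemap_urls` is invented for this example, the session parameter names are guesses that you should replace with whatever your site actually uses, and the size check measures the URLs as a plain text list, so an XML sitemap with the same URLs would be somewhat larger.

```python
from typing import List
from urllib.parse import parse_qs, urlsplit

MAX_URLS = 50_000
MAX_BYTES = 50 * 1024 * 1024  # 50MB limit described in the text
# Hypothetical query parameter names; adjust to your site's session handling.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid", "jsessionid"}


def check_sitemap_urls(urls: List[str], expected_host: str) -> List[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if len(urls) > MAX_URLS:
        problems.append(f"too many URLs: {len(urls)} > {MAX_URLS}")
    size = len("\n".join(urls).encode("utf-8"))  # size as a text sitemap
    if size > MAX_BYTES:
        problems.append(f"sitemap too large: {size} bytes")
    for url in urls:
        parts = urlsplit(url)
        # Flags, for example, a non-www URL on a site that uses www.
        if parts.hostname != expected_host:
            problems.append(f"inconsistent host: {url}")
        param_names = {name.lower() for name in parse_qs(parts.query)}
        if param_names & SESSION_PARAMS:
            problems.append(f"session ID in URL: {url}")
    return problems
```

Pass the host in lowercase, for example check_sitemap_urls(urls, "www.example.com"), because urlsplit lowercases the hostname it returns.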

If you have multiple sitemaps, you do not have to submit each one to Google separately, though that is an option if you have fewer than five. For more than that, you should create a sitemap index file. This is an XML file with the following format:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <sitemap>
      <loc>http://www.yourdomain.com/sitemap1.xml</loc>
      <lastmod>2013-04-15</lastmod>
   </sitemap>
   <sitemap>
      <loc>http://www.yourdomain.com/sitemap2.txt</loc>
      <lastmod>2013-04-15</lastmod>
   </sitemap>
</sitemapindex>

This file lists all of your sitemaps, and it is the only file you need to submit to Google. The top-level element is <sitemapindex>. Each sitemap gets its own <sitemap> tag, whose <loc> contains the location of the sitemap. The <lastmod> date tells Google when the sitemap last changed, so that it can recrawl the sitemap if that date is later than its previous crawl.
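An index file like this can also be generated. The sketch below uses Python's standard xml.etree.ElementTree module and takes a list of (location, last-modified date) pairs. The function name `build_sitemap_index` is an assumption made for this example.

```python
import xml.etree.ElementTree as ET
from datetime import date
from typing import Iterable, Tuple

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(entries: Iterable[Tuple[str, date]]) -> bytes:
    """Serialize (sitemap URL, last-modified date) pairs as a <sitemapindex>."""
    # Register the sitemap namespace as the default so tags carry no prefix.
    ET.register_namespace("", SITEMAP_NS)
    index = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for loc, modified in entries:
        sitemap = ET.SubElement(index, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(sitemap, f"{{{SITEMAP_NS}}}loc").text = loc
        # isoformat() gives YYYY-MM-DD, the date format used in the example above.
        ET.SubElement(sitemap, f"{{{SITEMAP_NS}}}lastmod").text = modified.isoformat()
    return ET.tostring(index, encoding="UTF-8", xml_declaration=True)
```

For instance, build_sitemap_index([("http://www.yourdomain.com/sitemap1.xml", date(2013, 4, 15))]) produces an index with a single <sitemap> entry.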

To add these sitemaps to Google, just visit Google Webmaster Tools. For Bing, there is Bing Webmaster Tools.

A sitemap is a very quick way to get your site noticed by Google or Bing. Both search engines crawl pages significantly faster when the pages are submitted through sitemaps. As I said earlier, because of the sitemaps I submitted, Google crawled my entire site (one million pages) in a month!

Rahul
