What are Sitemaps and How to Create an XML Sitemap File


Web definition of sitemaps

A sitemap is a list of pages that webmasters use to inform search engines about the URLs on a website that are available for crawling. A sitemap file is generally written in XML, and its contents must be UTF-8 encoded. It lets webmasters include additional information about each URL, such as when it was last updated and how often it changes. This extra information can be used to fine-tune crawling behavior.

Important Information about sitemaps

  • Last modified time - It includes the date and time when the file was last modified.
  • Frequency of modification - This defines how often the contents of a URL are modified.
  • Priority level of a page - The priority level allows indicating if a page is more or less important than other pages in the sitemap.

How To Build a Sitemap

The Sitemap Protocol format consists of XML tags that are quite easy to build. The minimum required field is the URL and all additional information is optional.

  • The Head Part - This is the starting section of the XML document. It begins with the following declaration, which identifies the file as XML and states its UTF-8 encoding.
    <?xml version="1.0" encoding="UTF-8"?>
  • The Container Element - The container element in a sitemap is the <urlset> element and is used as below.
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    </urlset>
  • The URL - It holds all the information required for a single URL and should be placed inside the <urlset> container element.
    <url>
    </url>
  • The location - This is the only required element inside the <url> element. It contains the full URL of the page, including the protocol (e.g. http, https) and a trailing slash, if required by the site's hosting server. This value must be less than 2,048 characters.
    <loc>http://www.yoursite.com/folder/file.php</loc>
  • The Modified Date - It holds the date when the file was last modified, in ISO 8601 format. This can be the full date and time or, if desired, simply the date in the format YYYY-MM-DD.
    <lastmod>2009-03-18</lastmod>
  • Modification Frequency - This indicates how frequently the page is likely to change. The following are the valid values for this tag.
    • always
    • hourly
    • daily
    • weekly
    • monthly
    • yearly
    • never
    This is used only as a guide for crawlers, and is not used to determine how frequently pages are indexed.
    <changefreq>daily</changefreq>
  • The page priority - The priority of that URL relative to other URLs on the site. This allows webmasters to suggest to crawlers which pages are considered more important. The valid range is from 0.0 to 1.0, with 1.0 being the most important. The default value is 0.5.
    Rating all pages on a site with a high priority does not affect search listings, as the value only suggests to crawlers how important the pages on the site are relative to one another.
    <priority>0.8</priority>
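
The rules in the list above can be checked before a sitemap is written. The following is a minimal Python sketch, not part of the Sitemap Protocol itself. The function name check_url_entry and its messages are hypothetical. It uses datetime.fromisoformat, which accepts only a subset of ISO 8601 (a trailing "Z" is accepted from Python 3.11 onward), so treat the date check as an approximation.

```python
from datetime import datetime

# Values the article lists as valid for <changefreq>.
VALID_CHANGEFREQ = {"always", "hourly", "daily", "weekly", "monthly", "yearly", "never"}


def check_url_entry(loc, lastmod=None, changefreq=None, priority=None):
    """Return a list of problems found in one <url> entry (empty if none)."""
    problems = []
    if not loc.startswith(("http://", "https://")):
        problems.append("loc must include the protocol")
    if len(loc) >= 2048:
        problems.append("loc must be shorter than 2,048 characters")
    if lastmod is not None:
        try:
            # Accepts YYYY-MM-DD as well as a full date and time.
            datetime.fromisoformat(lastmod)
        except ValueError:
            problems.append("lastmod is not an ISO 8601 date")
    if changefreq is not None and changefreq not in VALID_CHANGEFREQ:
        problems.append("changefreq is not one of the valid values")
    if priority is not None and not 0.0 <= priority <= 1.0:
        problems.append("priority must be between 0.0 and 1.0")
    return problems
```

Calling check_url_entry("http://www.yoursite.com/folder/file.php", "2009-03-18", "daily", 0.8) returns an empty list, meaning no problems were found.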

A Sample Sitemap

A simple sitemap may look like this:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.yoursite.com/folder/file.php</loc>
  </url>
</urlset>

A more complete sitemap, using all the optional fields, may look like this:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.yoursite.com/folder/file.php</loc>
    <lastmod>2009-03-18</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
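
Writing these files by hand becomes tedious for larger sites, so a sitemap is often generated by a script. Below is a rough Python sketch using the standard library's xml.etree.ElementTree. The function name build_sitemap and the dictionary layout of its input are assumptions made for illustration, and the xml_declaration argument of ET.tostring requires Python 3.8 or newer.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML bytes from a list of dicts; only 'loc' is required."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        # Emit child elements in the same order as the samples above.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    # xml_declaration=True writes the <?xml ...?> head part.
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


sitemap_xml = build_sitemap([
    {
        "loc": "http://www.yoursite.com/folder/file.php",
        "lastmod": "2009-03-18",
        "changefreq": "daily",
        "priority": 0.8,
    },
])
```

The result is a bytes object that can be written straight to a file such as sitemap.xml. ElementTree also escapes special characters like & in URLs, which is easy to forget when building the XML with string concatenation.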

Limitations of Sitemap

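Each sitemap file is limited to 50,000 URLs. The current sitemaps.org protocol also limits each file to 50MB (52,428,800 bytes) uncompressed; older versions of the protocol set this limit at 10MB. To reduce bandwidth consumption, you can compress sitemap files with gzip. To handle a very large number of URLs, multiple sitemap files are supported, with a sitemap index file serving as the entry point. Under the current protocol a sitemap index may list up to 50,000 sitemaps (older versions allowed 1,000).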
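
As an illustration of working within these limits, here is a small Python sketch that splits a long URL list into groups of at most 50,000, builds a sitemap index, and gzip-compresses the result. The <sitemapindex>, <sitemap> and <loc> elements come from the sitemaps.org protocol. The function names and the file name sitemap1.xml.gz are hypothetical.

```python
import gzip
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into groups that each fit in one sitemap file."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Build a sitemap index that points at each individual sitemap file."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = sitemap_url
    return ET.tostring(index, encoding="UTF-8", xml_declaration=True)


def compress(xml_bytes):
    """Gzip-compress sitemap bytes, e.g. before saving as sitemap1.xml.gz."""
    return gzip.compress(xml_bytes)
```

Each group returned by chunk_urls would be written to its own sitemap file, and the index then lists the public URL of every one of those files. Note that the size limit applies to the uncompressed file, so gzip saves bandwidth but does not raise the limit.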

Submission to Search Engines

The following are the sitemap submission URLs for several major search engines. Append the full URL of your sitemap to the end of the relevant one.

Google: http://www.google.com/webmasters/tools/ping?sitemap=
Yahoo: http://search.yahooapis.com/SiteExplorerService/V1/ping?sitemap=
Bing: http://www.bing.com/webmaster/ping.aspx?siteMap=
Live: http://webmaster.live.com/ping.aspx?siteMap=
Ask: http://submissions.ask.com/ping?sitemap=
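
Because the sitemap address is passed as a query parameter, it should be percent-encoded before being appended. The Python sketch below only builds the final URL and does not send any request. The function name build_ping_url and the sitemap address are hypothetical, and the endpoints listed above may no longer be active, so check each search engine's current documentation before relying on them.

```python
from urllib.parse import quote

# Google's ping endpoint as listed above; it may no longer be active.
GOOGLE_PING = "http://www.google.com/webmasters/tools/ping?sitemap="


def build_ping_url(ping_endpoint, sitemap_url):
    """Append the percent-encoded sitemap URL to a ping endpoint."""
    # safe="" makes quote() encode ':' and '/' as well.
    return ping_endpoint + quote(sitemap_url, safe="")


ping_url = build_ping_url(GOOGLE_PING, "http://www.yoursite.com/sitemap.xml")
```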

Mike Smith writes for WebToolHub.com. He loves to golf, cook and explore music in his free time.
