A Site Map, or Sitemap, is a file that lists the pages of a website in order to inform search engines about how the site's content is organized. It can also carry additional information about each page, such as how often its content changes, when it was last updated, and how important it is relative to the other pages of the site.
A Site Map is a list of pages of a web site accessible to crawlers or users. It can be either a document in any form used as a planning tool for Web design, or a Web page that lists the pages on a website, typically organized in hierarchical fashion.
In web design, the positioning of sites in search engines has gained enormous importance. One measure of this is the amount of time and effort we devote first to optimizing sites and later to obtaining good inbound links.
A basic requirement for all that effort to be fruitful is that all pages of the site are properly indexed by search engines. This will happen as long as the crawlers of the major search engines visit and index those pages frequently enough and without omitting any.
Even without any extra procedure, spiders discover web pages simply by following links (unless a link is marked with the rel="nofollow" attribute), but more can be done. A good strategy is to provide search engines with a "list" of the pages we want indexed, along with some additional information that makes each visit more effective. This is achieved through the use of Sitemaps.
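To illustrate why link-following alone can miss pages, here is a minimal sketch of the link discovery step of a crawler, written with Python's standard html.parser module. The class name FollowableLinkParser and the sample page are hypothetical, and real search engine crawlers are far more elaborate. The sketch only shows that a link marked rel="nofollow" is skipped.

```python
from html.parser import HTMLParser


class FollowableLinkParser(HTMLParser):
    """Collects href values of <a> tags that are not marked rel="nofollow"."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        # rel may hold several space-separated tokens, or be missing entirely
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if href and "nofollow" not in rel_tokens:
            self.links.append(href)


# Hypothetical page fragment with one followable and one nofollow link
page = (
    '<a href="/about.htm">About</a> '
    '<a rel="nofollow" href="/login.htm">Login</a>'
)

parser = FollowableLinkParser()
parser.feed(page)
print(parser.links)  # ['/about.htm']
```

A page that can only be reached through the skipped link would never be discovered this way, which is the gap a Sitemap is meant to close.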
Sitemaps express the relationships between pages and other content components. They show the shape of the information space at a glance, and they can demonstrate a site's organization, navigation, and labeling system.
Types of Site Maps
There are two popular versions of a site map.
XML Sitemap: a structured format that users do not need to see, but that tells the search engine about the pages in a site, their relative importance to each other, and how often they are updated.
HTML Sitemap: designed to help users find content on the site; it does not need to include each and every subpage. It helps both visitors and search engine bots find pages on the site. HTML sitemaps are not supported in Google Webmaster Tools.
While some developers argue that "site index" is the more appropriate term to convey the page's function, web visitors are used to seeing both terms and generally treat them as one and the same. However, a site index is often used to mean an A–Z index that provides alphabetically organized access to particular content, while a site map provides a general top-down view of the overall site contents, organized with a classification system.
Site maps also act as a navigation aid by providing an overview of a site's content at a single glance.
XML Site Maps
Google introduced the Sitemaps protocol so that web developers can publish lists of links from across their sites. The basic premise is that some sites have a large number of dynamic pages that are only reachable through forms and user entries. Sitemap files contain the URLs of these pages so that web crawlers can find them. Bing, Google, Yahoo and Ask now jointly support the Sitemaps protocol.
Since Bing, Yahoo, Ask, and Google use the same protocol, a single Sitemap gives all four of these search engines up-to-date page information. Sitemaps do not guarantee that all links will be crawled, and being crawled does not guarantee indexing. However, a Sitemap is still the best insurance for getting a search engine to learn about your entire site. Google Webmaster Tools allows a website owner to upload a sitemap that Google will crawl, or the owner can accomplish the same thing by referencing the sitemap in the robots.txt file. On 1 December 2016, both Bing and Google confirmed that they support XML sitemaps of up to 50 MB uncompressed.
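The robots.txt route mentioned above relies on a "Sitemap:" line in that file. The following sketch, which assumes Python 3.8 or later and uses a hypothetical sitemap URL on www.mysite.com, reads such a line back with the standard urllib.robotparser module to confirm that the reference is recognizable.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the sitemap URL is an assumption
robots_txt = """\
User-agent: *
Disallow:

Sitemap: http://www.mysite.com/sitemap.xml
"""

robots_parser = RobotFileParser()
# parse() takes the file as a list of lines, so no network access is needed
robots_parser.parse(robots_txt.splitlines())

# site_maps() returns the listed sitemap URLs, or None if there are none
sitemap_urls = robots_parser.site_maps()
print(sitemap_urls)  # ['http://www.mysite.com/sitemap.xml']
```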
XML Sitemaps have replaced the older method of "submitting to search engines" by filling out a form on the search engine's submission page. Now web developers submit a Sitemap directly, or wait for search engines to find it. Regularly submitting an updated sitemap when new pages are published allows search engines to find and index those pages more quickly than they would by finding the pages on their own.
XML (Extensible Markup Language) is much stricter than HTML. Errors are not tolerated, so the syntax must be exact. It is advisable to use an XML syntax validator, such as the free one at http://validator.w3.org
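As a local complement to an online validator, a quick well-formedness check can be sketched with Python's standard xml.etree.ElementTree module. The helper name is_well_formed and the sample strings are hypothetical. The check only confirms that the XML parses; it does not verify that the file conforms to the Sitemap schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Every opening tag has its matching closing tag
good = "<urlset><url><loc>http://www.mysite.com/</loc></url></urlset>"
# The <loc> tag is never closed, which XML does not tolerate
bad = "<urlset><url><loc>http://www.mysite.com/</url></urlset>"

print(is_well_formed(good))  # True
print(is_well_formed(bad))   # False
```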
There are automated XML site map generators available (both as software and web applications) for more complex sites.
More information on the field definitions and other Sitemap options is available at http://www.sitemaps.org (Sitemaps.org: Google, Inc., Yahoo, Inc., and Microsoft Corporation).
Benefits of XML sitemaps to search-optimize Flash sites
Below is an example of a validated XML sitemap for a simple three-page website. Sitemaps are a useful tool for making sites built in Flash and other non-HTML technologies searchable. If a website's navigation is built with Flash, an automated search program would probably find only the initial homepage; subsequent pages are unlikely to be found without an XML sitemap.
The construction of the XML file must follow a series of guidelines specified in the sitemaps protocol, which we describe as follows:
Required Sitemap Contents
The Sitemap protocol is built on XML tags included in a file with UTF-8 encoding.
Data values (as opposed to tags themselves) should use escape codes for certain special characters, as is customary in HTML.
For example, double quotation marks (") must be replaced by &quot;, and the less-than (<) and greater-than (>) signs by &lt; and &gt; respectively.
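The escaping described above can be sketched with the escape function from Python's standard xml.sax.saxutils module. The helper name escape_value and the sample URL are hypothetical. By default escape handles the ampersand and the two angle brackets, so the quotation marks are passed in as extra entities.

```python
from xml.sax.saxutils import escape

# escape() covers &, < and > by default; quotes are added explicitly
EXTRA_ENTITIES = {'"': "&quot;", "'": "&apos;"}


def escape_value(value):
    """Escape a data value before placing it inside a Sitemap tag."""
    return escape(value, EXTRA_ENTITIES)


raw_url = 'http://www.mysite.com/search?q="maps"&page=2'
print(escape_value(raw_url))
# http://www.mysite.com/search?q=&quot;maps&quot;&amp;page=2
```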
The XML file must:
Start with an opening <urlset> tag and end with a closing </urlset> tag
Specify the standard protocol to which it responds within the <urlset> opening tag (see example)
Include a <url> entry for each URL (which will correspond to each page of the site) as the parent XML node.
Include a <loc> child XML node for each URL (that is, inside each parent <url> node).
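The requirements in this list can also be met programmatically. The following is a minimal sketch using Python's standard xml.etree.ElementTree module. The function name build_sitemap and the page URLs are hypothetical, and only the required tags are produced.

```python
import xml.etree.ElementTree as ET

# Namespace that identifies the Sitemap protocol version
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(page_urls):
    """Return Sitemap XML text with one <url>/<loc> pair per page URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page_url in page_urls:
        url = ET.SubElement(urlset, "url")   # parent node for each page
        loc = ET.SubElement(url, "loc")      # required child node
        loc.text = page_url                  # special characters are escaped on output
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


sitemap_xml = build_sitemap([
    "http://www.mysite.com/",
    "http://www.mysite.com/contact.htm",
])

# The declaration announces UTF-8, so the file is written with that encoding
with open("sitemap.xml", "w", encoding="utf-8") as sitemap_file:
    sitemap_file.write(sitemap_xml)
```

Building the tree with a library rather than by string concatenation means the tags are always balanced and the data values are escaped automatically.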
Summarizing these aspects in an example (site with two pages):
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>http://www.mysite.com/</loc>
</url>
<url>
<loc>http://www.mysite.com/contact.htm</loc>
</url>