Written by Tiago Silva. Updated on 7 November 2024.
A sitemap is a document with a list of relevant links from a website. This article will focus on XML sitemaps (eXtensible Markup Language), the most popular sitemap protocol used for search engine optimisation.
This article is part of our Google Search Console tutorials and training section. Make sure to check the others out.
An XML sitemap contains a list of all the URLs from a website you want a search engine to crawl regularly. It is a complementary tool that helps search engines find pages and crawl them more efficiently. Think of a sitemap as a roadmap with metadata (like the last updated date) pointing search engines to essential pages.
A single sitemap file must be UTF-8 encoded, contain no more than 50,000 URLs, and be no larger than 50MB uncompressed, whichever limit is reached first.
Sitemaps can be compressed using gzip. These limits prevent servers from becoming overwhelmed. If a sitemap hits either limit, you can split it into several sitemaps and list them in an XML sitemap index.
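To make these limits concrete, here is a minimal Python sketch that splits a URL list into chunks of at most 50,000 entries, checks the uncompressed size, and gzips the result. The function names and the example.com URLs are hypothetical and only there for illustration.

```python
import gzip

MAX_URLS_PER_SITEMAP = 50_000               # protocol limit on URLs per file
MAX_BYTES_UNCOMPRESSED = 50 * 1024 * 1024   # 50MB, measured before gzip


def split_into_sitemaps(urls, max_urls=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into chunks that each fit in one sitemap file."""
    return [urls[i:i + max_urls] for i in range(0, len(urls), max_urls)]


def within_size_limit(xml_text, max_bytes=MAX_BYTES_UNCOMPRESSED):
    """Check the UTF-8 encoded, uncompressed size against the byte limit."""
    return len(xml_text.encode("utf-8")) <= max_bytes


def compress_sitemap(xml_text):
    """Return the gzip-compressed bytes of a sitemap document."""
    return gzip.compress(xml_text.encode("utf-8"))


# Hypothetical example: 120,000 URLs need three sitemap files.
urls = [f"https://www.example.com/page-{n}" for n in range(120_000)]
chunks = split_into_sitemaps(urls)
print([len(chunk) for chunk in chunks])  # [50000, 50000, 20000]
```

Each chunk would then be written to its own file and listed in a sitemap index.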
XML sitemaps can be considered the unsung heroes of the SEO world. They are often overlooked, but they are incredibly vital for a well-optimised website. I have been working on optimising websites for nearly a decade now, and I cannot stress enough how important a comprehensive, up-to-date XML sitemap can be.
Why are XML sitemaps important? Think of them as a roadmap for search engine crawlers like Googlebot. They make it easier for these crawlers to understand the structure of your website. Without a sitemap, crawlers may miss out on some of your newer pages that are more difficult to find.
Another advantage of XML sitemaps is prioritising certain pages over others. For example, if you have an ecommerce website with a vast product catalogue… By carefully structuring the XML sitemap and prioritising high-margin products, you could influence how often those pages are crawled and updated in a search engine’s index.
Whilst it is possible to run a website without having an XML sitemap in place, I would strongly recommend against it. Through years of optimising hundreds of websites, I have found that an XML sitemap is, essentially, an insurance policy for ensuring that search engines fully understand your site’s structure. If you have a large, complex, or rapidly changing site, a sitemap is almost a necessity!
HTML sitemaps and XML sitemaps serve distinct functions and cater to different audiences. Although both are concerned with a website’s navigational structure, an HTML sitemap is a user-facing page that lists important website pages, functioning as a table of contents for human visitors. An XML sitemap is intended only for search engine crawlers, facilitating the efficient indexing of a website’s content.
Image Credit: Elegant Themes
Having both types of sitemaps offers a few advantages. An HTML sitemap enhances user experience as it aids site navigation, which potentially reduces bounce rates and may improve engagement metrics. Whilst these factors may not directly affect search engine rankings, they contribute to a website’s overall performance and health.
On the other hand, an XML sitemap plays a critical technical role by guiding search engine crawlers. Many content management systems automatically generate XML sitemaps, but manual optimisation is recommended if you have the capacity to do this, especially for larger or more complex websites.
An XML sitemap index is a file containing a list of multiple sitemaps. The limitations are the same: a sitemap index cannot list more than 50,000 sitemaps and cannot be larger than 50MB uncompressed.
It’s also possible to have multiple XML sitemap indexes and compress them using gzip format.
As I wrote above, sitemaps are important but not required for SEO.
Here is what we know about a sitemap’s impact on SEO:
So, while using XML sitemaps won’t guarantee indexing or good rankings, these are the benefits of using them:
You can also check the index coverage reports by sitemap. This allows you to identify crawling issues for specific parts of a site if your sitemaps are separated in such a way.
According to Google, these are the types of websites in more need of a sitemap:
Using the same Google help doc, these are the cases where a website might not need a sitemap:
Note: A sitemap isn’t a replacement for a good internal link structure.
Creating an XML sitemap file is now easier than ever, as most CMSs and website builders have dynamic sitemaps built-in.
Go to the WordPress plugin repository and look for a sitemap plugin. Most SEO plugins have this feature, so I’ll use Rank Math in this example.
Install the plugin, go to the dashboard, and activate the sitemap feature to create an XML sitemap using Rank Math.
You can end the process here or go to “sitemap settings” to customise it with the different content types on the site, like images.
It’s simple to make a sitemap in WordPress this way.
There are many sitemap generators, like Screaming Frog or SureOak, but I will show you how to build one using SEOTesting’s free XML Sitemap Generator:
Step 1: Head to the XML Sitemap Generator and then click on ‘Click to Generate’ as shown below:
Step 2: Enter your list of URLs and then click ‘Generate’ at the bottom right hand corner:
Step 3: Your sitemap will be created and downloaded automatically:
Tip: Use a sitemap validator after creating the XML file through a generator like in this example.
A valid XML sitemap must follow the protocol and use the correct schema. In addition, it must include the required tags and can include some optional ones.
Sitemaps are primarily made for robots and search engine crawlers, and here is what one looks like:
The hierarchy of required attributes for a sitemap is the following:
Let’s now explain what each of these attributes is.
The first line in a sitemap is the XML header.
<?xml version="1.0" encoding="utf-8"?>
This header declares the XML version (1.0 in this example) and the character encoding (UTF-8). XML sitemaps must be UTF-8 encoded.
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
The urlset tag references the sitemap protocol standard used for the document (0.9 in this example).
The urlset tag must be used as a pair: the opening tag comes right after the header, and the closing tag (</urlset>) comes at the end of the document, after all URLs and their optional tags.
The url tag wraps each URL you want crawlers to use. It’s recommended to list only the canonical versions of the links.
The url tag is required by the protocol and is the parent tag for every tag mentioned next in this list (loc, lastmod, changefreq, priority).
<url>
<loc>https://www.moneysavingheroes.co.uk/peacocks</loc>
<lastmod>2020-11-26</lastmod>
</url>
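If you need to produce this structure yourself, the following is a minimal sketch using Python’s standard xml.etree.ElementTree module. The build_sitemap helper and the example.com URLs are hypothetical; a CMS plugin or generator will normally do this for you.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build a sitemap string from (loc, lastmod) pairs; lastmod may be None."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")       # parent tag for each entry
        ET.SubElement(url, "loc").text = loc     # required
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod  # optional
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical entries for illustration.
sitemap_xml = build_sitemap([
    ("https://www.example.com/", "2024-11-07"),
    ("https://www.example.com/about", None),
])
print(sitemap_xml)
```

ElementTree escapes special characters such as & in URLs, which is easy to forget when building the file by hand.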
The loc (short for location) tag is the last of the three mandatory tags in a sitemap. It holds the URL of the page.
The URL in the loc tag must start with the protocol (i.e., HTTPS or HTTP), end with a trailing slash if your web server requires it, and be less than 2,048 characters long.
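As a rough illustration, the rules for the loc value can be checked with a few lines of Python. This is only a sketch: the loc_problems name is made up for this example, and it does not check the trailing slash because that depends on how your server is set up.

```python
from urllib.parse import urlsplit

MAX_LOC_LENGTH = 2048  # the value must be shorter than this


def loc_problems(loc):
    """Return a list of problems found in a <loc> value (empty if none)."""
    problems = []
    parts = urlsplit(loc)
    if parts.scheme not in ("http", "https"):
        problems.append("must start with http:// or https://")
    if not parts.netloc:
        problems.append("must include a host name")
    if len(loc) >= MAX_LOC_LENGTH:
        problems.append("must be less than 2,048 characters long")
    return problems


# Hypothetical values for illustration.
print(loc_problems("https://www.example.com/page"))
print(loc_problems("www.example.com/page"))
```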
Lastmod is the URL’s last modification date and must use W3C Datetime format (for example, 2020-11-26).
Changefreq is the expected frequency of changes likely to happen to a page. Accepted values are always, hourly, daily, weekly, monthly, yearly, and never.
Priority values can be between 0.0 and 1.0, showing search engines how important a page is for the website owner.
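The three optional tags can be sanity-checked in a similar way. The sketch below only checks the shape of the values, assuming they arrive as strings; it does not verify that a date really exists (a month of 13 would pass), and the helper names are hypothetical.

```python
import re

# W3C Datetime shapes: YYYY, YYYY-MM, YYYY-MM-DD, or a full timestamp
# with a time zone designator such as Z or +01:00.
W3C_DATETIME = re.compile(
    r"\d{4}"                           # YYYY
    r"(-\d{2}"                         # optional -MM
    r"(-\d{2}"                         # optional -DD
    r"(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?"  # optional Thh:mm, then :ss and fraction
    r"(Z|[+-]\d{2}:\d{2})"             # time zone, required once a time is given
    r")?)?)?"
)

CHANGEFREQ_VALUES = {
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
}


def is_valid_lastmod(value):
    """True if the string has one of the W3C Datetime shapes."""
    return W3C_DATETIME.fullmatch(value) is not None


def is_valid_changefreq(value):
    """True if the string is one of the accepted changefreq values."""
    return value in CHANGEFREQ_VALUES


def is_valid_priority(value):
    """True if the string parses as a number between 0.0 and 1.0."""
    try:
        number = float(value)
    except ValueError:
        return False
    return 0.0 <= number <= 1.0
```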
An XML sitemap index must have the following elements:
Lastmod is an optional element of the XML sitemap index.
Also, it is important to mention that this file can only mention sitemaps on the same site as the sitemap index, so it won’t be valid for subdomains.
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap>
<loc>https://www.domain.com/sitemap-pages.xml</loc>
<lastmod>2022-04-19T11:54:44.774Z</lastmod>
</sitemap>
<sitemap>
<loc>https://www.domain.com/sitemap-posts.xml</loc>
<lastmod>2022-04-25T18:42:55.769Z</lastmod>
</sitemap>
</sitemapindex>
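A sitemap index like the one above can be generated with a short script. The following is a minimal sketch, assuming a hypothetical build_sitemap_index helper and example.com URLs; it rejects sitemaps hosted on a different host than the index, in line with the same-site rule mentioned earlier.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(index_url, sitemaps):
    """Build a sitemap index from (loc, lastmod) pairs; lastmod may be None."""
    index_host = urlsplit(index_url).netloc
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc, lastmod in sitemaps:
        # Sitemaps on another host (including subdomains) are not allowed here.
        if urlsplit(loc).netloc != index_host:
            raise ValueError(f"{loc} is not on the same host as the index")
        sitemap = ET.SubElement(root, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
        if lastmod:
            ET.SubElement(sitemap, "lastmod").text = lastmod
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical file names for illustration.
index_xml = build_sitemap_index(
    "https://www.example.com/sitemap_index.xml",
    [
        ("https://www.example.com/sitemap-pages.xml", "2022-04-19T11:54:44.774Z"),
        ("https://www.example.com/sitemap-posts.xml", None),
    ],
)
print(index_xml)
```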
Now, let’s focus on the best practices for sitemaps:
The last recommendation is to submit the sitemaps in Google Search Console.
This process is straightforward:
Google Search Console accepts both sitemaps and sitemap Indexes.
Google supports the following sitemap formats:
Google pays attention primarily to the URLs and, in some situations, to lastmod. Their documentation explicitly says that Google doesn’t consider priority in a sitemap.
On the same documentation page, Google says that they will use the <lastmod> value “if it’s consistently and verifiably (for example, by comparing to the last modification of the page) accurate.”
Further, John Mueller wrote in 2017, “The URL + last modification date is what we care about for web search.” In 2015, John also said, “Priority and change frequency doesn’t really play that much of a role with sitemaps anymore”.
It’s important to mention that a URL’s position in the sitemap doesn’t matter because Google doesn’t crawl pages in order of appearance.
The most common sitemap location is in the root directory: Domain.com/sitemap.xml.
You can put a sitemap anywhere on the site, but it only affects the descendant directories. This is why putting sitemaps in the root directory is recommended, like in the example above.
Placing a sitemap on a sub-folder like Domain.com/blog/sitemap.xml will only affect URLs at the blog directory (Domain.com/blog/), leaving out all the URLs of the root directory like Domain.com/about or Domain.com/service-1.
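The directory rule described above can be expressed as a small check. This is a sketch of the rule as stated in this section, not of how any particular search engine implements it; the in_sitemap_scope name and the domain.com URLs are only illustrative.

```python
from urllib.parse import urlsplit


def in_sitemap_scope(sitemap_url, page_url):
    """True if page_url sits in the sitemap's directory or below it."""
    sitemap = urlsplit(sitemap_url)
    page = urlsplit(page_url)
    # The page must be on the same protocol and host as the sitemap.
    if (sitemap.scheme, sitemap.netloc) != (page.scheme, page.netloc):
        return False
    # Directory that holds the sitemap file, always ending in "/".
    base_dir = sitemap.path.rsplit("/", 1)[0] + "/"
    return (page.path or "/").startswith(base_dir)


blog_sitemap = "https://domain.com/blog/sitemap.xml"
print(in_sitemap_scope(blog_sitemap, "https://domain.com/blog/post-1"))  # True
print(in_sitemap_scope(blog_sitemap, "https://domain.com/about"))        # False
```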
Robots.txt files usually mention the sitemap location, helping crawlers find them.
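To see which sitemaps a robots.txt file advertises, you can read its Sitemap lines. Below is a minimal sketch that parses the file contents by hand; the function name and the sample robots.txt are hypothetical, and fetching the file from a live site is left out.

```python
def sitemap_urls_from_robots(robots_txt):
    """Return the URLs listed on Sitemap lines of a robots.txt file."""
    urls = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()      # drop comments
        field, sep, value = line.partition(":")   # split on the first colon
        if sep and field.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


# Hypothetical robots.txt content for illustration.
sample = """\
User-agent: *
Disallow: /admin/
Sitemap: https://www.example.com/sitemap.xml
sitemap: https://www.example.com/blog/sitemap.xml
"""
print(sitemap_urls_from_robots(sample))
```

The field name is matched case-insensitively, so both Sitemap and sitemap lines are picked up.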
No, you don’t need a sitemap to rank on Google, and they aren’t a ranking factor. But it’s better to use sitemaps because they don’t hurt.
In John Mueller’s opinion, sitemaps are “a minimal baseline for any serious website”.
Google says the following in their documentation: “If your site’s pages are properly linked, Google can usually discover most of your site. Proper linking means that all pages you deem important can be reached through navigation[…]Even so, a sitemap can improve the crawling of larger or more complex sites or more specialised files.”
Using a sitemap is your decision, as they don’t guarantee indexing. But sitemaps help search engines be more efficient with crawl budget. So, even if they aren’t mandatory, they are helpful.
In my SEO career, I have often found that XML sitemaps are the unsung heroes of website optimisation. They serve as a roadmap to guide search engines through your site, ensuring that crawlers don’t miss any of your important pages. While they may not directly affect your rankings, they facilitate efficient crawling and indexing, which is crucial for SEO performance.
If you’re running a large, complex, or frequently updated website, not having an XML sitemap is like setting off on a road trip without a map. You are likely to miss some turns!
Given how simple they are to create and maintain, especially with today’s CMS platforms and tools, there’s no good reason to go without one. So do yourself and your website a favour: create an XML sitemap, submit it to GSC, and keep it updated.
Want to make better use of your search console data? Find hidden optimisation opportunities? Or make SEO testing easier for your business? Give SEOTesting a try! We’re currently running a 14-day free trial, with no credit card required for sign-up. So sign up today and see how SEOTesting can improve your SEO.