What is a sitemap?
Sitemaps are important for your website's technical SEO. Google will use your sitemap to discover all the web pages on your site and get the content indexed.
If your site is new then this is crucial! Without a sitemap, Google is unlikely to find a page until another site links to it.
To help Google and others find your web pages you can create a sitemap. This is an XML file that lists the URL structure of your blog or site.
We will look at what a sitemap is, why you need one, how to generate one and how to submit it to search engines.
There is a lot to cover so let's get started.
What is a sitemap file?
A sitemap is a file that lists all the pages on your website.
The sitemap is usually an XML file found at the root of your website. For example, here is the sitemap URL for PageDart:
https://pagedart.com/sitemap.xml
The sitemap should contain only the pages that you want to appear in the search engines, because Google, Bing and others use this file to find the pages on your site.
The XML has a very specific format that you must follow. It is hard to do this by hand so it is best to have the file generated automatically by the website software you are using.
This is what it looks like:
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml">
<url>
<loc>https://pagedart.com/blog/</loc>
<lastmod>2020-01-28T10:58:08-04:00</lastmod>
</url>
...
</urlset>
A bit scary to write this by hand!
If you are using WordPress then tools such as Yoast, Rank Math and SEOPress can generate the sitemap for you.
If you are using a static site generator like Hugo then it will generate a sitemap automatically for you.
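If your software does not generate a sitemap, a short script can. Here is a minimal sketch in Python using only the standard library. The function name build_sitemap and the /about/ page are assumptions made for illustration, and a real site would feed in its own list of pages.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML as a string for a list of (url, lastmod) pairs."""
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:  # lastmod is optional in the sitemap format
            ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


pages = [
    ("https://pagedart.com/blog/", "2020-01-28T10:58:08-04:00"),
    ("https://pagedart.com/about/", None),  # hypothetical page, no date known
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

Building the file with an XML library rather than string concatenation means special characters in URLs, such as &, are escaped for you.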
Why do you need a sitemap?
The idea behind the sitemap is to make it easier for web crawlers such as Googlebot to find all the pages on your site. If a site does not have a sitemap then Googlebot has to:
- visit each page and find links
- then go and visit the next page and find more links
- and repeat
Discovering every page this way can take a long time, and a sitemap can speed it up.
As the sitemap is a list of all the pages we can give this file to Googlebot so that it knows which pages to visit.
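To make the crawl loop concrete, here is a rough sketch in Python. It is not how Googlebot is implemented. The site structure in LINKS is invented for illustration, and the "crawler" only follows links in an in-memory dictionary instead of fetching real pages.

```python
from collections import deque

# Hypothetical site: each page maps to the pages it links to.
LINKS = {
    "/": ["/blog/", "/about/"],
    "/blog/": ["/blog/post-1/", "/"],
    "/about/": ["/"],
    "/blog/post-1/": ["/blog/"],
    "/old-landing-page/": ["/"],  # no other page links here
}


def discover(start, links):
    """Visit a page, collect its links, then repeat for each new page."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


found = discover("/", LINKS)
missed = sorted(set(LINKS) - found)
print(missed)  # pages the crawl never reached
```

In this made-up site the crawl never reaches /old-landing-page/ because nothing links to it. Listing that URL in a sitemap gives the crawler a way to find it.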
Here are 3 reasons why you need to have a sitemap:
1) Your site does not have good internal links
If your site does not have many internal links to other pages you have created, a sitemap can help. A page with no links pointing to it is an orphaned page.
Google is unlikely to find an orphaned page by crawling alone.
To fix this you can list all your pages in a sitemap.
2) Your site has a lot of pages
The sitemap can include the last-modified date of each page. When you have a site with many pages you can let Google know which pages have been recently updated. This will help Google decide which pages it needs to crawl next.
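As an illustration of how a crawler could use these dates, here is a small Python sketch that reads the lastmod values and keeps only pages changed since a given date. This is an assumption about one possible approach, not a description of what Google does. The /about/ and /contact/ entries and their dates are made up.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Sample sitemap; the second and third entries are hypothetical.
SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://pagedart.com/blog/</loc><lastmod>2020-01-28T10:58:08-04:00</lastmod></url>
  <url><loc>https://pagedart.com/about/</loc><lastmod>2019-11-02T09:00:00-04:00</lastmod></url>
  <url><loc>https://pagedart.com/contact/</loc><lastmod>2020-02-15T14:30:00-04:00</lastmod></url>
</urlset>"""


def recently_updated(sitemap_xml, since):
    """Return the URLs whose lastmod is on or after `since`."""
    root = ET.fromstring(sitemap_xml)
    result = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        lastmod = url.findtext("sm:lastmod", namespaces=NS)
        if lastmod and datetime.fromisoformat(lastmod) >= since:
            result.append(loc)
    return result


cutoff = datetime.fromisoformat("2020-01-01T00:00:00+00:00")
updated = recently_updated(SITEMAP, cutoff)
print(updated)
```

The cutoff carries a timezone offset so that it can be compared with the lastmod values, which also carry one.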
3) The site is new
If you have a new site you may not have any external links to the pages on your site. In this case, a sitemap can help speed up the discovery of the pages.
How do I find my sitemap?
If you are lucky you will find your sitemap at the root:
https://pagedart.com/sitemap.xml
Yet, the file does not have to be here and can be anywhere on your site. If yours is not at the root then how can you find the sitemap?
One way to find it is to use the robots.txt file. This is a file used by bots like Googlebot that crawl your website. It is always found at the root of the website. Here is the robots.txt for PageDart:
https://pagedart.com/robots.txt
If you look at this file you can see that there are a few lines. One of these lines is the sitemap link:
User-agent: *
Sitemap: https://pagedart.com/sitemap.xml
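If you want to pull the sitemap links out of a robots.txt file with a script, a few lines of Python are enough. This is a minimal sketch that works on text you have already downloaded. The function name sitemaps_from_robots is our own and not part of any library.

```python
def sitemaps_from_robots(robots_txt):
    """Return every URL declared on a 'Sitemap:' line of a robots.txt file."""
    urls = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


ROBOTS_TXT = """User-agent: *
Sitemap: https://pagedart.com/sitemap.xml
"""
sitemaps = sitemaps_from_robots(ROBOTS_TXT)
print(sitemaps)
```

The function returns a list because a robots.txt file can contain more than one Sitemap line.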
You may find that the robots.txt has more than one sitemap. For example, here is part of the robots.txt from Walmart:
#Sitemaps-https
Sitemap: https://www.walmart.com/sitemap_topic.xml
Sitemap: https://www.walmart.com/sitemap_browse.xml
Sitemap: https://www.walmart.com/sitemap_category.xml
Sitemap: https://www.walmart.com/sitemap_store_main.xml
Sitemap: https://www.walmart.com/sitemap_ip.xml
This is quite common when you have a large website because a single sitemap file cannot contain more than 50,000 URLs or be larger than 50 MB uncompressed.
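Here is a rough Python sketch of how a large list of URLs could be split to respect the 50,000 URL limit. The example.com product URLs are invented, and a real generator would also write each chunk to its own file.

```python
MAX_URLS_PER_SITEMAP = 50_000


def split_into_sitemaps(urls, limit=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into chunks that each fit in one sitemap file."""
    return [urls[i:i + limit] for i in range(0, len(urls), limit)]


# Hypothetical large site with 120,000 product pages.
urls = [f"https://example.com/product/{n}" for n in range(120_000)]
chunks = split_into_sitemaps(urls)
print([len(chunk) for chunk in chunks])  # [50000, 50000, 20000]
```

Each chunk would become one sitemap file, and each file gets its own Sitemap line in robots.txt.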
If your robots.txt file does not have any sitemaps listed then you can try some of these common locations:
- /sitemap.xml
- /sitemap_index.xml
- /sitemap-index.xml
- /sitemap/
- /post-sitemap.xml
- /sitemap/sitemap.xml
- /sitemap/index.xml
- /rss/
- /rss.xml
- /sitemapindex.xml
- /sitemap.xml.gz
- /sitemap_index.xml.gz
- /sitemap.php
- /sitemap.txt
- /atom.xml
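Checking these locations one by one is tedious, so here is a sketch of how you might automate it in Python. To keep the example self-contained it does not make real network requests. Instead, find_sitemap takes a fetch_status function that you supply, which is an assumption of this sketch. The fake_status function below stands in for a real HTTP check.

```python
from urllib.parse import urljoin

COMMON_PATHS = [
    "/sitemap.xml", "/sitemap_index.xml", "/sitemap-index.xml", "/sitemap/",
    "/post-sitemap.xml", "/sitemap/sitemap.xml", "/sitemap/index.xml",
    "/rss/", "/rss.xml", "/sitemapindex.xml", "/sitemap.xml.gz",
    "/sitemap_index.xml.gz", "/sitemap.php", "/sitemap.txt", "/atom.xml",
]


def candidate_urls(site):
    """Build the full URL for each common sitemap location."""
    return [urljoin(site, path) for path in COMMON_PATHS]


def find_sitemap(site, fetch_status):
    """Return the first candidate URL that answers with HTTP 200, else None."""
    for url in candidate_urls(site):
        if fetch_status(url) == 200:
            return url
    return None


def fake_status(url):
    # Stand-in for a real HTTP request: pretend only one location exists.
    return 200 if url == "https://example.com/sitemap_index.xml" else 404


print(find_sitemap("https://example.com", fake_status))
```

A 200 response only tells you that something lives at that URL, so open the file and check that it really is a sitemap.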
Which pages should be in your XML sitemap?
The sitemap should only contain the pages that you want to appear in a search engine. This is because search engines such as Google and Bing read the sitemap you submit as a list of the pages you want indexed.
Let's look at how we can submit your sitemap to Google and Bing.
How do I submit my sitemap to Google?
To submit your sitemap to Google you need to add its URL in Google Search Console.
Once you have verified that you are the owner of the site you can submit your sitemap. To do this select the “Sitemaps” section from the menu.
There you will find a section to add a new sitemap.
Enter the URL of the sitemap into the box and Google will go and read the file.
If everything has worked you will see a success message.
Remember to add all your sitemaps if you have more than one.
Let's do the same for Bing.
How do I submit my sitemap to Bing?
To submit your sitemap to Bing you need to use the Bing Webmaster Tools.
Once you have set up Google Search Console it is easy to add Bing. Bing has a tool that can read some of the data in Google Search Console and automatically set up the site in Bing.
This is the fastest way to set it up, and once you do, the sitemap will be imported automatically.
If you have trouble with this import then you can follow our guide on verifying your domain with Bing.
Sitemaps and Languages
You can also use a sitemap to list all the alternate language content on your site. For example, if you have English, French and German content your sitemap can contain a link to each.
To do this we use an attribute called hreflang. It looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:xhtml="http://www.w3.org/1999/xhtml">
<url>
<loc>https://www.example.com/en</loc>
<xhtml:link
rel="alternate"
hreflang="de"
href="https://www.example.com/de"/>
<xhtml:link
rel="alternate"
hreflang="fr"
href="https://www.example.com/fr"/>
<xhtml:link
rel="alternate"
hreflang="en"
href="https://www.example.com/en"/>
</url>
<url>
<loc>https://www.example.com/de</loc>
<xhtml:link
rel="alternate"
hreflang="de"
href="https://www.example.com/de"/>
<xhtml:link
rel="alternate"
hreflang="fr"
href="https://www.example.com/fr"/>
<xhtml:link
rel="alternate"
hreflang="en"
href="https://www.example.com/en"/>
</url>
<url>
<loc>https://www.example.com/fr</loc>
<xhtml:link
rel="alternate"
hreflang="de"
href="https://www.example.com/de"/>
<xhtml:link
rel="alternate"
hreflang="fr"
href="https://www.example.com/fr"/>
<xhtml:link
rel="alternate"
hreflang="en"
href="https://www.example.com/en"/>
</url>
</urlset>
Again you don't want to do this by hand and should use your website software to update it. Hugo has multilingual support by default, so there is nothing special to configure. For WordPress, you will need to install the WPML plugin. Then Yoast or Rank Math will be able to output the hreflang in your sitemap.
It is easier to manage your language versions in a single file and you can use the sitemap to do it.
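If your software cannot output hreflang, the repetitive structure is easy to generate with a script. Here is a minimal Python sketch that assumes you already have a mapping from language code to URL for one piece of content. The function name build_hreflang_sitemap is our own.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Make the serializer emit xmlns="..." and xmlns:xhtml="..." on the root.
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("xhtml", XHTML_NS)


def build_hreflang_sitemap(versions):
    """Build a sitemap where every language version lists all alternates."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url in versions.values():
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        # Each entry lists every version, including itself.
        for lang, href in versions.items():
            ET.SubElement(
                url,
                f"{{{XHTML_NS}}}link",
                {"rel": "alternate", "hreflang": lang, "href": href},
            )
    return ET.tostring(urlset, encoding="unicode")


versions = {
    "en": "https://www.example.com/en",
    "de": "https://www.example.com/de",
    "fr": "https://www.example.com/fr",
}
hreflang_xml = build_hreflang_sitemap(versions)
print(hreflang_xml)
```

Because every URL comes from the same dictionary, the loc and href values cannot drift apart the way hand-edited entries can.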
Wrapping Up, What is a sitemap?
We have covered what a sitemap is. It is a file that lists all the pages that you want the search engines to show in the search results.
You need a sitemap when you have:
- A new site with few external links
- A site with poor internal links and orphaned pages
- A large site with many pages
This covers most websites! So make sure that you set up a sitemap using an automated process. For WordPress look at the Yoast, SEOPress or Rank Math plugins.
If you are using a static site generator then it may generate one for you, as Hugo does.
Your site may already have a sitemap, in which case you need to find it. Follow the guide to locate your sitemap and, once you do, make sure to submit it to both Google and Bing.