How to Use Prerender's Sitemap Crawler
  • 30 Nov 2023


Context

A sitemap tells search engines which pages of your site you consider essential. It also provides additional helpful information, such as when each page was last updated.
In other words, a sitemap helps search bots discover new URLs. It is especially useful if you have a large site with many pages, or if your site is new and has only a few external links.

By uploading your sitemap to your Prerender account, you can also get more out of Prerender and improve your SEO even further.

What does the Sitemap crawler do?

Prerender's Sitemap crawler can discover sitemaps automatically. If that does not happen, you can add your sitemaps to this tool manually. Prerender then adds the URLs from the sitemap to the Manual rendering queue and caches the pages, so that bots get a faster response when they request those URLs.

The Sitemap crawler can process multiple sitemaps. It checks each sitemap every 7 days, every day, or every hour, depending on your Prerender plan. If new URLs have been added to a sitemap, Prerender adds them to the rendering queue and caches them.

How to use it?

To use it, open the Sitemaps menu on your Prerender dashboard and click the Import Sitemap button at the top right. Paste the sitemap URL into the text box that appears. When you initiate the first crawl, you can choose whether to cache the pages from the sitemap for desktop, mobile, or both, and you can also select the recrawl interval for the sitemap.
The same settings apply to subsequent crawls of that sitemap.

You can change the crawling settings on a per-sitemap basis. To do that, click the eye icon at the right end of the row, then click the Settings button in the top right corner.
There you can change the recrawl interval for the sitemap and the rendering devices.

Sitemaps can also be recrawled manually, either from the Dashboard or through the /sitemap API endpoint.
The latter is useful if you add it to your workflow, so that whenever the sitemap is updated, your workflow triggers a Prerender recrawl automatically.
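
As a rough illustration of the workflow idea, the sketch below builds a recrawl request that a deploy script could send after regenerating the sitemap. The article only names the /sitemap endpoint, so the host (api.prerender.io), the JSON field names (prerenderToken, url), and the helper names are assumptions made for this example. Please confirm them against the Prerender API documentation before relying on them.

```python
import json
import urllib.request

# Assumption: the full endpoint URL and the JSON field names below are
# illustrative; only the "/sitemap" path is mentioned in this article.
PRERENDER_SITEMAP_ENDPOINT = "https://api.prerender.io/sitemap"


def build_recrawl_request(token: str, sitemap_url: str) -> urllib.request.Request:
    """Build (but do not send) a POST request asking for a sitemap recrawl."""
    payload = json.dumps({"prerenderToken": token, "url": sitemap_url}).encode("utf-8")
    return urllib.request.Request(
        PRERENDER_SITEMAP_ENDPOINT,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def trigger_recrawl(token: str, sitemap_url: str, timeout: float = 30.0) -> int:
    """Send the request and return the HTTP status code of the response."""
    request = build_recrawl_request(token, sitemap_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A build pipeline could call trigger_recrawl("YOUR_TOKEN", "https://example.com/sitemap.xml") as its last step, right after the new sitemap is published.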

Process

  1. Prerender.io downloads the sitemap from the server using the Prerender user agent.
  2. It adds the <url>...</url> URLs to the cache (URLs that are already in the cache are skipped).
  3. It downloads and processes any additional <sitemap>...</sitemap> sitemaps.

Supported format

Prerender.io supports the standard sitemap XML format described at sitemaps.org.
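
To make the two kinds of entries concrete, here is a minimal sketch that separates page URLs (url entries) from nested sitemaps (sitemap entries) in a document that follows the sitemaps.org format. It is not Prerender's implementation. The function name and the sample documents are made up for illustration, and only the XML namespace comes from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sample documents, used only for illustration.
SAMPLE_URLSET = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2023-11-30</lastmod></url>
  <url><loc>https://example.com/pricing</loc></url>
</urlset>"""

SAMPLE_INDEX = """<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://example.com/sitemap-blog.xml</loc></sitemap>
</sitemapindex>"""


def parse_sitemap(xml_text: str) -> tuple[list[str], list[str]]:
    """Return (page URLs, nested sitemap URLs) found in a sitemap document."""
    root = ET.fromstring(xml_text)
    page_urls = [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]
    nested_sitemaps = [
        loc.text.strip()
        for loc in root.findall("sm:sitemap/sm:loc", SITEMAP_NS)
        if loc.text
    ]
    return page_urls, nested_sitemaps
```

Running parse_sitemap on your own sitemap before importing it is a quick way to confirm that it parses and contains the URLs you expect.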


Provided information

You can always check whether your sitemaps are healthy and available to the Prerender sitemap crawler. If a sitemap is available and can be crawled, you will see a ✅ icon in the Health column. When you first add a sitemap to the Sitemaps menu, you will see a ❔ icon in the Health column. It might take a couple of minutes for the icon to change.



The Sitemap table shows whether a sitemap is unavailable or cannot be crawled. At the moment, it does not provide details or error messages about what the issue might be.


Possible reasons why the sitemap might be flagged as Unhealthy:

  • The sitemap URL is incorrect
  • Our crawler is blocked by a firewall
  • The sitemap does not follow the format described at https://sitemaps.org/protocol.html
  • One or more of the URLs in the sitemap return a non-200 HTTP status, for example 404 when one of the URLs does not exist
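
Because the dashboard does not report which of these reasons applies, it can help to check the listed URLs yourself. The sketch below is one possible pre-flight check, not part of Prerender. The function names are hypothetical, and it assumes your server answers HEAD requests the same way it answers GET requests.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status for a URL using a HEAD request.

    Redirects are followed by urllib, so the final status is reported.
    Network-level failures (DNS errors, refused connections) are not
    caught here and propagate as urllib.error.URLError.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code


def find_non_200(
    urls: Iterable[str],
    get_status: Callable[[str], int] = fetch_status,
) -> dict[str, int]:
    """Map each URL that does not return 200 to the status it returned."""
    statuses = {url: get_status(url) for url in urls}
    return {url: status for url, status in statuses.items() if status != 200}
```

Passing a custom get_status callable lets you try the logic without network access, or swap in a client that sends a specific user agent if your firewall treats crawlers differently.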

You can also check when a sitemap was last crawled (Last visited at column) and when it will be crawled again (Next visit column).

Click the 👁️ eye icon to see the results of each sitemap crawl.


URLs that were already detected on the sitemap in a previous crawl are not recached when the sitemap is recrawled; they are skipped.
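
The skipping behavior can be pictured as a simple difference between the URLs seen in earlier crawls and the URLs in the current sitemap. The following sketch only illustrates that idea. The function name is hypothetical, and how Prerender tracks previously detected URLs internally is not described in this article.

```python
def urls_to_queue(previously_seen: set[str], current_sitemap_urls: list[str]) -> list[str]:
    """Return the URLs from the current crawl that were not seen before.

    Order follows the sitemap, and duplicates within the sitemap are
    queued only once.
    """
    seen = set(previously_seen)  # copy so the caller's set is not modified
    queued = []
    for url in current_sitemap_urls:
        if url not in seen:
            seen.add(url)
            queued.append(url)
    return queued
```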


 

