Ignore URLs and query parameters
Default Behavior
By default, our service renders and caches every request forwarded to it. To tailor this behavior and exclude specific parts of your website from being prerendered, we offer several options:
- Ignoring URL parameters
- Ignoring URLs based on predefined rules
- Configuring your Integration to not forward requests to our prerendering service
Ignore Specific URL Parameters
URL parameters can influence the content displayed on a page. However, not all parameters affect the prerendered content. To ensure the prerendering process captures only essential content, you can configure your setup to ignore specific URL parameters. For example, tracking parameters like utm_source do not alter the content and can be safely excluded:
https://example.com/path?utm_source=newsletter&utm_medium=email
By default, Prerender ignores a small list of URL parameters, such as:
| Parameter | Commonly Used By | Recommended Behavior |
|---|---|---|
| utm_medium | Google Analytics | Ignore when caching |
| utm_source | Google Analytics | Ignore when caching |
| utm_campaign | Google Adwords | Ignore when caching |
| utm_content | Google Adwords | Ignore when caching |
| gclid | Google Adwords | Ignore when caching |
| fbclid | Facebook / Pixel | Ignore when caching |
| utm_term | Google Analytics | Ignore when caching |
Every account registered after January 2022 has a set of common URL parameters configured to be ignored, but accounts created before this date have to add these parameters manually.
This means that if our service receives a URL like https://example.com/path?utm_source=newsletter&utm_medium=email, Prerender cuts off the parameter and its value and serves the bot the page cached for https://example.com/path. This way Prerender only needs to cache one page instead of one URL for each utm_medium value.
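To illustrate the idea, here is a minimal sketch of the assumed behavior (not Prerender's actual implementation): stripping ignored parameters collapses all URL variants into a single cache entry.

```typescript
// Minimal sketch (assumed behavior, not Prerender's implementation):
// normalize a URL by dropping ignored tracking parameters so that all
// variants share one cache entry.
const IGNORED_PARAMS = new Set([
  "utm_source", "utm_medium", "utm_campaign",
  "utm_content", "utm_term", "gclid", "fbclid",
]);

function cacheKeyFor(rawUrl: string): string {
  const url = new URL(rawUrl);
  for (const name of [...url.searchParams.keys()]) {
    if (IGNORED_PARAMS.has(name)) {
      url.searchParams.delete(name);
    }
  }
  return url.toString();
}

// Both variants collapse to the same cache key:
cacheKeyFor("https://example.com/path?utm_source=newsletter&utm_medium=email"); // "https://example.com/path"
cacheKeyFor("https://example.com/path?utm_medium=social");                      // "https://example.com/path"
```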
You can add further parameters to this list from your dashboard, in the Cache Manager menu under Url Parameters. You can also choose to ignore all URL parameters (Ignore all query parameters) or to ignore all URL parameters with exceptions (Only cache specific query parameters).
How to Create a New Parameter Ignore Rule
Setting up a new Ignore rule will not automatically remove existing URLs that match it from your cache, so you need to manually remove them by clearing your cache via your Dashboard.
- Navigate to the Cache Manager menu and click on Url Parameters at the top.
- Click the Add Parameter button.
- Fill in the parameter's string (value).
- You can add wildcard patterns as the filter and check if the pattern matches the URL parameter you wish to ignore; the number of affected URLs will be shown under the box. For example, fb* will match fb_action_ids (a short sketch of this matching follows below).
- Verify the new rule in the list.
Configuration changes are not instantly applied. It may take up to 59 minutes before the newly added parameter rule is applied to your environment.
The URLs containing the ignored URL parameters will be removed from the cache automatically within 2 hours of adding the filter.
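As a rough illustration of the wildcard filter (an assumption about how the matching behaves, not the exact dashboard logic), a trailing * can be read as a prefix match on the parameter name:

```typescript
// Minimal sketch (illustrative assumption): a rule ending in "*" acts as a
// prefix match on the parameter name, so "fb*" covers "fb_action_ids",
// "fbclid", and any other parameter starting with "fb".
function matchesParamRule(rule: string, paramName: string): boolean {
  return rule.endsWith("*")
    ? paramName.startsWith(rule.slice(0, -1))
    : paramName === rule;
}

matchesParamRule("fb*", "fb_action_ids"); // true
matchesParamRule("fb*", "utm_source");    // false
```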
Ignore and Respond 404
In instances where it's beneficial to exclude certain URLs from search engine results (for example, when a search engine was inadvertently given a non-useful page), our URL rule matching system can issue an HTTP 404 response for requests to specific URLs, effectively removing them from search engine indexes.
Contain Match Example
This match type checks the entire URL, including the domain, path, and query parameters, for the specified value. The match also extends to parameter values.
Use the rule tester to confirm that your rule matches only the intended pages and doesn't accidentally match others.
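As a rough sketch of this behavior (an assumption for illustration, not the service's actual code), a contain rule can be thought of as a simple substring check against the full URL:

```typescript
// Minimal sketch (assumed semantics): a "contain" rule matches when the value
// appears anywhere in the full URL, including the query string.
function containMatch(url: string, value: string): boolean {
  return url.includes(value);
}

containMatch("https://example.com/search?filter=discontinued", "discontinued"); // true
containMatch("https://example.com/products/new", "discontinued");               // false
```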
Wildcard Match Example
Wildcard rules use * to create flexible pattern matches within URLs. Here are examples of wildcard rules and their intended effects:
| Pattern | Effect |
|---|---|
| `*xyzxyz*` | Excludes all URLs containing `xyzxyz`. |
| `http://*` | Excludes all URLs starting with `http://`, useful for omitting non-secure pages. |
| `*.aspx` | Excludes all URLs ending with `.aspx`. |
| `https://example.com/*` | Excludes all URLs beginning with `https://example.com/`, helpful for filtering specific domains. |
Note: Unless your pattern begins with *, it is matched from the start of the URL. For instance, example.com/* won't exclude https://example.com/something because the rule doesn't begin with * (the URL starts with https://, not example.com).
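The sketch below illustrates these assumed semantics (for illustration only: * matches any sequence of characters, and the pattern must otherwise cover the whole URL):

```typescript
// Minimal sketch (assumed semantics): "*" matches any sequence of characters,
// and the pattern must otherwise cover the whole URL.
function wildcardToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .split("*")
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp(`^${escaped}$`);
}

wildcardToRegExp("*xyzxyz*").test("https://example.com/a/xyzxyz?x=1");     // true
wildcardToRegExp("*.aspx").test("https://example.com/page.aspx");          // true
// Without a leading "*", the pattern must match from the very first character:
wildcardToRegExp("example.com/*").test("https://example.com/something");   // false
wildcardToRegExp("*example.com/*").test("https://example.com/something");  // true
```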
Regular Expressions
For scenarios where the above options do not suffice, we can implement custom rules based on regular expressions. Please contact our support team for assistance with such configurations.
Configure The Integration
The most effective way to control what content is prerendered is to configure your Integration so that it forwards only the relevant requests from search engines and social platforms to our service. Further details are available in our Integration documentation.
If you need to adjust your SEO setup without deploying code changes, the solutions described above serve as practical alternatives.
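As a minimal sketch of the idea (illustrative names only; this is not the official middleware, and the exact options depend on your Integration), the Integration can decide per request whether to forward it for prerendering at all:

```typescript
// Minimal sketch (illustrative, not the official middleware): only forward
// requests from known crawlers, and skip paths you never want prerendered,
// so those URLs never reach the prerendering service at all.
import type { IncomingMessage } from "node:http";

const BOT_USER_AGENTS = [/googlebot/i, /bingbot/i, /facebookexternalhit/i, /twitterbot/i];
const IGNORED_PATHS = [/^\/admin\//, /\.aspx$/]; // hypothetical examples

function shouldForwardToPrerender(req: IncomingMessage): boolean {
  const ua = req.headers["user-agent"] ?? "";
  const path = req.url ?? "/";
  const isBot = BOT_USER_AGENTS.some((re) => re.test(ua));
  const isIgnored = IGNORED_PATHS.some((re) => re.test(path));
  return isBot && !isIgnored;
}

// In your middleware chain: if shouldForwardToPrerender(req) is true, proxy the
// request to the prerendering service; otherwise serve your normal response.
```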
Robots.txt for Good Bots Only!
We advise configuring your robots.txt file to guide compliant search engine crawlers. While this does not guarantee avoidance by all robots, it ensures that reputable bots, like GoogleBot, adhere to the directives specified in robots.txt.
It's important to note that our system does not interact with robots.txt files. Nevertheless, proper configuration is recommended for optimal SEO outcomes. For comprehensive insights, refer to the Google Search Central guidelines.
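For illustration, a minimal robots.txt along these lines (the paths are hypothetical; adjust them to your site) tells compliant crawlers to skip sections you never want requested in the first place:

```
# Illustrative robots.txt (hypothetical paths): compliant crawlers such as
# Googlebot will skip the disallowed paths, so those URLs are never requested
# and therefore never forwarded for prerendering.
User-agent: *
Disallow: /internal-search/
Disallow: /checkout/
```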
Render Counter
URLs matching ignore rules are not counted against the render counter. This feature should ideally serve as a last resort for excluding specific URLs or patterns. Excessive usage may prompt our team to suggest excluding those URLs before they reach our service.