Ignore URLs and query parameters
  • 23 Sep 2024


Default Behavior

By default, our service renders and caches every request forwarded to it. To change this behavior and exclude specific parts of your website from being prerendered, we offer several options:

Ignore Specific URL Parameters

URL parameters can influence the content displayed on a page. However, not all parameters affect the prerendered content. To ensure the prerendering process captures only essential content, you can configure your setup to ignore specific URL parameters. For example, tracking parameters like utm_source do not alter the content and can be safely excluded:

https://example.com/path?utm_source=newsletter&utm_medium=email

By default Prerender will ignore a small list of URL parameters such as:

Parameter       Commonly Used By     Recommended Behavior
utm_medium      Google Analytics     Ignore when caching
utm_source      Google Analytics     Ignore when caching
utm_campaign    Google Adwords       Ignore when caching
utm_content     Google Adwords       Ignore when caching
gclid           Google Adwords       Ignore when caching
fbclid          Facebook / Pixel     Ignore when caching
utm_term        Google Analytics     Ignore when caching

Members with older accounts!

Every account registered after January 2022 has a set of common URL parameters configured to be ignored but accounts created before this date will have to add these parameters manually.

This means that if our service receives a URL like https://example.com/path?utm_source=newsletter&utm_medium=email, Prerender strips the ignored parameters and their values and serves the bot the page at https://example.com/path.
This way Prerender only needs to cache one page instead of a separate URL for every utm_source and utm_medium value.

You can add further parameters to this list in your dashboard, from the Cache Manager menu under Url Parameters. You can also choose to ignore all URL parameters (Ignore all query parameters) or to ignore all URL parameters with exceptions (Only cache specific query parameters).
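As a rough illustration of the three modes described above, the following Python sketch normalizes a URL before it would be used as a cache key. It is not Prerender's implementation: the function name `cache_url` and its arguments are hypothetical, and only the parameter names come from the default list above.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Parameter names taken from the default ignore list in this article.
DEFAULT_IGNORED = {
    "utm_medium", "utm_source", "utm_campaign", "utm_content",
    "gclid", "fbclid", "utm_term",
}


def cache_url(url, ignored=DEFAULT_IGNORED, ignore_all=False, only_keep=None):
    """Return the URL a cache could key on (hypothetical helper).

    ignore_all=True  mimics "Ignore all query parameters".
    only_keep={...}  mimics "Only cache specific query parameters".
    Otherwise, parameters named in `ignored` are dropped.
    """
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    if ignore_all:
        kept = []
    elif only_keep is not None:
        kept = [(name, value) for name, value in pairs if name in only_keep]
    else:
        kept = [(name, value) for name, value in pairs if name not in ignored]
    # An empty query string is dropped entirely by urlunsplit.
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(cache_url("https://example.com/path?utm_source=newsletter&utm_medium=email"))
print(cache_url("https://example.com/path?page=2&utm_source=newsletter"))
```

With the default list, both tracking parameters in the first URL are dropped, while `page=2` in the second URL is kept because it is not on the list.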


How to Create New Parameter Ignore Rule

Before we begin

Setting up a new Ignore rule will not automatically remove existing URLs that match it from your cache, so you need to manually remove them by clearing your cache via your Dashboard.

  1. Navigate to the Cache Manager menu and click on Url Parameters at the top.

  2. Click the Add Parameter button.

  3. Fill in the parameter's string (value)

Wildcard filter

You can use wildcard patterns as the filter, and you can check whether the pattern matches the URL parameter you wish to ignore. The number of affected URLs is shown under the box.

e.g.: fb* will match fb_action_ids

  4. Verify that the new parameter appears in the list.

Please be aware

Configuration changes are not instantly applied. It may take up to 59 minutes before the newly added parameter rule is applied to your environment.

URLs will be removed

URLs containing the ignored URL parameters are removed from the cache automatically 2 hours after you add the filter.
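To get a feel for how a wildcard filter such as fb* selects parameter names, and how the number of affected URLs could be estimated, here is a small Python sketch. It assumes that * stands for any run of characters and that the pattern must cover the whole parameter name; the dashboard's exact matching rules may differ, and the helper names are hypothetical.

```python
import re
from urllib.parse import urlsplit, parse_qsl


def wildcard_to_regex(pattern):
    # Escape every literal piece; each * becomes "any run of characters".
    return re.compile(".*".join(re.escape(piece) for piece in pattern.split("*")))


def affected_urls(pattern, urls):
    """Return the URLs with at least one parameter name matching the pattern."""
    regex = wildcard_to_regex(pattern)
    hits = []
    for url in urls:
        query = urlsplit(url).query
        names = [name for name, _ in parse_qsl(query, keep_blank_values=True)]
        if any(regex.fullmatch(name) for name in names):
            hits.append(url)
    return hits


sample = [
    "https://example.com/a?fb_action_ids=123",
    "https://example.com/b?page=2",
    "https://example.com/c?fbclid=abc&page=3",
]
print(len(affected_urls("fb*", sample)))
```

Running a pattern against a sample of your own URLs in this way is a quick check that it does not catch parameters you still want cached separately.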


Ignore and Respond 404

If you need to keep certain URLs out of search engine results, for example because a search engine was inadvertently served a page that isn't useful, our URL rule matching system can return an HTTP 404 response for requests to specific URLs, effectively removing them from search engine indexes.


Contain Match Example

This match type checks the entire URL, including the domain, path, and query parameters, for the specified value.

Contain Match in URL

It also extends to parameter values:

Matching in Parameter Values

Use the rule tester to confirm that your rule matches only the pages you intend, so that no other pages are matched by accident.

Rule Tester Utility
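The contain match can be pictured as a plain substring test over the full URL string. The Python sketch below rests on that assumption (including case-sensitive comparison, which this article does not confirm); `status_for` is a hypothetical helper, not part of any Prerender API. It also shows why testing a rule matters: a short value can match pages you did not have in mind.

```python
def contains_match(url, value):
    # The whole URL string is searched: domain, path, and query parameters.
    return value in url


def status_for(url, rule_values):
    """Return 404 if any rule value occurs in the URL, otherwise None."""
    if any(contains_match(url, value) for value in rule_values):
        return 404
    return None


rules = ["search"]
print(status_for("https://example.com/search?q=shoes", rules))  # intended match
print(status_for("https://example.com/research/paper", rules))  # unintended match
print(status_for("https://example.com/blog", rules))            # no rule applies
```

Here the value search also catches /research/paper, so a more specific value such as /search? would be a safer rule.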

Wildcard Match Example

Wildcards utilize * to create flexible pattern matches within URLs.

Wildcard Match Configuration

Here are examples of wildcard rules and their intended effects:

Pattern                  Effect
*xyzxyz*                 Excludes all URLs containing xyzxyz.
http://*                 Excludes all URLs starting with http://, useful for omitting non-secure pages.
*.aspx                   Excludes all URLs ending with .aspx.
https://example.com/*    Excludes all URLs beginning with https://example.com/, helpful for filtering specific domains.

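Note: A pattern is compared against the whole URL, starting from its first character. Begin your pattern with a * if it should match anywhere in the URL rather than only at the start. For instance, example.com/* won't exclude https://example.com/something, because the URL starts with https:// and the rule doesn't begin with *.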
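The wildcard rules above can be approximated in Python by translating each * into "any run of characters" and requiring the pattern to cover the entire URL. This is a sketch under that assumption (plus case-sensitive matching), not the service's actual matcher, and `wildcard_match` is a hypothetical name.

```python
import re


def wildcard_match(pattern, url):
    """Return True if the wildcard pattern covers the entire URL."""
    # Escape literal pieces so characters such as . and ? are matched as-is.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.fullmatch(regex, url) is not None


checks = [
    ("*xyzxyz*", "https://example.com/a?x=xyzxyz"),
    ("http://*", "https://example.com/"),
    ("*.aspx", "https://example.com/page.aspx"),
    ("example.com/*", "https://example.com/something"),
    ("*example.com/*", "https://example.com/something"),
]
for pattern, url in checks:
    print(pattern, url, wildcard_match(pattern, url))
```

The last two checks illustrate the note above: without a leading *, the pattern is anchored to the first character of the URL and does not match.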

Regular Expressions

For scenarios where the above options do not suffice, we can implement custom rules based on regular expressions. Please contact our support team for assistance with such configurations.

Configure The Integration

The most effective way to control what is prerendered is to configure your Integration so that it forwards only the relevant requests from search engines or social platforms to our service. Further details are available in our Integration documentation.
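What such a filter might look like depends on your integration. As a generic illustration only, the Python sketch below decides whether a request should be forwarded for prerendering. The bot markers, the skipped path prefixes, and the function `should_forward` are all hypothetical placeholders and do not reflect any official integration.

```python
# Hypothetical values: adjust both tuples to your own site and integration.
BOT_MARKERS = ("googlebot", "bingbot", "facebookexternalhit", "twitterbot")
SKIPPED_PREFIXES = ("/admin", "/cart", "/api/")


def should_forward(user_agent, path):
    """Forward only crawler requests for paths that are worth prerendering."""
    agent = user_agent.lower()
    is_bot = any(marker in agent for marker in BOT_MARKERS)
    is_skipped = path.startswith(SKIPPED_PREFIXES)
    return is_bot and not is_skipped


googlebot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
print(should_forward(googlebot, "/blog/post"))
print(should_forward(googlebot, "/admin/users"))
```

Filtering at this stage means excluded requests never reach the service, so no ignore rule is needed for them.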

For those who want to adjust their SEO setup without deploying code, the options described above are practical alternatives.

Robots.txt for Good Bots Only!

We advise configuring your robots.txt file to guide compliant search engine crawlers. This does not guarantee that every robot stays away, but reputable bots such as Googlebot follow the directives specified in robots.txt.

It's important to note that our system does not interact with robots.txt files. Nevertheless, proper configuration is recommended for optimal SEO outcomes. For comprehensive insights, refer to the Google Search Central guidelines.
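To preview how a compliant crawler would interpret a set of directives, you can parse them with Python's standard `urllib.robotparser` module. The robots.txt content and the paths below are made-up examples for illustration; real crawlers may apply additional rules that this parser does not cover.

```python
from urllib.robotparser import RobotFileParser

# Example directives only; replace them with your own robots.txt content.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Disallow: /cart/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(user_agent, url):
    """Return True if the parsed rules let this user agent fetch the URL."""
    return parser.can_fetch(user_agent, url)


print(allowed("Googlebot", "https://example.com/blog/post"))
print(allowed("Googlebot", "https://example.com/cart/checkout"))
```

Because our system does not read robots.txt, a check like this only describes crawler behavior, not what gets prerendered or cached.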

Render Counter

URLs matching ignore rules are not counted against the render counter. Even so, this feature should ideally serve as a last resort for excluding specific URLs or patterns. If it is used excessively, our team may suggest that you exclude certain URLs before they reach our service.

