google_robots_allow

The google_robots_allow directive controls whether web crawlers are allowed to access the Google mirror site.

Syntax: google_robots_allow on | off;
Default: off
Context: location
Arguments: 1

Description

The google_robots_allow directive controls whether web crawlers, including Google's own, may access the resources served by a Google mirror built with ngx_http_google_filter_module. It takes a single boolean argument: on allows crawler access and off disallows it. Because the default is off, crawlers are not allowed unless the directive is explicitly enabled. This is useful for webmasters who want to decide whether the mirrored content should appear in search indexes.

When set to 'on', crawlers are allowed to crawl and index the pages of the mirror site. When set to 'off', crawlers are disallowed, which keeps the mirrored content out of search indexes. The setting is applied during request processing and is configured per location, so it gives direct control over how the mirror site is exposed to search engines.

Config Example

location / {
    google on;
    google_robots_allow on;
}
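
For contrast, the following is a minimal sketch of a mirror that keeps crawlers out by stating the default explicitly. The listen port, the server_name, and the resolver address are placeholder assumptions for illustration, not values taken from this page; adjust them to your environment.

```nginx
server {
    # Placeholder values for illustration only.
    listen 80;
    server_name mirror.example.com;

    # Assumed DNS resolver so nginx can look up upstream hosts at runtime.
    resolver 8.8.8.8;

    location / {
        google on;
        # Explicit form of the default: crawlers stay disallowed.
        google_robots_allow off;
    }
}
```

Writing the off value out is optional, since it matches the default, but it makes the intent visible to anyone reading the configuration later.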

Because the default is 'off', leaving this directive unset keeps crawlers disallowed. Setting it to 'on' opens the mirror to indexing by search engines, which may be undesired and can affect SEO.

Leaving this directive set to 'off' when you intend to allow crawling will keep the mirror site out of search results.
