NGINX: Redirect only search bots to a given file

I am trying to port my old .htaccess (used with Apache) to nginx:

<IfModule rewrite_module>
    RewriteEngine on
    RewriteCond "%{HTTP_USER_AGENT}" "(Googlebot|bingbot|slackbot|vkShare|W3C_Validator)" [NC]
    RewriteRule .* bot.php [L]
    RewriteBase /
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule .* index.html [L]
</IfModule>

Here is what I am currently trying:

I created a map that serves as a list of search engines:

map $http_user_agent $search_engines {
    default 0;
    "~bingbot.*" 1;
    "~BingPreview.*" 1;
    "~Googlebot.*" 1;
}
if ($search_engines = 1){
   rewrite ^/(.*) bot.php?$1 break;
} 

But this results in an infinite loop.

Here is the complete server block:

server {
    server_name mypage.de www.mypage.de;
    listen 1.1.1.1;
    root /home/mypage/public_html;
    index index.html index.htm index.php;
    access_log /var/log/virtualmin/mypage.de_access_log;
    error_log /var/log/virtualmin/mypage.de_error_log;
        
    fastcgi_param GATEWAY_INTERFACE CGI/1.1;
    fastcgi_param SERVER_SOFTWARE nginx;
    fastcgi_param QUERY_STRING $query_string;
    fastcgi_param REQUEST_METHOD $request_method;
    fastcgi_param CONTENT_TYPE $content_type;
    fastcgi_param CONTENT_LENGTH $content_length;
    fastcgi_param SCRIPT_FILENAME /home/mypage/public_html$fastcgi_script_name;
    fastcgi_param SCRIPT_NAME $fastcgi_script_name;
    fastcgi_param REQUEST_URI $request_uri;
    fastcgi_param DOCUMENT_URI $document_uri;
    fastcgi_param DOCUMENT_ROOT /home/mypage/public_html;
    fastcgi_param SERVER_PROTOCOL $server_protocol;
    fastcgi_param REMOTE_ADDR $remote_addr;
    fastcgi_param REMOTE_PORT $remote_port;
    fastcgi_param SERVER_ADDR $server_addr;
    fastcgi_param SERVER_PORT $server_port;
    fastcgi_param SERVER_NAME $server_name;
    fastcgi_param PATH_INFO $fastcgi_path_info;
    fastcgi_param HTTPS $https;
    location ~ \.php(/|$) {
        try_files $uri =404;
        fastcgi_pass unix:/var/php-nginx/123123123123123123.sock/socket;
    }
    fastcgi_split_path_info ^(.+\.php)(/.+)$;
    listen 1.1.1.1:443 ssl;
    ssl_certificate /home/mypage/ssl.combined;
    ssl_certificate_key /home/mypage/ssl.key;
    
    if ($blocked_bots = 1) {
        return 444; # Connection closed without response
    }
    if ($search_engines = 1){
        rewrite ^/(.*) /bot.php?$1 break;
    } 
    
    if ($scheme = http) {
        rewrite ^/(?!.well-known)(.*) https://mypage/$1 break;
    }
    
    location / {
        try_files $uri /index.html;
        auth_basic "Administrator’s Area";
        auth_basic_user_file /home/mypage/.htpasswd;
    }
    
    # Cache-Control
    include /etc/nginx/conf.d/manuallyInclude/cache-policy.conf;
}

Second question: I also have another map variable for social_network bots. Do I really need a separate if clause for each map, like this:

    if ($search_engines = 1){
        rewrite ^/(.*) /bot.php?$1 break;
    } 
    if ($social_networks = 1){
        rewrite ^/(.*) /bot.php?$1 break;
    } 

Or is there a simpler way to combine them into a single rewrite rule?

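Answer 1

To answer your second question: use an empty string instead of "0" in your map blocks (the default map value is exactly an empty string, so you can omit the default line entirely):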

map $http_user_agent $search_engines {
    "~bingbot" 1;
    "~BingPreview" 1;
    "~Googlebot" 1;
}
map $http_user_agent $social_networks {
    "~*facebook" 1;
    "~*twitter" 1;
}

Then use variable concatenation to make the final decision:

map $search_engines$social_networks $is_bot {
    ""      "";
    default 1;
}

server {
    ...
    if ($is_bot) {
        rewrite ^/(.*) /bot.php?$1 break;
    }
    ...
}
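If the two categories never need to be told apart, a single case-insensitive map may be enough. The sketch below is an alternative that does not come from the answer above. It assumes the pattern list (the names from the original .htaccess rule plus those used in the maps) is the full set of bots you want to match, and it reuses the `$is_bot` name so the `if` block in the server stays unchanged.

```nginx
# Sketch only: place in the http context, replacing the three maps above
# rather than alongside them, since it defines the same $is_bot variable.
# "~*" makes the regex case-insensitive, similar to [NC] in the Apache rule.
# The pattern list is an assumption; adjust it to the bots you care about.
map $http_user_agent $is_bot {
    default "";
    "~*(googlebot|bingbot|bingpreview|slackbot|vkshare|w3c_validator|facebook|twitter)" 1;
}
```

Keeping separate maps, as in the answer, is still the better fit if search engines and social networks might need different handling later.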
