Strange Apache mod_rewrite behavior

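I'm new to this site and hope you can help me with a problem I ran into while installing RhodeCode. The (long) story:
I successfully installed RhodeCode in a virtualenv on a Linux machine. With the development server (paster serve production.ini) it runs fine. However, I want to use Apache as a front end for SSL, with mod_rewrite redirecting http requests to https. Here is my configuration: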

default-vhost.conf

<VirtualHost _default_:80>
  ServerName hg.mydomain.com
  ServerAdmin [email protected]
  ServerAlias rhodecode.mydomain.com

  DocumentRoot "/srv/www/htdocs"
  RewriteEngine On
  RewriteCond %{HTTPS} off
  RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
  HostnameLookups Off
  UseCanonicalName Off
  ServerSignature Off
  ....
</VirtualHost>

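I use mod_rewrite rather than Redirect because I want the site to be reachable under two domain names. We have a site at hg.mydomain.com that we plan to replace with the new rhodecode.mydomain.com, so I want to preserve the host name in the rewrite rule. Consider this directive: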

Redirect permanent / https://hg.mydomain.com/

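With it, the site works fine and there are no redirect problems. But when I browse to http://rhodecode.mydomain.com, I am redirected to the other site, which I cannot allow until the site at hg.mydomain.com is retired and hg.mydomain.com points to the same IP as rhodecode.mydomain.com.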

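The problem
RhodeCode sometimes includes a URL in its responses to actions that require authentication. For example, if you are a guest and try to access a private repository, you are redirected to the login screen with a URL like this: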

https://rhodecode.mydomain.com/_admin/login?came_from=%2F

where %2F is an encoded "/".
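To make the encoding concrete, here is a minimal sketch using Python's standard urllib.parse. It is not part of the original setup, and the variable names are only illustrative. It also shows the double-encoded form %252F that appears in the access log further down.

```python
from urllib.parse import quote, unquote

# Percent-encode "/" with no characters treated as safe.
once = quote("/", safe="")    # "/" becomes "%2F"
# Encoding the already-encoded value again escapes the "%" itself.
twice = quote(once, safe="")  # "%2F" becomes "%252F"

print(once, twice)
# Decoding once yields "%2F"; decoding twice yields "/".
print(unquote(twice), unquote(unquote(twice)))
```

One decoding pass over %252F therefore leaves a literal %2F in the path, which is consistent with the /_admin/%2F request seen later.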
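After logging in, I am redirected to https://rhodecode.mydomain.com/error/HTTP_NOT_FOUND.html.var, which shows Apache's default 404 error page. After that, browsing to https://rhodecode.mydomain.com/ shows that I am logged in to the site. Why am I redirected to that strange HTTP_NOT_FOUND.html.var document? Here is the rest of my configuration and the relevant parts of the logs: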

default-vhost-ssl.conf

<VirtualHost _default_:443>

  ServerName hg.mydomain.com
  ServerAdmin [email protected]
  ServerAlias rhodecode.mydomain.com

  DocumentRoot "/srv/www/htdocs"
  HostnameLookups Off
  UseCanonicalName Off
  ServerSignature Off

  SSLEngine on

  certificate stuff ...

  WSGIDaemonProcess hg.mydomain.com user=rhodecode group=users threads=5 \
  home=/home/rhodecode/rhodecode-env python-path=/home/rhodecode/rhodecode-env/lib/python2.7/site-packages
  WSGIScriptAlias / /home/rhodecode/rhodecode-env/dispatch.wsgi
  WSGIPassAuthorization On

  <Directory /home/rhodecode/rhodecode-env>
    WSGIProcessGroup hg.mydomain.com
    WSGIApplicationGroup %{GLOBAL}
    Order deny,allow
    Allow from all
  </Directory>

</VirtualHost>

Rewrite log

172.17.1.49 - - [04/Mar/2014:00:06:24 +0000] [rhodecode.mydomain.com/sid#7f6a03266f00][rid#7f69fd68d7a0/initial/redir#1] (2) init rewrite engine with requested uri /error/HTTP_NOT_FOUND.html.var
172.17.1.49 - - [04/Mar/2014:00:06:24 +0000] [rhodecode.mydomain.com/sid#7f6a03266f00][rid#7f69fd68d7a0/initial/redir#1] (3) applying pattern '(.*)' to uri '/error/HTTP_NOT_FOUND.html.var'
172.17.1.49 - - [04/Mar/2014:00:06:24 +0000] [rhodecode.mydomain.com/sid#7f6a03266f00][rid#7f69fd68d7a0/initial/redir#1] (4) RewriteCond: input='off' pattern='off' => matched
172.17.1.49 - - [04/Mar/2014:00:06:24 +0000] [rhodecode.mydomain.com/sid#7f6a03266f00][rid#7f69fd68d7a0/initial/redir#1] (2) rewrite '/error/HTTP_NOT_FOUND.html.var' -> 'https://rhodecode.mydomain.com/error/HTTP_NOT_FOUND.html.var'
172.17.1.49 - - [04/Mar/2014:00:06:24 +0000] [rhodecode.mydomain.com/sid#7f6a03266f00][rid#7f69fd68d7a0/initial/redir#1] (2) explicitly forcing redirect with https://rhodecode.mydomain.com/error/HTTP_NOT_FOUND.html.var
172.17.1.49 - - [04/Mar/2014:00:06:24 +0000] [rhodecode.mydomain.com/sid#7f6a03266f00][rid#7f69fd68d7a0/initial/redir#1] (1) escaping https://rhodecode.mydomain.com/error/HTTP_NOT_FOUND.html.var for redirect
172.17.1.49 - - [04/Mar/2014:00:06:24 +0000] [rhodecode.mydomain.com/sid#7f6a03266f00][rid#7f69fd68d7a0/initial/redir#1] (1) redirect to https://rhodecode.mydomain.com/error/HTTP_NOT_FOUND.html.var [REDIRECT/301]

Note the rid#7f69fd68d7a0/initial/redir#1 part. When I send a URL without %2F, that part does not appear in the log.

Access log

hg.mydomain.com:443 172.17.1.49 - - [04/Mar/2014:02:09:13 +0000] "POST /_admin/login?came_from=%252F HTTP/1.1" 302 186 "https://rhodecode.mydomain.com/_admin/login?came_from=%252F" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.93 Safari/537.36"
hg.mydomain.com:80 172.17.1.49 - - [04/Mar/2014:02:09:14 +0000] "GET /_admin/%2F HTTP/1.1" 301 268 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.93 Safari/537.36"
hg.mydomain.com:443 172.17.1.49 - - [04/Mar/2014:02:09:14 +0000] "GET /error/HTTP_NOT_FOUND.html.var HTTP/1.1" 200 1132 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.93 Safari/537.36"
hg.mydomain.com:443 172.17.1.49 - - [04/Mar/2014:02:09:14 +0000] "GET /favicon.ico HTTP/1.1" 404 618 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.93 Safari/537.36"

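The first line is the POST request with the authentication data. It succeeds, so it redirects me to the admin page. In the second line you can see that the redirect location sent me back to http (the request arrived on port 80) at /_admin/%2F, which was then redirected to https but magically turned into /error/HTTP_NOT_FOUND.html.var.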

HTTP request (some headers omitted for brevity)

POST /_admin/login?came_from=%252F HTTP/1.1
Host: rhodecode.mydomain.com
Cache-Control: no-cache
Pragma: no-cache
Origin: https://rhodecode.mydomain.com
Content-Type: application/x-www-form-urlencoded
Referer: https://rhodecode.mydomain.com/_admin/login?came_from=%252F
Cookie: rhodecode=3af58050ce87a93caa5a4c6809c5dacef4afb29d8e74b152c97f469199c554b6f67f7aa7
...

HTTP response

HTTP/1.1 302 Found
Date: Tue, 04 Mar 2014 02:24:11 GMT
Server: Apache/2.2.22 (Linux/SUSE)
Pragma: no-cache
Cache-Control: no-cache
Set-Cookie: rhodecode=f0a94a155738490da032b46354f4d72338902da2d69bc1177bcf4086aa8158f4719526e0; httponly; Path=/
Location: http://rhodecode.mydomain.com/_admin/%2F
...

HTTP request 2

GET /_admin/%2F HTTP/1.1
Host: rhodecode.mydomain.com
Cache-Control: no-cache
Pragma: no-cache
Cookie: rhodecode=f0a94a155738490da032b46354f4d72338902da2d69bc1177bcf4086aa8158f4719526e0
...

HTTP response 2

HTTP/1.1 301 Moved Permanently
Date: Tue, 04 Mar 2014 02:24:12 GMT
Server: Apache/2.2.22 (Linux/SUSE)
Location: https://rhodecode.mydomain.com/error/HTTP_NOT_FOUND.html.var
...

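Finally, fetching /error/HTTP_NOT_FOUND.html.var does not give me a 404 error but a 200 OK response!
I thought the browser might be doing something odd behind the scenes, so I sent a raw HTTP request and got the same result: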

[Rober@yue ~]$ nc rhodecode.mydomain.com 80
GET /%2F HTTP/1.1
Host: rhodecode.mydomain.com

HTTP/1.1 301 Moved Permanently
Date: Tue, 04 Mar 2014 00:07:26 GMT
Server: Apache/2.2.22 (Linux/SUSE)
Location: https://rhodecode.mydomain.com/error/HTTP_NOT_FOUND.html.var
Content-Length: 268
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>301 Moved Permanently</title>
</head><body>
<h1>Moved Permanently</h1>
<p>The document has moved <a href="https://rhodecode.mydomain.com/error/HTTP_NOT_FOUND.html.var">here</a>.</p>
</body></html>

Look at the Location header in the response. Why does Apache do this instead of simply changing the request to https?

Sorry for the long question, but I wanted to provide as much information as possible so that you can help me debug this :)

Thanks in advance, everyone!

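Edit
As suggested, I wanted to find out whether RhodeCode was the one sending the redirect. So I replaced the wsgi script with the one shown in the documentation: http://modwsgi.readthedocs.org/en/latest/configuration-guides/running-a-basic-application.html#wsgi-application-script-file. It always returns "Hello World" and cannot issue a redirect. I sent a raw request to the site and got the same result, so Apache must be changing the URL somehow. Here is the result: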

GET /%2F HTTP/1.1
Host: rhodecode.mydomain.com

HTTP/1.1 301 Moved Permanently
Date: Thu, 06 Mar 2014 04:23:31 GMT
Server: Apache/2.2.22 (Linux/SUSE)
Location: https://rhodecode.mydomain.com/error/HTTP_NOT_FOUND.html.var
Content-Length: 268
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>301 Moved Permanently</title>
</head><body>
<h1>Moved Permanently</h1>
<p>The document has moved <a href="https://rhodecode.mydomain.com/error/HTTP_NOT_FOUND.html.var">here</a>.</p>
</body></html>

But when I send a normal URL (without %2F), it redirects fine:

GET /abffr HTTP/1.1
Host: rhodecode.mydomain.com

HTTP/1.1 301 Moved Permanently
Date: Thu, 06 Mar 2014 04:25:19 GMT
Server: Apache/2.2.22 (Linux/SUSE)
Location: https://rhodecode.mydomain.com/abffr
Content-Length: 244
Content-Type: text/html; charset=iso-8859-1

Apache doesn't seem to like the %2F thing...
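The comparison made by hand above can be sketched as a small check. This is an illustrative helper, not part of Apache or RhodeCode: the function name redirect_preserves_path is hypothetical, and it only compares the scheme and path of a Location header with the requested path, using the values from the two raw responses shown above.

```python
from urllib.parse import urlsplit

def redirect_preserves_path(request_path, location):
    """Return True if Location uses https and keeps the requested path unchanged."""
    parts = urlsplit(location)  # urlsplit does not percent-decode the path
    return parts.scheme == "https" and parts.path == request_path

# Values copied from the raw responses above.
ok = redirect_preserves_path("/abffr", "https://rhodecode.mydomain.com/abffr")
broken = redirect_preserves_path(
    "/%2F", "https://rhodecode.mydomain.com/error/HTTP_NOT_FOUND.html.var"
)
print(ok, broken)  # True False
```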

Answer 1

It turns out there were two causes for this error. The first is the default value of AllowEncodedSlashes. From the documentation:

With the default value, Off, URLs which contain encoded path separators (%2F for / and additionally %5C for \ on according systems) are refused with a 404 (Not found) error.

I found this by turning the Apache log up to maximum verbosity, which revealed the following:

[Mon Mar 10 04:53:43 2014] [info] [client 172.17.1.49] found %2f (encoded '/') in URI (decoded='//'), returning 404

So all my requests containing %2F were being rejected. In addition, my server has the following configuration by default:

<IfModule mod_negotiation.c>
<IfModule mod_include.c>
    <Directory "/usr/share/apache2/error">
        AllowOverride None
        Options IncludesNoExec
        AddOutputFilter Includes html
        AddHandler type-map var
        Order allow,deny
        Allow from all
        LanguagePriority en cs de es fr it ja ko nl pl pt-br ro sv tr
        ForceLanguagePriority Prefer Fallback
    </Directory>

    ErrorDocument 400 /error/HTTP_BAD_REQUEST.html.var
    ErrorDocument 401 /error/HTTP_UNAUTHORIZED.html.var
    ErrorDocument 403 /error/HTTP_FORBIDDEN.html.var
    ErrorDocument 404 /error/HTTP_NOT_FOUND.html.var
    ErrorDocument 405 /error/HTTP_METHOD_NOT_ALLOWED.html.var
    ErrorDocument 408 /error/HTTP_REQUEST_TIME_OUT.html.var
    ErrorDocument 410 /error/HTTP_GONE.html.var
    ErrorDocument 411 /error/HTTP_LENGTH_REQUIRED.html.var
    ErrorDocument 412 /error/HTTP_PRECONDITION_FAILED.html.var
    ErrorDocument 413 /error/HTTP_REQUEST_ENTITY_TOO_LARGE.html.var
    ErrorDocument 414 /error/HTTP_REQUEST_URI_TOO_LARGE.html.var
    ErrorDocument 415 /error/HTTP_UNSUPPORTED_MEDIA_TYPE.html.var
    ErrorDocument 500 /error/HTTP_INTERNAL_SERVER_ERROR.html.var
    ErrorDocument 501 /error/HTTP_NOT_IMPLEMENTED.html.var
    ErrorDocument 502 /error/HTTP_BAD_GATEWAY.html.var
    ErrorDocument 503 /error/HTTP_SERVICE_UNAVAILABLE.html.var
    ErrorDocument 506 /error/HTTP_VARIANT_ALSO_VARIES.html.var
</IfModule>
</IfModule>

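So every time a 404 error was to be sent to the client, the Includes filter combined with the directive ErrorDocument 404 /error/HTTP_NOT_FOUND.html.var caused an internal redirect to /error/HTTP_NOT_FOUND.html.var. The rewrite rule then matched that URI and sent the browser a redirect to https://rhodecode.mydomain.com/error/HTTP_NOT_FOUND.html.var.
You can see this behavior in my rewrite log, where the URI to be rewritten is initialized with /error/HTTP_NOT_FOUND.html.var: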

172.17.1.49 - - [10/Mar/2014:04:04:23 +0000] [rhodecode.mydomain.com/sid#7f98fedd7f00][rid#7f98f91fe7a0/initial/redir#1] (2) init rewrite engine with requested uri /error/HTTP_NOT_FOUND.html.var

Also note redir#1, which indicates that an internal redirect took place. So I changed my configuration to

RewriteEngine On
RewriteCond %{HTTPS} off
RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
AllowEncodedSlashes NoDecode

and changed the <IfModule mod_include.c> directive to

<IfModule !mod_include.c>

As a result, none of the ErrorDocument directives are applied and everything works fine!
This time the rewrite log shows %2F in the URL, and no redir occurs:

172.17.1.49 - - [10/Mar/2014:04:55:40 +0000] [rhodecode.mydomain.com/sid#7f9fd1f83f00][rid#7f9fcc3a90a0/initial] (2) init rewrite engine with requested uri /%2F
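As a rough mental model of the first cause, the sketch below mimics how the three AllowEncodedSlashes settings treat a request path. It is a simplification for illustration only, not Apache's actual implementation: the function name handle_path is hypothetical, the status 200 merely stands for "accepted", and %5C and all other URL processing are ignored.

```python
import re

def handle_path(path, allow_encoded_slashes="Off"):
    """Toy model: return (status, path_as_seen_by_the_server) for a raw request path."""
    has_encoded_slash = re.search(r"%2f", path, re.IGNORECASE) is not None
    if has_encoded_slash and allow_encoded_slashes == "Off":
        # Default: refused with 404, which in turn triggers ErrorDocument 404.
        return 404, None
    if allow_encoded_slashes == "On":
        # Accepted, and %2F is decoded like any other escape.
        return 200, re.sub(r"%2f", "/", path, flags=re.IGNORECASE)
    # NoDecode (or no encoded slash at all): accepted, path left as-is.
    return 200, path

print(handle_path("/%2F"))              # (404, None)
print(handle_path("/%2F", "NoDecode"))  # (200, '/%2F')
print(handle_path("/%2F", "On"))        # (200, '//')
```

With NoDecode the path keeps its %2F, which matches the last rewrite log line above, while the "//" produced by On matches the decoded value reported in the earlier info log message.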

Answer 2

RhodeCode also has a special flag that tells it you want to force SSL, which means all redirects will always go to https.

In your .ini file, change the flag:

force_ssl = true
