5 messages in ru.sysoev.nginx: Re: Re: weird anchor bug
Subject: Re: Re: weird anchor bug
From: Igor Sysoev (is-G@public.gmane.org)
Date: Oct 28, 2006 3:14:34 am
List: ru.sysoev.nginx
Attachments:

On Fri, 27 Oct 2006, Jonathan Dance wrote:

Because this bug could affect a large number of backends (cgi/fastcgi/proxy), nginx should remove the anchor part of the URL before passing it on to any other service.

That sounds like something someone should do in a config file, to match certain versions of IE.

# is invalid; it should be encoded and sent as %23. But I can see the possibility of breaking more browsers by stripping it than by leaving it in and just having a user-defined browser match regex for IE.

Yes, it is invalid, and since it is invalid, the theoretically correct action would be a 400 Bad Request, but since this is a production IE bug, that would definitely break things. If the # were re-interpreted as %23, it would cause a 404 error. It's also noteworthy that you get a 404 if you are requesting a local file with nginx (I assume it looks for the file "#bar" instead of the index for a request to "/#bar"). The only real solution is to strip off that portion of the URL.
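
To make the two options above concrete, here is a minimal C sketch, not nginx code. Both helper names (strip_fragment, encode_hash) are hypothetical and exist only for this illustration. The first drops everything from the first '#' onward; the second rewrites '#' as %23, which is the variant expected to end in a 404.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy `target` into `out` without the "#fragment"
 * part. Returns the number of bytes written, not counting the NUL. */
static size_t strip_fragment(const char *target, char *out, size_t outsz)
{
    size_t n = strcspn(target, "#");   /* length up to the first '#' or the end */

    if (outsz == 0) {
        return 0;
    }
    if (n >= outsz) {
        n = outsz - 1;                 /* truncate rather than overflow */
    }
    memcpy(out, target, n);
    out[n] = '\0';
    return n;
}

/* Hypothetical helper: copy `target` into `out`, rewriting each '#' as
 * "%23". Stops early if `out` is too small. Returns bytes written. */
static size_t encode_hash(const char *target, char *out, size_t outsz)
{
    size_t n = 0;

    if (outsz == 0) {
        return 0;
    }
    for (; *target != '\0'; target++) {
        if (*target == '#') {
            if (n + 3 >= outsz) {
                break;
            }
            memcpy(out + n, "%23", 3);
            n += 3;
        } else {
            if (n + 1 >= outsz) {
                break;
            }
            out[n++] = *target;
        }
    }
    out[n] = '\0';
    return n;
}
```

For the request "/#bar", strip_fragment yields "/" (so the index would be served), while encode_hash yields "/%23bar" (a lookup for a file literally named "#bar").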

FWIW, Apache 2.2 seems to strip off the anchor to handle this case. That certainly doesn't make me right; nginx isn't Apache. (My test was simply testing a production site I know runs Apache 2.2 + Mongrel + Rails, and I didn't get an error.)

You're right, it can be a config file thing - I will add stripping of anchors regardless of browser (because it should never be sent) - but unless *everyone* has it in their config file I think it's still a problem - people will blame nginx when they run across this.

Thank you. The attached patch deletes "#fragment" from all nginx internals except $request_line and $request_uri (the unparsed URIs). It seems that Apache has done the same thing since 1.3b2. Despite the name (patch-0.4.11.1.txt), the patch can be applied to the most recent nginx versions.

Igor Sysoev http://sysoev.ru/en/

Index: src/http/ngx_http_parse.c
===================================================================
--- src/http/ngx_http_parse.c (revision 139)
+++ src/http/ngx_http_parse.c (working copy)
@@ -282,6 +282,10 @@
                 r->args_start = p + 1;
                 state = sw_uri;
                 break;
+            case '#':
+                r->complex_uri = 1;
+                state = sw_uri;
+                break;
             case '+':
                 r->plus_in_uri = 1;
                 break;
@@ -341,6 +345,10 @@
                 r->args_start = p + 1;
                 state = sw_uri;
                 break;
+            case '#':
+                r->complex_uri = 1;
+                state = sw_uri;
+                break;
             case '+':
                 r->plus_in_uri = 1;
                 break;
@@ -366,6 +374,9 @@
                 r->uri_end = p;
                 r->http_minor = 9;
                 goto done;
+            case '#':
+                r->complex_uri = 1;
+                break;
             case '\0':
                 r->zero_in_uri = 1;
                 break;
@@ -822,6 +833,8 @@
                 break;
             case '?':
                 r->args_start = p;
+                goto args;
+            case '#':
                 goto done;
             case '.':
                 r->uri_ext = u + 1;
@@ -853,6 +866,8 @@
                 break;
             case '?':
                 r->args_start = p;
+                goto args;
+            case '#':
                 goto done;
             case '+':
                 r->plus_in_uri = 1;
@@ -883,6 +898,8 @@
                 break;
             case '?':
                 r->args_start = p;
+                goto args;
+            case '#':
                 goto done;
             case '+':
                 r->plus_in_uri = 1;
@@ -915,6 +932,8 @@
                 break;
             case '?':
                 r->args_start = p;
+                goto args;
+            case '#':
                 goto done;
 #if (NGX_WIN32)
             case '.':
@@ -958,6 +977,8 @@
                 break;
             case '?':
                 r->args_start = p;
+                goto args;
+            case '#':
                 goto done;
             case '+':
                 r->plus_in_uri = 1;
@@ -1001,7 +1022,11 @@
                     break;
                 }
 
-                if (ch == '\0') {
+                if (ch == '#') {
+                    *u++ = ch;
+                    ch = *p++;
+
+                } else if (ch == '\0') {
                     r->zero_in_uri = 1;
                 }
 
@@ -1041,6 +1066,31 @@
     r->uri_ext = NULL;
 
     return NGX_OK;
+
+args:
+
+    while (p < r->uri_end) {
+        if (*p++ != '#') {
+            continue;
+        }
+
+        r->args.len = p - 1 - r->args_start;
+        r->args.data = r->args_start;
+        r->args_start = NULL;
+
+        break;
+    }
+
+    r->uri.len = u - r->uri.data;
+
+    if (r->uri_ext) {
+        r->exten.len = u - r->uri_ext;
+        r->exten.data = r->uri_ext;
+    }
+
+    r->uri_ext = NULL;
+
+    return NGX_OK;
 }
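
As a rough standalone illustration of what the patch aims for (the parsed URI and arguments end at '#', while the unparsed request target keeps it), here is a minimal C sketch. It is not the nginx implementation: the split_target struct and split_request_target function are hypothetical names, and the sketch skips the percent-decoding and path normalization that the real parser performs.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the pieces. All pointers refer into the
 * caller's string; nothing is copied. */
typedef struct {
    const char *raw;      /* unparsed target, "#fragment" still included */
    size_t      raw_len;
    const char *path;     /* everything before the first '?' or '#' */
    size_t      path_len;
    const char *args;     /* text between '?' and '#', or NULL if no '?' */
    size_t      args_len;
} split_target;

/* Split a request target into path and args, ignoring any "#fragment". */
static split_target split_request_target(const char *target)
{
    split_target  s;
    size_t        end = strcspn(target, "#");        /* where the fragment starts */
    const char   *q = memchr(target, '?', end);      /* a '?' counts only before '#' */

    s.raw = target;
    s.raw_len = strlen(target);
    s.path = target;

    if (q != NULL) {
        s.path_len = (size_t) (q - target);
        s.args = q + 1;
        s.args_len = end - s.path_len - 1;           /* bytes between '?' and '#' */

    } else {
        s.path_len = end;
        s.args = NULL;
        s.args_len = 0;
    }

    return s;
}
```

For "/search?q=1#top" this gives the path "/search" and the args "q=1", while raw still holds the full string, which roughly mirrors the split between the parsed internals and $request_uri described in the message.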