We are getting some mirrorlist requests with escape characters in them, such as \xe2. While I've taken steps to deal with these in the mirrorlist code, at least one client makes such a request hourly, and these requests are causing the mirrorlist WSGI process to spin. I can't recreate the failure, even using the same request URI, and the fixes I've tried haven't caught them all.
I'd like to block such requests at the proxy, to prevent them from making it all the way to MM. It's a hack, but I'm at a loss for another solution right now.
diff --git a/modules/mirrormanager/templates/mirrormanager-mirrorlist.conf.erb b/modules/mirrormanager/templates/mirrormanager-mirrorlist.conf.erb
index e52c926..95792fe 100644
--- a/modules/mirrormanager/templates/mirrormanager-mirrorlist.conf.erb
+++ b/modules/mirrormanager/templates/mirrormanager-mirrorlist.conf.erb
@@ -17,6 +17,10 @@ RewriteEngine On
 RewriteCond %{QUERY_STRING} repo=epel-5&arch=$basea$
 RewriteRule ^/mirrorlist - [F]
 # END hack
+# BEGIN hack for escaped chars
+RewriteCond %{QUERY_STRING} \x
+RewriteRule ^/(mirrorlist|metalink) - [F]
+# END hack
 RewriteRule ^/publiclist(.*) <%= proxyurl %>/publiclist/$1 [P,L]
 RewriteRule ^/mirrorlist(.*) <%= proxyurl %>/mirrorlist$1 [P,L]
 RewriteRule ^/metalink(.*) <%= proxyurl %>/metalink$1 [P,L]
On Wed, May 12, 2010 at 5:41 PM, Matt Domsch Matt_Domsch@dell.com wrote:
Hey, is there a way to test this on staging first to make sure it grabs the URLs? RewriteRules make my head hurt and I get things backwards all the time.
In staging, I can test whether or not \x gets caught. I can't test whether it'll catch the ones that are causing the problems from outside users though. (If I could, I would have been able to find and fix the root cause, which clearly this isn't).
I'm open to other ideas. Perhaps what I see in the URL logs isn't what's actually being sent? I don't know...
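One way to reason about what the proposed rule would and would not catch is to model it outside Apache. The following is a minimal Python 3 sketch, not the proxy's actual behaviour: `would_block` and both patterns are hypothetical names, and it assumes the intent is to match a literal backslash followed by `x` (written `\\x` in a regex) in the query string. If the client is really sending raw bytes, and the `\x..` form only appears when the request is displayed, a check like this would never fire.

```python
import re

# Assumption: the rule is meant to match a literal backslash followed by "x".
ESCAPED_BYTE = re.compile(r"\\x")
# Paths covered by the proposed RewriteRule.
BLOCKED_PATH = re.compile(r"^/(mirrorlist|metalink)")


def would_block(path, query_string):
    """Return True if this model of the rule would answer 403 Forbidden."""
    return bool(BLOCKED_PATH.search(path)) and bool(ESCAPED_BYTE.search(query_string))


if __name__ == "__main__":
    # Query string containing the literal characters backslash, x, c, 2, ...
    print(would_block("/mirrorlist", r"repo=\xc2\xadfedora-8&arch=i386"))
    # Query string containing the actual character U+00AD instead.
    print(would_block("/mirrorlist", "repo=\u00adfedora-8&arch=i386"))
    # Path that the rule does not cover.
    print(would_block("/publiclist", r"repo=\xe2"))
```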
On Wed, May 12, 2010 at 08:56:12PM -0500, Matt Domsch wrote:
Here's the symptom. admin.fp.o/status/app01
16-0 30536 0/1113/1113 W 18.42 11020 0 0.0 10.07 10.07 192.168.1.7 app01.phx2.fedoraproject.org GET /mirrorlist?repo=\xc2\xadfedora-8&arch=i386 HTTP/1.0
63-0 30713 0/2151/2151 W 38.47 3819 0 0.0 21.13 21.13 192.168.1.7 app01.phx2.fedoraproject.org GET /mirrorlist?repo=\xc2\xadfedora-8&arch=i386 HTTP/1.0
In both cases, the jobs are in the W state from Apache's point of view (that is, sending a reply to the client). Note how long they've been sitting there (11020 and 3819 seconds respectively) and the format of the query args. When I send the same thing via wget, it succeeds, with no failure.
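For what it's worth, the byte pair `\xc2\xad` is the UTF-8 encoding of U+00AD (SOFT HYPHEN), so the `repo` value may carry an invisible character in front of `fedora-8`. The Python 3 sketch below assumes the status page is showing raw request bytes in escaped form, which nothing in this thread confirms; `raw_query` is a hypothetical reconstruction of what such a client might send.

```python
import unicodedata
from urllib.parse import parse_qs, quote

# Hypothetical raw query string: the bytes 0xC2 0xAD, not the text "\xc2\xad".
raw_query = b"repo=\xc2\xadfedora-8&arch=i386"

# Decode the bytes as UTF-8 and split the query string into parameters.
params = parse_qs(raw_query.decode("utf-8"))
repo = params["repo"][0]

# Unicode name of the first code point in the repo value.
first_char_name = unicodedata.name(repo[0])
# Percent-encoded form of the same value, as a client could put it in a URL.
percent_encoded = quote(repo.encode("utf-8"))

if __name__ == "__main__":
    print(first_char_name)
    print(percent_encoded)
    print(repo == "fedora-8")
```

Sending the percent-encoded form (for example with wget) might come closer to the failing request than typing the backslash sequence literally, though that is untested here.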
On Wed, May 12, 2010 at 08:59:18PM -0500, Matt Domsch wrote:
OK, having spent a few more hours, I don't need the RewriteRule. Yea!
I just need to convert to unicode, and leave it in unicode. Malformed requests fail lookups, as would be expected, and return the "you asked for an invalid repo or arch" message. Good requests return the mirrorlist. All is going swimmingly.
commit 9251a20e8ff8239bae2c74e64ad20a4645423ae9
Author: Matt Domsch Matt_Domsch@dell.com
Date: Wed May 12 22:59:46 2010 -0500

    mirrorlist_client: leave query params as utf8

diff --git a/mirrorlist-server/mirrorlist_client.wsgi b/mirrorlist-server/mirrorlist_client.wsgi
index 3508f19..36077a1 100755
--- a/mirrorlist-server/mirrorlist_client.wsgi
+++ b/mirrorlist-server/mirrorlist_client.wsgi
@@ -93,7 +93,7 @@ def request_setup(request):
     for k, v in d.iteritems():
         try:
-            d[k] = unicode(v, 'utf8', 'ignore').encode('utf8')
+            d[k] = unicode(v, 'utf8', 'replace')
         except:
             pass
     return d
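To illustrate the difference between the two error handlers in that patch, here is a small Python 3 sketch (the original code is Python 2, where `unicode(v, 'utf8', ...)` corresponds to `v.decode('utf8', ...)`). `KNOWN_REPOS`, `old_clean`, `new_clean` and `lookup` are hypothetical stand-ins for the real mirrorlist code, and the sample value is made up. It does not explain why the WSGI process was spinning; it only shows what each handler does with malformed input.

```python
# Hypothetical stand-in for the mirrorlist's table of valid repo names.
KNOWN_REPOS = {"fedora-8", "epel-5"}


def old_clean(value):
    # Old behaviour: drop undecodable bytes, then go back to a byte string.
    return value.decode("utf8", "ignore").encode("utf8")


def new_clean(value):
    # New behaviour: substitute U+FFFD for undecodable bytes and stay as text.
    return value.decode("utf8", "replace")


def lookup(repo):
    # Accept either bytes or text so both cleaners can be compared.
    if isinstance(repo, bytes):
        repo = repo.decode("utf8")
    return repo in KNOWN_REPOS


if __name__ == "__main__":
    # 0xE2 starts a three-byte UTF-8 sequence that is never completed here.
    malformed = b"\xe2fedora-8"
    print(repr(old_clean(malformed)), lookup(old_clean(malformed)))
    print(repr(new_clean(malformed)), lookup(new_clean(malformed)))
```

With `'ignore'`, the stray byte vanishes and the malformed value silently turns into a valid repo name; with `'replace'`, it stays visibly wrong and fails the lookup, which matches the "invalid repo or arch" outcome described above.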
On May 13, 2010, at 12:05 AM, Matt Domsch wrote:
+1
On Wed, May 12, 2010 at 11:05:33PM -0500, Matt Domsch wrote:
+1
-Toshio
On 2010-05-12 11:05:33 PM, Matt Domsch wrote:
+1
Thanks for looking at this, Ricky
infrastructure@lists.fedoraproject.org