Production resultsdb is still really slow and we're still seeing the occasional error on result posting so I'd like to bump the resources allocated to the wsgi app again.
+1s?
Tim
From 3d45155959cbdcde39c1a98a584a100b590761db Mon Sep 17 00:00:00 2001
From: Tim Flink <tflink@fedoraproject.org>
Date: Fri, 17 Mar 2017 15:33:49 +0000
Subject: [PATCH] bumping resultsdb wsgi resources again

---
 roles/taskotron/resultsdb-backend/templates/resultsdb.conf.j2 | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/roles/taskotron/resultsdb-backend/templates/resultsdb.conf.j2 b/roles/taskotron/resultsdb-ba
index 97e73b9..c3f4d5c 100644
--- a/roles/taskotron/resultsdb-backend/templates/resultsdb.conf.j2
+++ b/roles/taskotron/resultsdb-backend/templates/resultsdb.conf.j2
@@ -1,5 +1,5 @@
 {% if deployment_type in ['stg', 'prod'] %}
-WSGIDaemonProcess resultsdb user=apache group=apache threads=100 processes=5
+WSGIDaemonProcess resultsdb user=apache group=apache threads=200 processes=20
 {% else %}
 WSGIDaemonProcess resultsdb user=apache group=apache threads=5
 {% endif %}
On Fri, Mar 17, 2017 at 09:41:02AM -0600, Tim Flink wrote:
Production resultsdb is still really slow and we're still seeing the occasional error on result posting so I'd like to bump the resources allocated to the wsgi app again.
I wonder whether this won't saturate the server at some point.
I'm +1 to apply because it's easy to revert, but I'm not sure it's the right solution.
Pierre
For information, we have had to revert this change. The values that this patch and the frontend patch set for processes and threads are unusable: together they would spawn 20 processes * 200 threads (api) + 10 processes * 100 threads (frontend) = 5000 threads. That is more than mod_wsgi is willing or able to manage under the resource limits imposed by the system (for example, a hard limit of 4096 open files). As a result, many of the threads could not be spawned, and the resulting errors filled the logs.
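To make the arithmetic above easy to re-check before a future bump, here is a minimal Python sketch. The process and thread counts and the 4096 figure come from this thread; the function and variable names are made up for illustration. Comparing the thread total directly against the open-files limit is only a rough indicator, not a statement of how mod_wsgi actually accounts for resources.

```python
def total_daemon_threads(groups):
    """Sum processes * threads over (processes, threads) pairs."""
    return sum(processes * threads for processes, threads in groups)


# (processes, threads) per daemon group, as set by the reverted patches.
api = (20, 200)
frontend = (10, 100)

# Hard limit on open files quoted in this thread; an assumption for any
# other host, where the real value should be checked.
HARD_NOFILE_LIMIT = 4096

total = total_daemon_threads([api, frontend])
exceeds_limit = total > HARD_NOFILE_LIMIT

print(f"total threads: {total}, over limit: {exceeds_limit}")
```

On a Linux host, resource.getrlimit(resource.RLIMIT_NOFILE) returns the (soft, hard) pair for the calling process, which may differ from the limits the Apache service actually runs under.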
I think that instead of bumping these, we need to dig deeper into what is causing resultsdb's problems when it has to respond to many requests at the same time.
The section on threads= in the mod_wsgi documentation (https://modwsgi.readthedocs.io/en/develop/configuration-directives/WSGIDaemo...) says:
Do not get carried away and set this to a very large number in the belief that it will somehow magically enable you to handle many more concurrent users. Any sort of increased value would only be appropriate where your code is I/O bound. If your code is CPU bound, you are better off using at most 3 to 5 threads per process and using more processes.
infrastructure@lists.fedoraproject.org