This is mostly for the TurboGears guys in the group. We've discussed this a little in the past, but I wanted to get something down for sure before all of our stuff goes live. What's going to be our official deployment method? Personally I'd vote mod_python, though I haven't actually done this yet. Does using mod_python still require a ProxyPass to a TG port? I'd tend towards mod_python just because it would behave like the rest of our apps do, though I know Toshio has some neat script that makes the proxy setup behave that way too. What do you guys think? mod_python may be too complex for what we're trying to accomplish.
-Mike
On Sat, 2007-03-03 at 11:43 -0600, Mike McGrath wrote:
Just some random thoughts as I'm running out the door:
I have never gotten mod_python to work with a CherryPy/TG-based application by following the documentation on either project's wiki. That said, I haven't tried with TG since 0.8, so perhaps the process (or just the documentation) is better now.
I like the way I'm doing it because I've gotten it to a state where it pretty much just works, but if we can do that with mod_python as well, that would be fine. My method is basically Apache ProxyPassing to the TurboGears application server. If the TG server isn't running, the ProxyPass error handler loads a small CGI script that starts the TurboGears app and, once the app responds, sends the user there.
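A minimal sketch of that setup (the hostname, port, path, and script name below are hypothetical, not Toshio's actual config):

```apache
# Proxy everything under /myapp to the local TurboGears server
ProxyPass        /myapp http://localhost:8080/myapp
ProxyPassReverse /myapp http://localhost:8080/myapp

# If the backend is down, mod_proxy returns a 503; hand that to a
# small CGI script that starts the TG app and redirects the user back
ErrorDocument 503 /cgi-bin/start-tg.cgi
```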
TurboGears is trying to become more WSGI-compliant. TG 1.1 is supposed to use CherryPy 3.0, which has a built-in mod_python->WSGI gateway. That should make mod_python deployment simpler than with CherryPy 2.2.
So I think -- if we can make mod_python easy to deploy, that seems the way to go. If we can't then I've got something that will work until TG-1.1 and a more integrated WSGI implementation.
Tangentially: we probably want to preserve the ability to run our apps on several different servers. Because Python libraries don't do versioning (well -- we may be able to do it with setuptools and eggs, but that's a long-term, distro-wide change), we can end up in situations where some web apps depend on TG 1.0 and others on TG 1.1 (or sqlobject or python-urlgrabber or...). Being able to proxy to a Xen host during transition periods, rather than having to upgrade all our web apps at once, is probably a good thing.
-Toshio
On Sat, 2007-03-03 at 11:20 -0800, Toshio Kuratomi wrote:
How performant is the TG server? In the past, the Python web server was not exactly a barn burner when it came to performance. It worked, but it didn't hold up well under heavy load. Having Apache in front helps, but just like with Zope, if the app is slow, the app is slow.
Any load testing done yet?
-sv
On Sat, 2007-03-03 at 17:31 -0500, seth vidal wrote:
> How performant is the TG server? In the past, the Python web server was not exactly a barn burner when it came to performance. It worked, but it didn't hold up well under heavy load. Having Apache in front helps, but just like with Zope, if the app is slow, the app is slow.
> Any load testing done yet?
There are some benchmarks for CherryPy 2.0 (TG 1.0 uses CherryPy 2.2; TG 1.1 will use CherryPy 3.0):
http://www.cherrypy.org/wiki/CherryPySpeed
http://docs.cherrypy.org/recommended-setup-for-production-websites gives some stats for page requests from behind Apache vs. having the CherryPy server directly exposed.
http://www.cherrypy.org/wiki/WhatsNewIn30 says that CherryPy 3 is "as much as three times faster in benchmarks" than CherryPy 2, but I haven't seen the benchmarks.
-Toshio
On Sat, Mar 03, 2007 at 05:31:07PM -0500, seth vidal wrote:
> How performant is the TG server? In the past, the Python web server was not exactly a barn burner when it came to performance. It worked, but it didn't hold up well under heavy load. Having Apache in front helps, but just like with Zope, if the app is slow, the app is slow.
> Any load testing done yet?
AFAIK, no load testing has been done with our TurboGears apps, but I'm definitely in favor of doing it before F7. I recall kim0 and paulobanon did some load tests in the past, but I haven't seen the results yet. Does anyone recommend any load generation tools?
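While we pick a dedicated load-generation tool, even a small threaded script can produce rough concurrency numbers. This is just an illustrative sketch in modern Python (the URL, thread count, and request count are made up), not a replacement for a real benchmark:

```python
import threading
import time
import urllib.request

def load_test(url, requests_per_thread=10, threads=5):
    """Fire concurrent GETs at `url`; return (successes, total, seconds)."""
    results = []
    lock = threading.Lock()

    def worker():
        for _ in range(requests_per_thread):
            try:
                with urllib.request.urlopen(url) as resp:
                    ok = resp.status == 200
            except OSError:
                ok = False  # connection refused, timeout, etc.
            with lock:
                results.append(ok)

    start = time.time()
    workers = [threading.Thread(target=worker) for _ in range(threads)]
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    return sum(results), len(results), time.time() - start
```

Dividing total by the elapsed seconds gives a crude requests-per-second figure to compare deployment options against.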
A presentation was given at this year's PyCon called "Scaling Python for High-Load Web Sites"[0]; I definitely recommend checking it out.
I recommend that we load balance dynamic page requests from our proxy servers to our application servers, and let the proxies serve out cached static content. We definitely want to hide CherryPy behind Apache, because having HTTP/1.1 and SSL support is nice, among many other benefits. Whether or not we use mod_{python,proxy,rewrite} to connect to CherryPy is up for discussion. mod_python is the fastest option; the only real downsides are that it is harder to configure and that you have to restart Apache every time you change your CherryPy code. I give a +1 for mod_python, at least until WSGI support in CherryPy solidifies.
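A rough sketch of the proxy-plus-balancer idea, assuming Apache 2.2's mod_proxy_balancer (the backend hostnames, port, and paths are hypothetical):

```apache
# Static content is served directly by the proxy
Alias /static /srv/web/static

# Dynamic requests are balanced across the CherryPy app servers
<Proxy balancer://tgcluster>
    BalancerMember http://app1.internal:8080
    BalancerMember http://app2.internal:8080
</Proxy>
ProxyPass /myapp balancer://tgcluster/myapp
```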
Since each application server will have its own connection pool to the DB servers, increasing our scalability will simply consist of adding another Xen guest behind our load balancer.
So from here we might want to look into creating a standard guest image optimized for our TurboGears Xen guests. publictest2 was running FC6 (it still might be, but as far as I can tell it seems to be down), and I'm not sure what our other TG systems are running, but I think we should be consistent. I tend to lean towards RHEL{5,4}, which will help us get TurboGears & friends whipped into shape for EPEL.
What do you all think?
luke
On Sun, 2007-03-04 at 20:23 -0500, Luke Macken wrote:
> A presentation was given at this year's PyCon called "Scaling Python for High-Load Web Sites"[0]; I definitely recommend checking it out.
Really cool. My reading of the talk is that if our loads match up with their sample application, we're probably okay with just a single CherryPy instance behind Apache for nearly everything. Load balancing could get us the rest of the way for all of our "internal" apps (meaning apps meant for contributors to the project rather than the Fedora user base). Of course, in your proposal, once we have one thing behind the load balancers we should be able to put everything behind them without too much effort.
The wiki/Plone, Bugzilla, and other end-user-facing applications need more than that. Unfortunately, we aren't in charge of coding those, so we don't have as many choices in terms of getting them to scale at the moment. With MoinMoin, for instance, my impression is that moin wouldn't be able to lock files if we had two instances running, so we're unable to use load balancing as an optimization there.
> mod_python is the fastest option, and the only downfall really is that it is harder to configure, and that you have to restart Apache every time you change your CherryPy code. I give a +1 for mod_python, at least until WSGI support in CherryPy solidifies.
It appears that TG + mod_python is very slow ATM: http://tinyurl.com/3xyznr
> Since each application server will have its own connection pool to the DB servers, increasing our scalability will simply consist of adding another Xen guest behind our load balancer.
Why do we even need to add Xen guests? From the PyCon talk it looked like just adding additional CherryPy servers would increase our ability to serve more pages.
We'd want to run benchmarks to see, but I'd suspect that having one guest with five CherryPy instances that we load balance between will give us more bang for the resources used than five guests on the same Xen host running one CherryPy server apiece.
Additional guests could enhance reliability, though. If our load balancer detects that a guest has stopped responding and serves requests to the other guests running the CherryPy servers, we could take a guest down for maintenance and then return it to the pool without interrupting service. Having them on separate Xen hosts would mean we could lose a physical machine and still survive (at half capacity).
> So from here we might want to look into creating a standard guest image optimized for our TurboGears Xen guests. publictest2 was running FC6 (it still might be, but as far as I can tell it seems to be down), and I'm not sure what our other TG systems are running, but I think we should be consistent. I tend to lean towards RHEL{5,4}, which will help us get TurboGears & friends whipped into shape for EPEL.
RHEL4 would be Python 2.3. RHEL5 is Python 2.4, like FC6. F7 will be Python 2.5....
Python 2.4 has decorators, which TG makes heavy use of, so I think we want at least that version. It'll feel constraining to run Python 2.5 for local development on our home machines with Fedora 7+ while having to target Python 2.4 because that's what comes with RHEL5 (unified try/except/finally and ternary operators being the features I'll miss the most), but I suspect that's a tradeoff we'll want to make so we aren't upgrading every six months.
-Toshio
On Mon, Mar 05, 2007 at 05:14:51PM -0800, Toshio Kuratomi wrote:
> With MoinMoin, for instance, my impression is that moin wouldn't be able to lock files if we had two instances running, so we're unable to use load balancing as an optimization there.
Yeah, I agree that we definitely need to work on optimizing some of our current software; I mean, seriously, have you tried saving a wiki page lately?
> It appears that TG + mod_python is very slow ATM: http://tinyurl.com/3xyznr
Interesting.
To get a better idea of the performance of the TurboGears stack in our infrastructure, I think it would be extremely valuable to perform some stress tests before F7. This way, we can know for sure the best options for our needs with regard to:
o Apache mod_{rewrite,python,proxy}
o SQL{Object,Alchemy}
o Xen instances vs. CherryPy instances
If anyone is interested in heading this up (as my stress-testing-fu is weak), I would definitely be willing to help out.
> Why do we even need to add Xen guests? From the PyCon talk it looked like just adding additional CherryPy servers would increase our ability to serve more pages.
True.
> We'd want to run benchmarks to see, but I'd suspect that having one guest with five CherryPy instances that we load balance between will give us more bang for the resources used than five guests on the same Xen host running one CherryPy server apiece.
Yeah, I think that benchmarking this will yield extremely useful data that would benefit many.
> Additional guests could enhance reliability, though. If our load balancer detects that a guest has stopped responding and serves requests to the other guests running the CherryPy servers, we could take a guest down for maintenance and then return it to the pool without interrupting service. Having them on separate Xen hosts would mean we could lose a physical machine and still survive (at half capacity).
Yep, this will help mitigate much suffering on our end :)
> RHEL4 would be Python 2.3. RHEL5 is Python 2.4, like FC6. F7 will be Python 2.5....
> Python 2.4 has decorators, which TG makes heavy use of, so I think we want at least that version. [...] I suspect that's a tradeoff we'll want to make so we aren't upgrading every six months.
I have yet to start utilizing any Python 2.5 features in my code, so I'm not really partial either way.
luke
Luke Macken wrote:
> To get a better idea of the performance of the TurboGears stack in our infrastructure, I think it would be extremely valuable to perform some stress tests before F7. This way, we can know for sure the best options for our needs with regard to:
> o Apache mod_{rewrite,python,proxy}
> o SQL{Object,Alchemy}
> o Xen instances vs. CherryPy instances
> If anyone is interested in heading this up (as my stress-testing-fu is weak), I would definitely be willing to help out.
This is a must. We should always prefer more CherryPy instances over more Xen instances when possible, at least when we're talking about the same physical hardware and security is not an issue.
-Mike
Toshio Kuratomi wrote:
> My method is basically Apache ProxyPassing to the TurboGears application server. If the TG server isn't running, the ProxyPass error handler loads a small CGI script that starts the TurboGears app and, once the app responds, sends the user there.
You know me, I love "just works". Is there any concern with your script and thrashing? For example, if TG can't start but the page is under heavy load, would the startup attempts have the ability to take other services down?
-Mike
On Sun, 2007-03-04 at 00:07 -0600, Mike McGrath wrote:
> You know me, I love "just works". Is there any concern with your script and thrashing? For example, if TG can't start but the page is under heavy load, would the startup attempts have the ability to take other services down?
That could be a problem case. If we decide to stay with the CGI script, I can write a timestamp to a file. When the autostart CGI runs, it checks that the timestamp is older than one minute; otherwise it doesn't attempt a start.
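A minimal sketch of that lockout, assuming the CGI can write a state file somewhere (the path and the one-minute window below are illustrative, not the actual script):

```python
import os
import time

LOCKOUT_SECONDS = 60  # don't retry a failed start more than once a minute

def should_attempt_start(stamp_path="/var/run/tg-autostart.stamp"):
    """Return True if no start was attempted within the lockout window,
    recording this attempt's timestamp as a side effect."""
    now = time.time()
    try:
        last = os.path.getmtime(stamp_path)
        if now - last < LOCKOUT_SECONDS:
            return False  # a recent attempt already happened; back off
    except OSError:
        pass  # no stamp file yet: this is the first attempt
    # Record this attempt by touching the stamp file
    with open(stamp_path, "w"):
        pass
    return True
```

The CGI would call this before spawning the TG server, so repeated failures under heavy load collapse into at most one start attempt per minute.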
-Toshio
On Sat, 2007-03-03 at 11:43 -0600, Mike McGrath wrote:
http://thraxil.org/users/anders/posts/2006/09/13/TurboGears-Deployment-with-...
An interesting blog post about why TG with mod_python didn't fit this person's needs -- his recommendation is to use supervisord to start the apps: http://www.plope.com/software/supervisor2/
-Toshio
infrastructure@lists.fedoraproject.org