Subject:Re: Reducing pylons app memory usage?
From:Graham Dumpleton (Grah@public.gmane.org)
Date:10/17/2007 04:56:00 PM
List:com.googlegroups.pylons-discuss

Ben Bangert wrote:

Apache is rather a pig on resources (the threading worker helps a bit), I'd suggest checking out nginx or lighttpd. nginx has a few more modules to replace Apache functionality and has been more reliable in my experience, but both of them will mean dropping Apache (which uses at least 13 megs per process), and in exchange lighttpd/nginx take about 3-5 megs of RAM. They're also significantly faster.

Base Apache memory use does not have to be excessive if you configure it correctly. The most important thing is not to load in Apache modules you do not need. If your Apache links a lot of modules statically, this may mean you need to rebuild it so it uses DSO modules for a lot of the core Apache modules. That way you can avoid loading them. Usually even this is not absolutely required.

Next thing is to avoid Apache prefork MPM and use worker MPM. Most problems with excessive memory use which people complain about are because they are using prefork MPM and as a consequence they need to run with a lot more Apache child processes. If you are running a Python application in an embedded mode within Apache child processes, this means many more copies of the application and thus more overall memory use. By using worker MPM there are fewer Apache child processes and thus lower overall memory use. For a low volume site, also tweak the Apache configuration so you don't start up as many initial Apache child processes and reduce the maximum allowed number of processes.
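To put rough numbers on the prefork versus worker difference, here is a minimal sketch of the arithmetic. The process counts and the 30 MB application size are made-up figures for illustration, not measurements; the 13 MB base figure is simply the one quoted above. The function name is hypothetical.

```python
def total_memory_mb(processes, base_mb_per_process, app_mb_per_process):
    """Rough total: each child process carries its own copy of the application."""
    return processes * (base_mb_per_process + app_mb_per_process)


# Hypothetical figures, for illustration only.
# prefork: one request per process, so many processes are needed.
prefork = total_memory_mb(processes=20, base_mb_per_process=13, app_mb_per_process=30)
# worker: threads inside a process share one copy, so few processes are needed.
worker = total_memory_mb(processes=2, base_mb_per_process=13, app_mb_per_process=30)

print("prefork:", prefork, "MB")
print("worker: ", worker, "MB")
```

The estimate ignores memory shared between processes and per-thread stack space, so treat it as an upper-bound style comparison rather than a prediction.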

Next problem can be if you are using mod_python and your OS supplies a crappy version of Python which doesn't have a shared library for the Python library, or doesn't put it in the correct place. The result on such systems is that the Python static library gets embedded within the mod_python.so file and often, when it is loaded into Apache, requires address fixups before it can work. These fixups result in the memory going from being shared to local process memory. So, instead of a shared library that is counted once across all processes, every Apache child process sees a 3MB+ hit to memory use. This lack of a shared library for Python will also see the same problem arising if mod_wsgi is used.
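One way to check whether a given Python was built with a shared library is to ask the interpreter for its build settings. This is only a sketch, and it assumes a Python recent enough to ship the standard `sysconfig` module (older installs expose the same variables through `distutils.sysconfig`). The helper name is hypothetical, and the check does not tell you whether the library was installed in a place the linker will find.

```python
import sysconfig


def python_has_shared_library():
    """Return True if this interpreter was built with --enable-shared."""
    # Py_ENABLE_SHARED is 1 for shared builds, 0 or None otherwise.
    return bool(sysconfig.get_config_var("Py_ENABLE_SHARED"))


print("built with shared libpython:", python_has_shared_library())
# LDLIBRARY names the library the build links against; it may be None
# on platforms that do not define it.
print("library name:", sysconfig.get_config_var("LDLIBRARY"))
```

Run it with the same Python that the Apache module is compiled against, since a system can have more than one installed.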

If all you want to do is host a WSGI application, then use of mod_python is also now becoming a poor option. This is because mod_python loads a lot of extra modules that aren't strictly needed in hosting a WSGI application, but relate to mod_python's own way of writing Python web applications. Also, mod_python loads some modules up front that should only really be loaded on demand if specific features of mod_python are required. Thus, using mod_python you can incur a couple extra MBs of memory use you do not need. A better option in this respect is mod_wsgi as it is targeted at WSGI applications and doesn't have these extra memory overheads and inefficiencies in memory use that mod_python does.

Next issue is Python web applications which gradually accrue/leak memory over time. Because they can do this, it is important to set the maximum number of requests that an Apache child process should handle before the process is recycled. Setting a limit ensures that the memory usage is brought back to the base level and any dead memory that can't be reclaimed for whatever reason is thrown away, so you don't end up with a process that keeps on getting bigger and bigger over time until you run out of memory.
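The effect of a request limit can be seen in a small simulation. This is a toy model, not how Apache itself accounts for memory: it assumes a fixed leak per request, and all names and figures (a 30000 KB base size, 10 KB leaked per request) are hypothetical.

```python
def peak_memory_kb(total_requests, base_kb, leak_kb_per_request, max_requests=None):
    """Simulate one process slot and return the largest size it reaches."""
    size = base_kb
    peak = base_kb
    handled = 0
    for _ in range(total_requests):
        size += leak_kb_per_request  # memory that is never reclaimed
        handled += 1
        peak = max(peak, size)
        if max_requests is not None and handled >= max_requests:
            # The process is recycled; its replacement starts at base size.
            size = base_kb
            handled = 0
    return peak


# Hypothetical figures, for illustration only.
unlimited = peak_memory_kb(10000, base_kb=30000, leak_kb_per_request=10)
recycled = peak_memory_kb(10000, base_kb=30000, leak_kb_per_request=10,
                          max_requests=1000)
print("no limit:      ", unlimited, "KB")
print("limit of 1000: ", recycled, "KB")
```

Without a limit the process grows for as long as it serves requests; with one, its size is capped at the base size plus whatever leaks within a single recycle interval.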

Further option is not to run your Python application embedded within the Apache child processes, but to run it in a separate daemon process using daemon mode of mod_wsgi, or use flup in conjunction with one of the fastcgi/scgi/ajp solutions for Apache. By doing this you can more closely control how many processes you want to run your web application in. So, you can specifically say you want only one daemon process used. You should still though ensure you configure the daemon process to be recycled after a maximum number of requests if you have a leaky web application.

If using daemon processes, still use Apache worker MPM. All the Apache child processes will then be doing is serving static files and proxying requests for Python application through to the daemon process.

Note that in general the argument that lighttpd and nginx are better because they are quicker in serving static files is totally meaningless in the context of Python web applications, as the real work is being done in the Python application and that along with any database access will be your bottleneck. Unless your web application is specifically being used as a mechanism for working with large numbers of very large distinct static media files, you will not really gain anything from using lighttpd or nginx, as serving static files isn't the bottleneck. All you will possibly do is just make your setup and configuration more complicated than it needs to be.

In summary, configure it properly and select which Apache modules you use appropriately, and Apache doesn't need to be the bloated thing that people claim it is. Using Apache can also be simpler to maintain as configuration and process management can all be handled in the one place and you don't need to be running a separate supervisor and process control system for a distinct web server running just the Python application. In an environment where you aren't memory constrained and need high performance and scalability, Apache will in general also be a better choice due to its ability to create additional child processes to handle any extra temporary demand, such processes then being killed off when no longer required. So learn how to use and configure Apache properly, and you should be okay.

Graham