I Want to Go Fast: Speeding Up the Server Side

One of my websites has started getting a fair amount of traffic (approaching 1 million pageviews a month), and the load is high enough that I decided it needed its own server. This post shows how I scaled the server and website to handle even a front-page appearance on Digg.

The New Server:

2x Xeon 3.2 GHz Dual Core (4 cores total)
2 GB Memory
2x 73GB SCSI Drives

Software:

Linux: CentOS 5.0
Apache: 2.2
MySQL: 4.3 (on a separate server)
PHP: 5.1.6
Webpage: CakePHP Framework 1.2.2.8120

Website workload:

95% Read / 5% Write

Benchmark:

Apache Bench: ab -c 20 -n 1000 http://webpage
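For anyone unfamiliar with Apache Bench: -c 20 keeps 20 requests in flight at once, -n 1000 stops after 1000 requests, and the headline figure is simply completed requests divided by wall-clock time. Here is a rough sketch of that measurement in Python (not PHP, and not how ab itself is written). The names measure_throughput and fake_page are made up for illustration, and fake_page just sleeps instead of fetching a real page.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def measure_throughput(handler, total_requests=1000, concurrency=20):
    """Call `handler` total_requests times from `concurrency` worker threads
    and return completed calls per second of wall-clock time."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # list() forces every call to finish before the clock is stopped.
        list(pool.map(lambda _: handler(), range(total_requests)))
    elapsed = time.perf_counter() - start
    return total_requests / elapsed


def fake_page():
    """Hypothetical stand-in for one page render (about 1 ms of waiting)."""
    time.sleep(0.001)


if __name__ == "__main__":
    rate = measure_throughput(fake_page, total_requests=200, concurrency=20)
    print(f"{rate:.1f} requests/second")
```

Every number in this post is that same ratio, taken with the same -c and -n values, so the figures are comparable with each other even if they are not absolute.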

Base: 8.68 Pageviews/Second

Admittedly the webpage is heavy. With a stock Apache config, my first run gave 8.68 pageviews/second. I was pretty happy with this, but my CPU utilization was at 100%. While a busy CPU is normally a good thing, I knew that the CakePHP framework was eating up a lot of CPU cycles unnecessarily. My next step was to fix CakePHP up a little.

CakePHP CacheTime Patch: 9.40 Pageviews/Second

There is a small performance bug in CakePHP: it processes an entire page for caching even though the page won’t be cached. Here’s the patch.

Even though this patch shortened the code path, the server was still spending a lot of time processing code. While PHP is relatively light, it is still an interpreted language. There’s always a way of speeding an interpreted language up. In PHP’s case, it’s turning on APC.

Turn on APC: 18.35 Pageviews/Second

APC caches the compiled PHP bytecode so scripts don’t need to be parsed and compiled on every single request. This nearly doubles performance and drops the CPU utilization down to 85%, which means we have a new bottleneck. Originally I thought it was disk IO, but iostat shows little disk activity. And since I’m running Apache Bench locally, it’s not the network. This leaves the database. Even though the database is running on another server and that server’s CPU is only 30% utilized, there is latency involved in going to another server. I figure it’s time to fire up memcached and point CakePHP at it (this proves to be a mistake).
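Going back to the APC step for a moment, the idea of caching compiled code can be shown with a loose analogy in Python. This is not how APC is implemented, only the same principle: compile a script once, keep the compiled form in memory, and recompile only if the file changes. The run_script helper and the stats counter are names I made up for the sketch.

```python
import os
import tempfile

_code_cache = {}  # path -> (mtime, compiled code object)


def run_script(path, stats):
    """Execute a script file, recompiling only when the file has changed."""
    mtime = os.path.getmtime(path)
    cached = _code_cache.get(path)
    if cached is None or cached[0] != mtime:
        with open(path) as f:
            source = f.read()
        code = compile(source, path, "exec")  # the parse/compile step we want to skip
        _code_cache[path] = (mtime, code)
        stats["compiles"] += 1
    else:
        code = cached[1]
    namespace = {}
    exec(code, namespace)  # executing the cached code still happens on every call
    return namespace


if __name__ == "__main__":
    stats = {"compiles": 0}
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write("result = sum(range(10))\n")
    try:
        for _ in range(100):
            ns = run_script(f.name, stats)
        print(ns["result"], stats["compiles"])  # 100 runs, 1 compile
    finally:
        os.remove(f.name)
```

The cache removes the compile work but not the execution work, which fits what I saw: a big jump in throughput, with the CPU still fairly busy afterwards.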

CakePHP using memcached: 17.20 Pageviews/Second

I actually lose performance running Cake’s caching through memcached. After some investigation, I see that Cake only uses this cache to store paths. With the default file cache backend, that data is still sitting in memory, and a file open/read is much quicker than making a connection to memcached and pulling the information. I turn memcached back off for Cake’s caching.

Going back to the drawing board, I turn on database profiling and see that the session management is taking 500 milliseconds! (I had been using the database for storing sessions.) After some investigation, I find that CakePHP has an undocumented way of storing sessions in its cache. This time I decide to use the APC cache instead of memcached (using APC for Cake’s internal path/model caching has no performance impact). I can get away with APC for session caching since I only have 1 webserver (I will have to go to memcached if I ever have more than 1 webserver).

Session Caching in APC: 20.55 Pageviews/Second

The session caching did improve performance. It also raised my CPU utilization to 95%. Still no disk IO. Looking at the database profiles, I see a few queries which are a bit heavy. I decide to integrate memcached into my CakePHP models. Starting from some code I found here, I reworked it so it can be used in models, as this is a more appropriate place than the controller.
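The pattern behind caching heavy calls in the model is the usual "check the cache, fall back to the query, store the result". Below is a minimal sketch of that pattern in Python, not my actual CakePHP code. FakeCache is a hypothetical in-memory stand-in for a memcached client, and ArticleModel, top_articles, and the 60-second TTL are invented for the example.

```python
import time


class FakeCache:
    """Hypothetical in-memory stand-in for a memcached client (get/set with TTL)."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._items[key]  # expired: treat it as a miss
            return None
        return value

    def set(self, key, value, ttl):
        self._items[key] = (value, time.monotonic() + ttl)


class ArticleModel:
    """Hypothetical model that wraps one heavy query with a cache lookup."""

    def __init__(self, cache, run_query):
        self.cache = cache
        self.run_query = run_query  # callable that actually hits the database

    def top_articles(self, limit=10, ttl=60):
        key = f"top_articles:{limit}"
        rows = self.cache.get(key)
        if rows is None:  # cache miss: run the query and remember the result
            rows = self.run_query(limit)
            self.cache.set(key, rows, ttl)
        return rows


if __name__ == "__main__":
    queries = []

    def slow_query(limit):
        queries.append(limit)  # count how often the "database" is hit
        return [f"article {i}" for i in range(limit)]

    model = ArticleModel(FakeCache(), slow_query)
    model.top_articles(3)
    model.top_articles(3)
    print(len(queries))  # the second call is served from the cache
```

With a 95% read / 5% write workload, a short expiry like this is usually tolerable. If stale results matter, the write path would also need to delete or refresh the cached key.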

Memcache for heavy calls: 21.55 Pageviews/Second

This helped some, and brought my CPU utilization to just over 99%. You would think that this is good enough, since it could handle a front-page Digg a few times over, but I didn’t want to take any chances. I needed to shorten the code path to get pageviews up. That’s when I decided to code up “Lock Down Mode”.

If the server gets hit hard enough to push the load above 10, the code will switch over to “Lock Down Mode”. This cuts back much of the site functionality, and makes most pages static instead of dynamic. It will be good enough for 99% of the users and keep the site up.
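The switch itself can be very small. Here is a minimal sketch in Python of a load-based switch like the one described above, not the site's real code. It assumes "load" means the 1-minute load average (the threshold of 10 is from this post, the choice of the 1-minute figure is my assumption for the sketch), it relies on os.getloadavg(), which is only available on Unix-like systems, and render_dynamic and render_static are hypothetical callables.

```python
import os

LOAD_THRESHOLD = 10.0  # trigger from the post: load above 10


def system_load():
    """Return the (1, 5, 15)-minute load averages. Unix-like systems only."""
    return os.getloadavg()


def in_lock_down(get_load=system_load, threshold=LOAD_THRESHOLD):
    """True when the 1-minute load average exceeds the threshold (assumed metric)."""
    one_minute, _, _ = get_load()
    return one_minute > threshold


def render(page, render_dynamic, render_static, get_load=system_load):
    """Serve the cheap static version while the server is overloaded."""
    if in_lock_down(get_load):
        return render_static(page)
    return render_dynamic(page)


if __name__ == "__main__":
    busy = lambda: (12.5, 6.0, 3.0)  # pretend the server is being hammered
    print(render("home", lambda p: "dynamic " + p, lambda p: "static " + p, busy))
```

Passing the load reader in as a parameter makes the switch easy to try out without actually overloading a machine.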

Lock Down Mode: 108.70 Pageviews/Second

This seemed to do the trick. That should be enough to handle Digg, Slashdot, StumbleUpon and Reddit all at once.

Conclusion

While my work isn’t done, this is good enough for now. Going from 8.68 pageviews/second to 108 pageviews/second is not too bad. The secret to scaling a website is finding the current bottleneck and then figuring out how to address it.
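To put the steps side by side, here is a small Python sketch that works out the gain of each step over the previous one and over the base. The figures are the ones reported in this post. The memcached-for-Cake run (17.20) is left out because that change was reverted, and the speedups helper is just a name for the example.

```python
RESULTS = [
    ("Base", 8.68),
    ("CakePHP CacheTime patch", 9.40),
    ("APC", 18.35),
    ("Session caching in APC", 20.55),
    ("Memcache for heavy calls", 21.55),
    ("Lock Down Mode", 108.70),
]


def speedups(results):
    """Return (label, rate, gain over previous step, gain over base) per step."""
    base = results[0][1]
    previous = base
    rows = []
    for label, rate in results:
        rows.append((label, rate, rate / previous, rate / base))
        previous = rate
    return rows


if __name__ == "__main__":
    for label, rate, step_gain, total_gain in speedups(RESULTS):
        print(f"{label:<26} {rate:>7.2f}  x{step_gain:.2f} step  x{total_gain:.2f} total")
```

The last row works out to roughly 12.5 times the base rate, with APC and Lock Down Mode accounting for most of it.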

Maybe one of these days I’ll write a post on speeding up the client side and moving to multiple servers.

Disclaimer: Yes, I know that this is not a truly valid test, as I ran Apache Bench locally, which took up CPU resources of its own and skipped the network path. But it was good enough to show relative performance improvements. Additionally, there were numerous small tweaks along the way that I didn’t mention and many webpage-specific performance improvements. Before all the website improvements, the “base” number was probably closer to 5 pageviews a second.


4 Responses

  1. cuban Says:

    Love CakePHP, but always been curious about performance. Thanks for this.

  2. ktb Says:

    Interesting.

    Did you make a test with gzip and some other client side optimisations turned on/off?

  3. solo Says:

    hi
    i get very high httpd usage(around 30%) even for a blank cake page(the default one), what could be the problem?
    and what do u mean “lock down mode” is it for apache? mysql? or options in cake?

    thanks

  4. Jake Moilanen Says:

    ktb: I do have gzipping enabled…however, since apache bench doesn’t pull images/css/js files, I didn’t bother adding it into the mix.

    One of these days I’m going to do the client side optimisation benchmarks…
