<?xml version="1.0" encoding="UTF-8"?>
<article>
  <content>I'm by no means a &quot;scaling expert&quot;, but I've recently been doing a lot of research into scaling web applications and the best ways to go about it.

What you see below is a rough guide based on what I've learnt - it's not scaling gospel or even an accurate or thorough discussion of all different techniques or schools of thought that can be applied to scaling.

So, what does this series of posts cover?

* What scaling is;
* When (and who) should think about scaling;
* The web server (two parts: the box and the software);
* The static assets;
* The database;
* The Rails code; and
* Caching

## What scaling is ##

&gt; In telecommunications and software engineering, scalability is a desirable property of a system, a network, or a process, which indicates its ability to either handle growing amounts of work in a graceful manner or to be readily enlarged. For example, it can refer to the capability of a system to increase total throughput under an increased load when resources (typically hardware) are added. An analogous meaning is implied when the word is used in a commercial context, where scalability of a company implies that the underlying business model offers the potential for economic growth within the company.
&lt;cite&gt;&lt;a href=&quot;http://en.wikipedia.org/wiki/Scaling&quot;&gt;http://en.wikipedia.org/wiki/Scaling&lt;/a&gt;&lt;/cite&gt;

## When to scale ##

Generally, the consensus seems to be that if you're a start-up, fresh out of the oven, scaling is a waste of your time. Build your product and get some users first!

Scaling is generally something you consider when your current server setup is getting hammered and the growth of your applications warrants taking things to the next level.

## The web server ##

### The box ###

If you're going to be running high-traffic sites, you'd be stupid (in my opinion) not to get your own hardware - the cost of entry for some relatively good gear isn't that high and, in the long run, the flexibility and power that come with having your own hardware outweigh the hassle.

I'm running a Dell PowerEdge 1850 which is housed in a datacentre here in Perth (which makes sense for what I'm hosting on it). It's not new, but it doesn't need to be - the only reason the company I bought it off was disposing of it was that the three year warranty had run out. During those three years it hadn't failed once, and it had been the primary domain controller for the organisation in question.

Location is something to consider carefully: if your site is targeted at an international audience, you'll certainly want to keep that new server of yours in a DC with good international pipes (Perth isn't really blessed in this particular area).

When picking a datacentre, look for something with n+1 redundancy, diverse entry points for both data carriers and power (preferably off completely different feeds) and good security - taking a look at what carriers provide feeds into the facility is also a must.

### The software ###

For the operating system, I'm a fan of Ubuntu server edition - I'll admit, I've not tried much else (just FreeBSD and Debian) and I always come back to Ubuntu - I think that's mainly because my first experiences with Linux were on an Ubuntu box. I keep the install down to the bare minimum - build-essential, iptables, fail2ban, denyhosts, mysql, nginx and a local postfix instance are all that's installed (there's probably some other things, but you get the idea - no need for fruit).

Personally, I use the nginx (/engine-x/) web server - it's fast at serving static assets and it knows when to get out of the way if I want to pass things off to my Rails app. High-profile websites including WordPress.com and GitHub are using nginx with great success in high-traffic scenarios.

Another part of my setup is Unicorn. Unicorn is written to make use of what Unix already provides (forked worker processes, a shared listening socket that lets the kernel balance requests between workers, etc) which makes it stupidly efficient. For more, see this article by &lt;a href=&quot;http://tomayko.com&quot;&gt;Ryan Tomayko&lt;/a&gt; called &quot;&lt;a href='http://tomayko.com/writings/unicorn-is-unix'&gt;I like Unicorn because it's Unix&lt;/a&gt;&quot;.

Here's my nginx config file:

@@@ nginx
# Unicorn workers listen on this Unix domain socket
upstream unicorn {
  server unix:/var/www/example/current/tmp/sockets/unicorn.sock;
}

server {
  listen 80;
  server_name  example.org www.example.org;

  access_log  /var/log/nginx/example.access.log;

  location / {
    root /var/www/example/current/public/;

    # Files that exist on disk are served directly with a 60 hour expiry
    if (-f $request_filename) {
      expires 60h;
      break;        # Static asset
    }

    # Maintenance page present: turn away everything that isn't a static file
    if (-f $document_root/system/maintenance.html) {
      return 503;   # Temporarily unavailable
    }

    # Otherwise hand the request to the Rails app
    if (!-f $document_root/system/maintenance.html) {
      # error_page 500 501 502 503 504 /500.html;
      proxy_pass              http://unicorn;
    }

    proxy_set_header        Host            $host;
    proxy_set_header        X-Real-IP       $remote_addr;
    proxy_set_header        X-Forwarded-For $proxy_add_x_forwarded_for;
    client_max_body_size    10m;
    client_body_buffer_size 128k;
    proxy_connect_timeout   90;
    proxy_send_timeout      90;
    proxy_read_timeout      90;
    proxy_buffer_size       16k;
    proxy_buffers           32 16k;
    proxy_busy_buffers_size 64k;
  }

  # output compression saves bandwidth
  gzip              on;
  gzip_http_version 1.0;
  gzip_comp_level   2;
  gzip_proxied      any;
  # text/html is always compressed, so listing it only triggers a duplicate MIME type warning
  gzip_types        text/plain text/javascript text/css text/xml application/x-javascript application/atom+xml;
}
@@@
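
The upstream block above expects Unicorn to be listening on that Unix socket. As a rough sketch (not a tuned production file), a matching config/unicorn.rb could look something like the following - the socket path comes from the nginx config above, but the worker count, timeout and pid location are assumptions made up for illustration:

@@@ ruby
# config/unicorn.rb - hypothetical sketch to pair with the nginx upstream above
app_root = '/var/www/example/current'

working_directory app_root

# Assumption: four workers; roughly one per CPU core is a common starting point
worker_processes 4

# Same Unix socket that the nginx upstream block points at
listen app_root + '/tmp/sockets/unicorn.sock'

# Assumption: the master kills any worker stuck on a request for over 30 seconds
timeout 30

# Assumption: pid file location
pid app_root + '/tmp/pids/unicorn.pid'

# Load the app once in the master so forked workers share that memory
preload_app true

before_fork do |server, worker|
  # The master's database connection must not be shared with the workers
  ActiveRecord::Base.connection.disconnect! if defined?(ActiveRecord::Base)
end

after_fork do |server, worker|
  # Each worker opens its own database connection after the fork
  ActiveRecord::Base.establish_connection if defined?(ActiveRecord::Base)
end
@@@

Unicorn would then be started with something like unicorn -c config/unicorn.rb -E production -D (or unicorn_rails on older Rails apps), after which nginx can proxy to the socket.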

I also use Passenger from the gang over at Phusion - it's fantastic for getting things going straight away and doesn't require too much hassle - it's what I use on my development box and for smaller sites where I don't need to fiddle with things in the same way.

That's a long enough post - hopefully I'll get the next sections, on static assets and the database, up soon.

It's important to remember that this is just my opinion, based on my experiences to date (which are somewhat limited given my age) - take everything with a grain of salt and always do your research before diving into something this big.</content>
  <created-at type="datetime">2009-12-09T08:32:35Z</created-at>
  <id type="integer">2</id>
  <parsed>&lt;p&gt;I'm by no means a &quot;scaling expert&quot;, but I've recently been doing a lot of research into scaling web applications and the best ways to go about it.&lt;/p&gt;&lt;p&gt;What you see below is a rough guide based on what I've learnt - it's not scaling gospel or even an accurate or thorough discussion of all different techniques or schools of thought that can be applied to scaling.&lt;/p&gt;&lt;p&gt;So, what does this series of posts cover?&lt;/p&gt;&lt;ul&gt;
&lt;li&gt;What scaling is;&lt;/li&gt;
&lt;li&gt;When (and who) should think about scaling;&lt;/li&gt;
&lt;li&gt;The web server (two parts: the box and the software);&lt;/li&gt;
&lt;li&gt;The static assets;&lt;/li&gt;
&lt;li&gt;The database;&lt;/li&gt;
&lt;li&gt;The Rails code; and&lt;/li&gt;
&lt;li&gt;Caching&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;What scaling is&lt;/h2&gt;&lt;blockquote&gt;&lt;p&gt;In telecommunications and software engineering, scalability is a desirable property of a system, a network, or a process, which indicates its ability to either handle growing amounts of work in a graceful manner or to be readily enlarged. For example, it can refer to the capability of a system to increase total throughput under an increased load when resources (typically hardware) are added. An analogous meaning is implied when the word is used in a commercial context, where scalability of a company implies that the underlying business model offers the potential for economic growth within the company.
&lt;cite&gt;&lt;a href=&quot;http://en.wikipedia.org/wiki/Scaling&quot;&gt;http://en.wikipedia.org/wiki/Scaling&lt;/a&gt;&lt;/cite&gt;&lt;/p&gt;&lt;/blockquote&gt;&lt;h2&gt;When to scale&lt;/h2&gt;&lt;p&gt;Generally, the consensus seems to be that if you're a start-up, fresh out of the oven, scaling is a waste of your time. Build your product and get some users first!&lt;/p&gt;&lt;p&gt;Scaling is generally something you consider when your current server setup is getting hammered and the growth of your applications warrants taking things to the next level.&lt;/p&gt;&lt;h2&gt;The web server&lt;/h2&gt;&lt;h3&gt;The box&lt;/h3&gt;&lt;p&gt;If you're going to be running high-traffic sites, you'd be stupid (in my opinion) not to get your own hardware - the cost of entry for some relatively good gear isn't that high and, in the long run, the flexibility and power that come with having your own hardware outweigh the hassle.&lt;/p&gt;&lt;p&gt;I'm running a Dell PowerEdge 1850 which is housed in a datacentre here in Perth (which makes sense for what I'm hosting on it). It's not new, but it doesn't need to be - the only reason the company I bought it off was disposing of it was that the three year warranty had run out. During those three years it hadn't failed once, and it had been the primary domain controller for the organisation in question.&lt;/p&gt;&lt;p&gt;Location is something to consider carefully: if your site is targeted at an international audience, you'll certainly want to keep that new server of yours in a DC with good international pipes (Perth isn't really blessed in this particular area).&lt;/p&gt;&lt;p&gt;When picking a datacentre, look for something with n+1 redundancy, diverse entry points for both data carriers and power (preferably off completely different feeds) and good security - taking a look at what carriers provide feeds into the facility is also a must.&lt;/p&gt;&lt;h3&gt;The 
software&lt;/h3&gt;&lt;p&gt;For the operating system, I'm a fan of Ubuntu server edition - I'll admit, I've not tried much else (just FreeBSD and Debian) and I always come back to Ubuntu - I think that's mainly because my first experiences with Linux were on an Ubuntu box. I keep the install down to the bare minimum - build-essential, iptables, fail2ban, denyhosts, mysql, nginx and a local postfix instance are all that's installed (there's probably some other things, but you get the idea - no need for fruit).&lt;/p&gt;&lt;p&gt;Personally, I use the nginx (/engine-x/) web server - it's fast at serving static assets and it knows when to get out of the way if I want to pass things off to my Rails app. High-profile websites including WordPress.com and GitHub are using nginx with great success in high-traffic scenarios.&lt;/p&gt;&lt;p&gt;Another part of my setup is Unicorn. Unicorn is written to make use of what Unix already provides (forked worker processes, a shared listening socket that lets the kernel balance requests between workers, etc) which makes it stupidly efficient. For more, see this article by &lt;a href=&quot;http://tomayko.com&quot;&gt;Ryan Tomayko&lt;/a&gt; called &quot;&lt;a href=&quot;http://tomayko.com/writings/unicorn-is-unix&quot;&gt;I like Unicorn because it's Unix&lt;/a&gt;&quot;.&lt;/p&gt;&lt;p&gt;Here's my nginx config file:&lt;/p&gt;&lt;div class=&quot;highlight nginx code highlight&quot;&gt;&lt;pre&gt;&lt;span class=&quot;k&quot;&gt;upstream&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;unicorn&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;kn&quot;&gt;server&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;unix:/var/www/example/current/tmp/sockets/unicorn.sock&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;server&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;kn&quot;&gt;listen&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;80&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;kn&quot;&gt;server_name&lt;/span&gt;  &lt;span class=&quot;s&quot;&gt;example.org&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;www.example.org&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;kn&quot;&gt;access_log&lt;/span&gt;  &lt;span class=&quot;s&quot;&gt;/var/log/nginx/example.access.log&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;kn&quot;&gt;location&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;kn&quot;&gt;root&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;/var/www/example/current/public/&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;kn&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;(-f&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$request_filename&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
      &lt;span class=&quot;kn&quot;&gt;expires&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;60h&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
      &lt;span class=&quot;kn&quot;&gt;break&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;        &lt;span class=&quot;c1&quot;&gt;# Static asset&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;(-f&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$document_root/system/maintenance.html&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
      &lt;span class=&quot;kn&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;503&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;   &lt;span class=&quot;c1&quot;&gt;# Temporarily unavailable&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;(!-f&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$document_root/system/maintenance.html&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
     &lt;span class=&quot;c1&quot;&gt;# error_page 500 501 502 503 504 /500.html;&lt;/span&gt;
      &lt;span class=&quot;kn&quot;&gt;proxy_pass&lt;/span&gt;              &lt;span class=&quot;s&quot;&gt;http://unicorn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;kn&quot;&gt;proxy_set_header&lt;/span&gt;        &lt;span class=&quot;s&quot;&gt;Host&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$host&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;       
    &lt;span class=&quot;kn&quot;&gt;proxy_set_header&lt;/span&gt;        &lt;span class=&quot;s&quot;&gt;X-Real-IP&lt;/span&gt;       &lt;span class=&quot;nv&quot;&gt;$remote_addr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;kn&quot;&gt;proxy_set_header&lt;/span&gt;        &lt;span class=&quot;s&quot;&gt;X-Forwarded-For&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$proxy_add_x_forwarded_for&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;kn&quot;&gt;client_max_body_size&lt;/span&gt;    &lt;span class=&quot;mi&quot;&gt;10m&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;kn&quot;&gt;client_body_buffer_size&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;128k&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;kn&quot;&gt;proxy_connect_timeout&lt;/span&gt;   &lt;span class=&quot;mi&quot;&gt;90&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;kn&quot;&gt;proxy_send_timeout&lt;/span&gt;      &lt;span class=&quot;mi&quot;&gt;90&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;kn&quot;&gt;proxy_read_timeout&lt;/span&gt;      &lt;span class=&quot;mi&quot;&gt;90&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;kn&quot;&gt;proxy_buffer_size&lt;/span&gt;       &lt;span class=&quot;mi&quot;&gt;16k&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;kn&quot;&gt;proxy_buffers&lt;/span&gt;           &lt;span class=&quot;mi&quot;&gt;32&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;16k&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;kn&quot;&gt;proxy_busy_buffers_size&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;64k&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  &lt;span class=&quot;c1&quot;&gt;# output compression saves bandwidth&lt;/span&gt;
  &lt;span class=&quot;kn&quot;&gt;gzip&lt;/span&gt;              &lt;span class=&quot;no&quot;&gt;on&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;kn&quot;&gt;gzip_http_version&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;.0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;kn&quot;&gt;gzip_comp_level&lt;/span&gt;   &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;kn&quot;&gt;gzip_proxied&lt;/span&gt;      &lt;span class=&quot;s&quot;&gt;any&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;kn&quot;&gt;gzip_types&lt;/span&gt;        &lt;span class=&quot;s&quot;&gt;text/plain&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;text/javascript&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;text/css&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;text/xml&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;application/x-javascript&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;application/atom+xml&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;I also use Passenger from the gang over at Phusion - it's fantastic for getting things going straight away and doesn't require too much hassle - it's what I use on my development box and for smaller sites where I don't need to fiddle with things in the same way.&lt;/p&gt;&lt;p&gt;That's a long enough post - hopefully I'll get the next sections, on static assets and the database, up soon.&lt;/p&gt;&lt;p&gt;It's important to remember that this is just my opinion, based on my experiences to date (which are somewhat limited given my age) - take everything with a grain of salt and always do your research before diving into something this big.&lt;/p&gt;</parsed>
  <slug>scaling-a-rails-application-part-1</slug>
  <title>Scaling a Rails application - Part 1</title>
  <updated-at type="datetime">2009-12-12T07:49:14Z</updated-at>
</article>
