How far have websites come in 13 years?
At the risk of sounding old, the first proper website I worked on was www.hannants.co.uk, 13 years ago. It was an e-commerce site whose basic functions were to search for products, run a shopping cart and take payments. It's still going strong, which made me wonder how much had changed in the usage, technology and costs of the site in 13 years.
To illustrate how far back 13 years is: Google had only just got going, Facebook and Twitter were 5 years away, your mobile phone (if you had one) didn't have a web browser or apps and Apple was recovering from near bankruptcy and about to launch the iPod.
Hannants sells model kits to enthusiasts, and once the company started taking online ordering seriously, the original site quickly converted people from ordering over the phone with a big paper catalogue to the joys of searching and browsing products over their dial-up (or perhaps 0.25Mb broadband) internet connections.
There was still suspicion about giving out credit card numbers over the internet, so there was the option of phoning them through or paying by post after ordering. The email mailing list was a popular way of learning about new products on the site, since people still paid by the minute for internet access.
The current site still handles the same functions and even looks very similar. The main difference between then and now is the amount of traffic it handles. The number of internet users worldwide has increased roughly 10x since 2000, from 300 million to nearly 3 billion. The increase in traffic on Hannants is even more impressive: around 20x since 2000.
Of course, modern servers have also increased in power, but are we using all that extra capacity serving more traffic?
By looking at this particular site we can estimate the extra power needed to run the site now and compare that to the actual power we have available, along with how much it costs.
| Year | Servers | Total RAM | Total BogoMIPS |
|------|---------|-----------|----------------|
| 2000 | 1 x Cobalt Raq 2 (shared with multiple sites) | | 250 (shared) |
| Now | 2 x AWS medium instances, 2 x AWS large instances, CDN for static files | | ~24,000 |
The whole site ran on a single MIPS-based Cobalt Raq 2 server, which was also shared with other sites at the time(!). To put that into perspective, that server has about the same power as an iPhone 3G or a Raspberry Pi. The site was written in Perl, served by Apache, and ran on a Linux 2.0 kernel.
The site now runs entirely on Amazon AWS virtual servers running Ubuntu, most images are served from Amazon's CDN, and the load is spread across multiple servers. There are two web servers running the site, now written in PHP, and two database servers running MySQL, which handles the bulk of the work.
Doing some rough calculations, based on one original server equalling 125 BogoMIPS (since the full 250 was shared with other sites), and assuming everything else affecting performance has scaled in line with BogoMIPS (which holds true for the memory), we have the power equivalent of 192 of the original servers! Add the fact that a lot of the static-file work has been offloaded to the CDN, and we can round up and say we have 200x the power of the original setup.
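The rough calculation can be spelled out as follows. The modern fleet's total BogoMIPS is an assumption back-derived from the stated 192x figure, since only the per-server numbers appear above:

```python
# Server-equivalence estimate from the figures in the text.
original_bogomips = 250                  # one Cobalt Raq 2
original_share = original_bogomips / 2   # only half was ours: 125

# Assumption: total BogoMIPS of the 4 AWS instances combined,
# back-derived from the stated "192 of the original servers".
modern_total_bogomips = 24_000

equivalents = modern_total_bogomips / original_share
print(equivalents)  # 192.0 -- rounded up to ~200x once the CDN is counted
```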
20x the traffic, 200x the power
So why does a site which gets 20x the traffic have 200x the power? The expectations of the users of the site have changed and some of that power is being used to provide those users with a better experience.
The old site suffered badly from spiky traffic: a weekly newsletter caused a massive surge as readers arrived to check out the new products. The original site struggled under these loads, but any problems were usually short-lived, and customers didn't mind too much because most other sites were like that. Modern sites need to handle traffic spikes while offering the same level of service as normal (apart from the Glastonbury ticket site, apparently). Accounting for spikes could mean keeping 50% extra capacity idle in reserve.
Since the modern site runs on AWS virtual servers, there is also the option of starting extra servers during spikes and shutting them down afterwards, reducing the total power that needs to be available all the time.
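The scaling decision itself is simple to sketch. This is a minimal illustration of the "50% headroom" idea, not the site's actual logic; the requests-per-second-per-server figure is invented for the demo:

```python
import math

def servers_needed(current_rps, rps_per_server=100, headroom=0.5):
    """How many servers to run for the current request rate,
    keeping `headroom` (e.g. 50%) spare capacity for sudden spikes.
    rps_per_server is an illustrative made-up number."""
    required = current_rps / rps_per_server
    return max(1, math.ceil(required * (1 + headroom)))

# Quiet period: a few servers suffice.
print(servers_needed(150))  # 3
# Newsletter spike: scale out, then shut the extras down afterwards.
print(servers_needed(800))  # 12
```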
Page load times
On top of the extra traffic it serves, the current site has much better page load times. The original site used flat files instead of a database, so a search for a product that happened to be near the end of the file could take up to 10 seconds to return a result page. The current site always returns pages in under 1 second, which is about the most a user will tolerate nowadays.
Throwing 10x more power/memory at the problem would have reduced load times by 10x, but switching to a database with proper indexing achieved the same result with far less power. The speed-up here was essentially free, gained through better software.
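The flat-file-versus-index difference is easy to demonstrate. Here is a small SQLite sketch (table and column names are invented): the first lookup forces a full scan, much like searching a flat file, while the second uses an index:

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (sku TEXT, name TEXT)")
conn.executemany("INSERT INTO products VALUES (?, ?)",
                 ((f"SKU{i}", f"Model kit {i}") for i in range(200_000)))

def lookup(sku):
    """Fetch one product and time the query."""
    t0 = time.perf_counter()
    row = conn.execute("SELECT name FROM products WHERE sku = ?",
                       (sku,)).fetchone()
    return row, time.perf_counter() - t0

row, slow = lookup("SKU199999")   # full table scan, like the flat file
conn.execute("CREATE INDEX idx_sku ON products (sku)")
row, fast = lookup("SKU199999")   # index seek
print(f"scan: {slow:.4f}s  indexed: {fast:.6f}s")
```

The indexed lookup is orders of magnitude faster, which is exactly why the rewrite needed only 2x the power rather than 10x.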
Features and catalogue size
The number of products on the site has increased, but by less than double. Features have been added, but these have not greatly increased the load on the server. The site is updated more often and there is now a backend component. Together, we can guess these changes add about 2.5x the work for the server.
Higher level software
The site now uses PHP with a more structured, object-oriented codebase, making the code easier to manage and features easier to add. This adds some overhead, but a PHP bytecode compiler has also been introduced, so performance is similar to that of the old Perl code.
Considering the above we can work out how much extra power is really needed to run the site today:
- 30x power for more traffic and handling spikes (20x traffic, 50% extra for spikes)
- 2x power to improve loading times (running a database costs 2x but gives 10x better loading times)
- 2.5x power for extra features and catalogue size
That gives 150 of the original servers needed to run the site today, versus the equivalent of 200 actually in use. That's a discrepancy of 50 (or 25%), small enough to put down to a bit of unexpected overhead and my rough estimates.
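The multipliers above combine like this:

```python
# Estimated power needed today, in units of the original server.
traffic_and_spikes = 20 * 1.5   # 20x traffic plus 50% spare capacity = 30x
load_times = 2                  # database instead of flat files
features = 2.5                  # bigger catalogue, backend, more updates

needed = traffic_and_spikes * load_times * features
available = 200                 # rounded server-equivalents actually in use

print(needed, available - needed, (available - needed) / available)
# 150.0 50.0 0.25
```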
So the site is using approximately the number of modern servers we would expect, but what about the cost of those servers?
The original server and bandwidth cost around $450 a month. Allowing for the server being shared, hosting the site cost $225. The required AWS instances, along with bandwidth, now cost around $500 a month.
Since traffic (and, I'm assuming, sales) has increased 20x, one old server's worth of traffic now costs $25 to serve. Servers are therefore 9x cheaper per chunk of traffic. Spending the same amount per chunk of traffic as before would buy 9x the power, even while doing more work for better loading times and spike handling.
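The cost arithmetic, step by step:

```python
old_cost = 450 / 2         # our half of the shared server: $225/month
new_cost = 500             # AWS instances plus bandwidth, $/month
traffic_multiplier = 20    # traffic has grown 20x

# Cost today of serving one old server's worth of traffic.
cost_per_old_chunk = new_cost / traffic_multiplier

print(cost_per_old_chunk, old_cost / cost_per_old_chunk)  # 25.0 9.0
```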
$1 today gives you servers with 9x the power you would have had 13 years ago, even while giving a more demanding modern user experience
What to do with 9x the power per $?
Good question. Should websites be doing 9x more advanced things, or should the benefit come as a 9x reduction in costs, and therefore in prices?
This is obviously a very rough estimate based on a single site whose users' demands have not changed much over time, but it does make you wonder how other sites have changed in 13 years and how they are making use of their extra power.