This post was originally going to marvel at how StackExchange runs on only 25 servers (and could probably get by with 5), and to wonder why nobody else seems able to do the same. But the more I thought about it, the less convinced I was of that premise. With all the advances in cloud-provided servers, I’m less and less convinced that anyone needs to run their own servers exclusively in a physical datacenter.
There’s some business sense to running your own servers in a physical datacenter, but that generally only works out once you’d otherwise be looking at large AWS instances running pretty much constantly. That doesn’t seem to be the case for most people. In fact, most of the analyses I’ve seen on the matter assume copying a company’s entire physical infrastructure over to IaaS, which misses the entire point of using cloud providers like Amazon or Google. The biggest advantage of services like Google Cloud Platform or Amazon Web Services is that you can run your application on a minimum of hardware and add capacity only on an as-needed basis. A physical datacenter requires you to have enough hardware to handle your heaviest loads at all times; a well-tuned application running on IaaS requires only enough machines to handle the current load.
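To make that arithmetic concrete, here’s a toy cost sketch; every price and load number is invented purely for illustration. Physical hardware bills you for the peak around the clock, while pay-as-you-go billing charges only for the capacity each hour actually needs.

```python
# Hypothetical cost comparison: peak-provisioned physical capacity vs.
# pay-as-you-go cloud capacity. All numbers are made up for illustration.

HOURLY_COST_PER_SERVER = 0.10  # assumed price; same unit cost for both models

# Assumed 24-hour load profile: servers actually needed each hour.
hourly_demand = [2] * 8 + [6] * 4 + [10] * 4 + [6] * 4 + [2] * 4

# Physical: you must provision for the peak, all day, every day.
physical_servers = max(hourly_demand)
physical_cost = physical_servers * len(hourly_demand) * HOURLY_COST_PER_SERVER

# Cloud: you pay only for what each hour actually needs.
cloud_cost = sum(hourly_demand) * HOURLY_COST_PER_SERVER

print(f"physical: ${physical_cost:.2f}/day, cloud: ${cloud_cost:.2f}/day")
```

Under these made-up numbers the peak-provisioned fleet costs more than twice as much per day, even though both models use identical hardware at identical unit prices; the only difference is idle capacity.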
Managing software and services is already painful enough; why add hardware into the mix? With all the progress people have made with tools like Chef, Puppet, and Ansible, you should be able to set up and deploy your entire stack via a script anyway, so what does physical control of the server get you? At this point, it only gets you anything if you have, and need, a lot of hardware engineering talent. And the companies that need that kind of talent are generally the ones renting out space on their servers.
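As a rough sketch of what those tools buy you, here’s the desired-state idea behind Chef, Puppet, and Ansible in miniature (the package names and states are purely illustrative, not any tool’s real API): declare how the machine should look, compute only the actions needed to get there, and running the script a second time changes nothing.

```python
# Minimal sketch of desired-state configuration management. Declaring the
# end state (rather than a sequence of commands) makes runs idempotent:
# converging an already-correct machine produces no actions.
# Package names and states below are purely illustrative.

desired = {"nginx": "installed", "postgresql": "installed", "telnetd": "absent"}

def converge(current: dict) -> list:
    """Return the actions needed to move `current` to the desired state."""
    actions = []
    for pkg, state in desired.items():
        if state == "installed" and current.get(pkg) != "installed":
            actions.append(f"install {pkg}")
            current[pkg] = "installed"
        elif state == "absent" and current.get(pkg) == "installed":
            actions.append(f"remove {pkg}")
            current.pop(pkg)
    return actions

machine = {"telnetd": "installed"}
print(converge(machine))  # first run does the work
print(converge(machine))  # second run is a no-op
```

That idempotence is the reason a scripted stack beats hand-configured hardware: rebuilding a server is just re-running the script against a blank machine.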
Even if you do have a steady, heavy load that requires large, constantly running servers, there’s no reason you shouldn’t be ready to fall back on cloud-based infrastructure during higher-than-anticipated peak loads. It’s also a good fallback when, say, a natural disaster takes your physical datacenter offline.
Regardless of whether you’re using physical or virtual hardware, we’ve pretty much reached a point where we operate with an utter disregard for servers. I’m not totally sure whether that’s a good thing. On the one hand, it’s great not to be limited by hardware; on the other, we’re writing inefficient software and papering over poor design and practices by throwing hardware at the problem. Then we end up having to build, maintain, and run huge server farms just to save money.
Ideally, the move towards IaaS, with a billing model that charges for how much computing you’re actually doing, would lead us to rethink how we design our software to make it as efficient as possible, not just as fast as possible (because it’s easier and easier to spin up a big server). Sure, hardware is cheap, but good programming saves you more in the long run: money, time, and the ability to scale much more efficiently (out or up).
Here’s an example of what I’m talking about. I’m working on an application that, from a user’s perspective, will have about 4-5 components. Under the hood, it will end up having 8-10 components, each broken out and designed to run on multiple machines or clusters. This means that if any component sees heavy traffic or processing requirements, we can spin up a small (let’s be honest, it’ll probably be a micro) server to handle the extra load, and then kill it as soon as traffic dies down. Because of how the code’s been broken up, only the modules that need to scale do, and only for as long as we need the extra processing. That means when scheduled jobs need a little extra power, we can create a few more job machines without adding hardware to endpoints or to servers running other jobs. If an endpoint is getting hit hard, we can spin up servers to handle just that endpoint until traffic dies down.
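A back-of-the-envelope version of that per-component scaling might look like the following sketch, where the per-instance capacity and the load snapshot are hypothetical numbers, not measurements from the actual application:

```python
# Hypothetical per-component scaling decision. Each component sizes its own
# fleet from its own load, so a spike on one endpoint never forces extra
# capacity anywhere else.
import math

REQUESTS_PER_INSTANCE = 100  # assumed capacity of one small/micro instance

def instances_needed(load: int, minimum: int = 1) -> int:
    """Smallest instance count that can absorb `load` requests/sec."""
    return max(minimum, math.ceil(load / REQUESTS_PER_INSTANCE))

# Assumed load snapshot: only the busy endpoint grows.
load_by_component = {"api": 950, "jobs": 40, "uploads": 120}
plan = {name: instances_needed(load) for name, load in load_by_component.items()}
print(plan)  # {'api': 10, 'jobs': 1, 'uploads': 2}
```

The point of the sketch is the independence: a monolith would need ten instances of everything to survive the `api` spike, while the broken-out design pays for ten instances of one module and one or two of the rest.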
Here’s the biggest problem with physical servers: sizing them to fit your needs is a slow and painful process. If you need more hardware, you have to order the server, wait for it to arrive, and install it before you can even move on to configuration and setup. And as a bonus for buying the server outright, you also get to keep it cool (don’t get me wrong, a server closet is marginally useful in a building with no heat whatsoever during the winter), pay 100% of the electricity and cooling costs, and, if you grow big enough, move those machines to a whole new building, all without making your users suffer through downtime.
With tools like Netflix’s open-source libraries and Amazon’s OpsWorks, automatically managing VMs is easier than ever, including eliminating unnecessary machines. With no comparable option for physical servers, it’s no wonder people think it’s expensive to switch over to IaaS solutions. If you really are a company that needs heavy processing power on a consistent basis, then yes, you should have a physical set of servers somewhere, but only big enough to handle your typical day-to-day load. For everything else, like overflow during higher-than-normal loads or services with infrequent or inconsistent usage, run it in the cloud on the smallest servers you can get away with. Tune your cluster to run on as few servers as absolutely needed, and then stop spending time managing instances.
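The scale-down half of that policy can be sketched in a few lines. The thresholds, fleet sizes, and baseline below are all assumptions for illustration, not a real autoscaler’s defaults: shed instances when utilization drops, but never below the floor sized for your day-to-day load.

```python
# Sketch of a scale-down policy: add capacity when overloaded, shed it when
# idle, and never drop below the baseline sized for typical daily load.
# All thresholds and counts are illustrative assumptions.

BASELINE = 3  # assumed minimum fleet for day-to-day load

def scale(current_instances: int, utilization: float) -> int:
    """Return the new instance count given current average utilization."""
    if utilization > 0.8:                                   # overloaded: grow
        return current_instances + 1
    if utilization < 0.4 and current_instances > BASELINE:  # idle: shrink
        return current_instances - 1
    return current_instances                                # otherwise hold

fleet = 8
for utilization in [0.9, 0.3, 0.2, 0.2, 0.2, 0.2, 0.2]:
    fleet = scale(fleet, utilization)
print(fleet)  # the fleet drains back down to the baseline of 3
```

Note the asymmetry: growing is unconditional, but shrinking stops at the baseline, which is exactly the “as few servers as absolutely needed” target described above.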
Managing physical machines is hard, adds a lot of complexity to your projects, and ultimately is a hard constraint. Adding capacity is a time-consuming and difficult process, compared to the few mouse clicks and keystrokes it takes in the cloud. But the ability to scale your capacity down is almost as important as your ability to scale it up. It’s the key to controlling the costs of running your software, which means you make more money from it. Reducing the capacity of your physical infrastructure means either letting expensive servers sit around unused, or getting rid of them (best of luck when you need to scale back up if you choose to go that route).
Relying on a physical datacenter turns your hardware capacity into a constant, even though your server needs are rarely, if ever (and let’s be honest, it’s closer to “never”), constant. The two biggest mistakes people make in evaluating these systems are 1) assuming they’d run the same capacity as their physical servers at all times, and 2) not taking advantage of the ability to scale down. Scaling down is as much the point of running in the cloud as scaling up, since that’s where the cost savings come into play. At the end of the day, physical servers are a hard constraint, one that you should be minimizing, not embracing.