Originally, IT departments provided services to users by building and maintaining datacenters containing servers and other equipment. One of the problems with this model is that the servers are often underutilized. To accommodate the increased workload of the “busy season,” organizations often build datacenters with resources that far exceed their everyday needs. Therefore, some of those expensive resources remain idle most of the time. Virtual machines (VMs), such as those administrators can create using hypervisor products like Microsoft Hyper-V and VMware ESX, solve this problem. Virtual machines allow the consolidation of multiple logical servers into one physical computer. Administrators can scale VMs by adding or subtracting virtualized resources, such as memory and storage, or they can move the virtual machines from one physical computer to another, as needed.
Cloud providers use this same consolidation technique to provide their subscribers with virtual machines. For example, when a subscriber to Microsoft Azure creates a new server, what actually happens is that the Azure interface creates a new virtual machine on one of Microsoft’s physical servers. The subscriber has no administrative access to the underlying physical computer hosting the VM, nor does the subscriber even know where the computer is physically located. The virtual machines on the physical server are also isolated from each other so that two subscribers who are the fiercest of competitors might have VMs running on the same host computer, and they would never know it. The provider can—and probably does—move VMs from one host computer to another when necessary to efficiently utilize the servers’ physical resources, but this process is completely invisible to the subscribers.
The end result of this consolidation model is that each VM can be given the virtual hardware resources it needs at any particular time. Subscribers pay only for the virtualized resources allocated to their VMs, not for idle physical hardware, so far less capacity goes to waste.
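The consolidation idea can be illustrated with a minimal sketch. It assumes a simple first-fit placement rule and tracks memory only; the `Host` class, the `place_vm` function, and the host and VM names are hypothetical, and the text does not describe how any real provider schedules VMs.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """A physical server, modeled only by its memory (hypothetical)."""
    name: str
    memory_gb: int
    vms: dict = field(default_factory=dict)  # VM name -> memory in GB

    def free_gb(self) -> int:
        return self.memory_gb - sum(self.vms.values())


def place_vm(hosts, vm_name, memory_gb):
    """First-fit: put the VM on the first host with enough free memory."""
    for host in hosts:
        if host.free_gb() >= memory_gb:
            host.vms[vm_name] = memory_gb
            return host.name
    raise RuntimeError(f"no host can fit {vm_name}")


hosts = [Host("host-a", 64), Host("host-b", 64)]
requests = [("web", 16), ("db", 32), ("batch", 24), ("cache", 8)]
placements = {name: place_vm(hosts, name, size) for name, size in requests}

for host in hosts:
    print(host.name, sorted(host.vms), f"{host.free_gb()} GB free")
```

Four logical servers end up on two physical hosts, and none of the VMs needs to know which host it runs on.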
Scalability
Business requirements change. They might increase or decrease over a matter of years, and they might also follow regular seasonal, monthly, weekly, or even daily activity cycles. A physical datacenter must be designed to support the peak activity level of those regular business cycles and also to anticipate an expected degree of growth over several years. As mentioned earlier, this can mean purchasing more equipment than the business needs during most of its operational time, leaving the excess capacity idle.
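The cost of sizing for the peak can be shown with a small worked calculation. The monthly demand figures and the 20 percent growth allowance below are invented for illustration only; they are not taken from the text or from any real datacenter.

```python
# Hypothetical number of servers needed in each month of one year,
# with a "busy season" in the final quarter.
monthly_demand = [40, 42, 45, 43, 41, 44, 46, 48, 50, 70, 95, 100]

# Assumed allowance for future growth, in percent.
growth_headroom_pct = 20

# A physical datacenter is sized for the peak month plus the growth allowance.
peak = max(monthly_demand)
provisioned = peak * (100 + growth_headroom_pct) // 100

# Average share of the purchased servers that is actually needed.
average_utilization = sum(monthly_demand) / (len(monthly_demand) * provisioned)

print(f"servers purchased: {provisioned}")                 # servers purchased: 120
print(f"average utilization: {average_utilization:.0%}")   # average utilization: 46%
```

Under these assumptions, more than half of the purchased capacity sits idle over the year, which is the underutilization the paragraph above describes.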
Cloud-based services avoid these periods of underutilization by being easily scalable. Because the hardware in a virtual machine is itself virtualized, an administrator can modify a VM’s resources through a simple configuration change. An on-premises (that is, noncloud) virtual machine is limited by the physical hardware in the computer hosting it and by the resources used by other VMs on the same host. For a cloud-based VM, however, these limitations largely do not apply from the subscriber’s point of view. The physical hardware resources are invisible to the cloud subscriber, so if the resources the subscriber wants to add to a VM are unavailable on its current host computer, the provider can invisibly move the VM to another host with sufficient resources.
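The resize-or-move behavior described above can be sketched as follows. This is a toy model under stated assumptions: only memory is tracked, the `resize_vm` function and the host layout are hypothetical, and a real provider's migration logic is not described in the text.

```python
def resize_vm(hosts, vm, new_gb):
    """Resize a VM in place if its host has room; otherwise move it.

    hosts maps a host name to {"capacity": GB, "vms": {vm name: GB}}.
    Returns the name of the host the VM ends up on.
    """
    def free(host, excluding=None):
        info = hosts[host]
        used = sum(gb for name, gb in info["vms"].items() if name != excluding)
        return info["capacity"] - used

    current = next(h for h, info in hosts.items() if vm in info["vms"])

    # Case 1: the current host can absorb the larger VM.
    if free(current, excluding=vm) >= new_gb:
        hosts[current]["vms"][vm] = new_gb
        return current

    # Case 2: move the VM to the first other host with enough free memory.
    for host in hosts:
        if host != current and free(host) >= new_gb:
            del hosts[current]["vms"][vm]
            hosts[host]["vms"][vm] = new_gb
            return host

    raise RuntimeError("no host has enough free memory")


hosts = {
    "host-a": {"capacity": 64, "vms": {"web": 16, "db": 40}},
    "host-b": {"capacity": 64, "vms": {"cache": 8}},
}
first = resize_vm(hosts, "web", 20)   # fits on host-a
second = resize_vm(hosts, "web", 48)  # does not fit, so the VM moves
print(first, second)                  # host-a host-b
```

From the subscriber's side, both calls look the same: the VM simply has more memory afterward.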
A cloud-based service is scalable in two ways, as shown in Figure 1-9:
- Vertical scaling: Also known as scaling up, vertical scaling is the addition or subtraction of virtual hardware resources in a VM, such as memory, storage, or CPUs. The scaling process is simply a matter of adjusting the VM’s parameters in a remote interface; scaling patterns can even be automated to accommodate regular business cycles. As a result, the subscriber pays only for the resources the VMs use at any given time.
- Horizontal scaling: Also known as scaling out, horizontal scaling is the addition or subtraction of virtual machines in a cluster of servers running a particular application. For example, in the case of a cloud-based web server farm, incoming user requests can be shared among multiple VMs. Administrators can add VMs to the cluster or remove them from it as web traffic increases or decreases.
FIGURE 1-9 Vertical scaling enhances the existing server, whereas horizontal scaling adds more servers.
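The two kinds of scaling can be contrasted in a short sketch. The `VM` class and both functions are hypothetical and model only CPU count and memory; they do not correspond to any cloud provider's API.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class VM:
    cpus: int
    memory_gb: int


def scale_vertically(vm, cpus=None, memory_gb=None):
    """Scale up or down: same VM, different virtual hardware."""
    return replace(
        vm,
        cpus=vm.cpus if cpus is None else cpus,
        memory_gb=vm.memory_gb if memory_gb is None else memory_gb,
    )


def scale_horizontally(cluster, template, target_count):
    """Scale out or in: same VM size, different number of VMs."""
    if target_count < 1:
        raise ValueError("a cluster needs at least one VM")
    if target_count >= len(cluster):
        return cluster + [template] * (target_count - len(cluster))
    return cluster[:target_count]


def total_memory(cluster):
    return sum(vm.memory_gb for vm in cluster)


small = VM(cpus=2, memory_gb=8)
cluster = [small, small]                      # 2 VMs, 16 GB in total

scaled_up = [scale_vertically(vm, memory_gb=16) for vm in cluster]
scaled_out = scale_horizontally(cluster, small, 4)

print(len(scaled_up), total_memory(scaled_up))    # 2 32
print(len(scaled_out), total_memory(scaled_out))  # 4 32
```

Both routes reach the same total memory here. Vertical scaling keeps two larger servers, whereas horizontal scaling spreads the load across four servers of the original size.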
For example, suppose an organization’s website experiences a sudden increase in traffic that overwhelms the web servers. In that case, network administrators can handle the problem in two ways: either add resources to the existing servers, such as additional memory or storage, or add more servers.
In an on-premises datacenter, both options entail a degree of delay and downtime, which can affect the decision of which scaling method to use. In a cloud-based environment, however, the decision is typically a matter of cost. Adding memory to a server, or even adding another server, is a task completed in minutes, not days or weeks.
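When the choice comes down to cost, it can be framed as a simple comparison. The sketch below assumes two hypothetical VM sizes with invented throughput figures and hourly prices (in cents); none of these numbers comes from the text or from a real price list.

```python
def cheapest_option(required_rps, options):
    """Pick the VM size that serves the load at the lowest hourly cost.

    options maps a size name to (requests per second per VM,
    hourly price per VM in cents). All figures are hypothetical.
    """
    plans = {}
    for name, (rps_per_vm, price_cents) in options.items():
        count = (required_rps + rps_per_vm - 1) // rps_per_vm  # round up
        plans[name] = (count, count * price_cents)
    best = min(plans, key=lambda name: plans[name][1])
    return best, plans


options = {
    "small": (500, 10),    # many small VMs: scaling out
    "large": (1200, 22),   # fewer, bigger VMs: scaling up
}

best_3000, plans_3000 = cheapest_option(3000, options)
best_2400, plans_2400 = cheapest_option(2400, options)

print(best_3000, plans_3000)  # small {'small': (6, 60), 'large': (3, 66)}
print(best_2400, plans_2400)  # large {'small': (5, 50), 'large': (2, 44)}
```

With these assumed figures, the cheaper approach changes with the load, which is why the decision in a cloud environment tends to be an economic one rather than a logistical one.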