Improve ROI and cut costs by up to 91% with GCP Spot VMs

Introduction

GCP Spot VMs are low-cost instances suited to fault-tolerant workloads. Their pricing is stable, changing no more than once a month, which helps optimize cloud bills, and they come with guaranteed discounts of at least 60% compared to on-demand instances.

Spot VMs offer the same options, machine types and performance as regular compute instances. They run on unused/spare Compute Engine capacity, which is why they cost significantly less than on-demand VMs. Although this is more economical, there is always the risk that a Spot VM is pre-empted when Compute Engine needs to recover that capacity and allocate it to other VMs. There is therefore no guarantee that Compute Engine capacity will be readily available at any given point in time.

Because of their temporary nature, these instances are most suitable for batch processing jobs and other fault-tolerant applications that can withstand interruptions. For example, if one VM in a batch job stops, the job slows down but does not stop completely. It still completes the batch processing without putting extra load on existing VMs, and you avoid paying full price for additional standard VMs.
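To make the idea of a job that "slows down but does not stop" concrete, here is a minimal sketch of a batch job that checkpoints its progress and resumes after an interruption. The function name `process_items`, the checkpoint file format, and the `handle` callback are all assumptions made for illustration; on a real Spot VM the checkpoint would need to live on storage that survives the VM, such as a persistent disk or a bucket.

```python
import json
import os


def process_items(items, checkpoint_path, handle):
    """Process items in order, recording progress so a restart can resume.

    Returns the number of items processed during this run.
    """
    # Resume from the index stored by a previous (possibly interrupted) run.
    start = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            start = json.load(f)["next_index"]

    for i in range(start, len(items)):
        handle(items[i])
        # Write to a temporary file and rename it, so an interruption in the
        # middle of the write never leaves a corrupt checkpoint behind.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump({"next_index": i + 1}, f)
        os.replace(tmp_path, checkpoint_path)

    return len(items) - start
```

If the VM is pre-empted between two items, the next run picks up from the last recorded index instead of starting over, so at most one item is repeated.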

Let us try to understand how it works.

Spot VMs – The next generation of Pre-emptible VMs

Compute Engine sends a pre-emption notice to a Spot VM whenever it needs to reclaim the capacity for other tasks. This is the window in which you can perform clean-up actions before the VM stops, so plan for it in advance. If Compute Engine stops the VM, its state is set to TERMINATED; if the VM is configured to be deleted on pre-emption, it is deleted instead. Pre-emption can only happen while a Spot VM is in the RUNNING state.

If Compute Engine pre-empts a Spot VM within a minute of its creation, you are not billed for that VM at all.
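One way to act on the pre-emption notice is to poll the Compute Engine metadata server from inside the VM and run clean-up as soon as it reports that the instance has been pre-empted. The sketch below assumes the code runs on a Compute Engine VM; `run_until_preempted`, `do_work`, `cleanup` and `max_steps` are hypothetical names chosen for this example, and the `fetch` parameter exists only so the loop can be exercised without a metadata server.

```python
import urllib.request

# Metadata server path that reports whether this instance has been pre-empted.
PREEMPTED_URL = (
    "http://metadata.google.internal/computeMetadata/v1/instance/preempted"
)


def fetch_preempted_flag(timeout=2.0):
    """Return the raw metadata value ("TRUE" or "FALSE").

    Only works from inside a Compute Engine VM.
    """
    request = urllib.request.Request(
        PREEMPTED_URL, headers={"Metadata-Flavor": "Google"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


def run_until_preempted(do_work, cleanup, fetch=fetch_preempted_flag,
                        max_steps=None):
    """Call do_work() repeatedly; call cleanup() once pre-emption is reported.

    Returns the number of work units completed.
    """
    steps = 0
    while max_steps is None or steps < max_steps:
        if fetch() == "TRUE":
            cleanup()  # e.g. flush buffers, save a checkpoint
            return steps
        do_work()
        steps += 1
    return steps
```

Each unit of work should be short compared to the notice period, otherwise the check may come too late. A shutdown script attached to the VM is an alternative when polling does not fit the workload.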

How are Spot VMs better than Preemptible VMs?

A Preemptible VM (PVM) is a Google Compute Engine VM instance that can be purchased at a discounted price if the customer accepts that it will terminate after 24 hours. Preemptible VMs can only run for up to 24 hours at a time, whereas Spot VMs have no limit on maximum runtime.

Are spot Instances risky investments? – Decide between availability and cost savings

By now, we have all understood that nothing comes for free. Although spot instances are cost-efficient, they carry risks: Compute Engine capacity may not be available, and running instances may be interrupted. The available capacity of spot instances depends on factors such as region and time of day, which are constantly changing.

The availability of spot instances follows supply and demand and can therefore be unpredictable. For example, suppose you plan to use the most popular instance type and demand for that type suddenly spikes. It is just like a Black Friday sale: the spare capacity can run out quickly.

When Spot VMs are used as nodes in Google Kubernetes Engine (GKE), GKE manages them on your behalf, but their availability is still not guaranteed. Compute Engine can reclaim one or even all of the Spot VMs at any time, and there is no way of knowing when the capacity will become available again.

This means that opting for cost-effective Spot VMs often makes you compromise on resource availability, but the good news is that these resources are usually unavailable only for a short span of time.

GCP spot VM instances vis-a-vis AWS instances and Azure VMs

As an analogy, suppose you are working on your laptop and you are offered at least a 60% reduction on your electricity bill, on the condition that the power can be cut at any time for just 30 seconds. It is the same with GCP spot instances, which bring only short-term interruptions, and only when Compute Engine needs those resources elsewhere. The period for which instances are unavailable is often so short that you do not notice the disconnection or feel the need to find a replacement for your instances. Tech companies like Salesforce and Autodesk use spot instances.

Cloud Service Provider-wise comparison of Spot Instances/VMs:

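|                  | AWS                                   | Azure                   | GCP                                             |
|------------------|---------------------------------------|-------------------------|-------------------------------------------------|
| Pricing          | Variable (based on demand)            | Fixed query pricing     | Fixed                                           |
| Limitations      | Only 20 spot instances per AWS region | No region-based support | Can be stopped at any time due to system events |
| Pre-emption time | 2 minutes                             | 30 seconds              | 30 seconds                                      |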

Scenarios where spot instances work the best

  1. Batch processing jobs
  2. Containers and Microservices
  3. CI/CD operations
  4. Orchestrated environment
  5. Distributed databases
  6. High-performance computing

Limitations

Compute Engine can pre-empt Spot VMs at any point in time when it needs the capacity for other tasks. Although the probability of this happening is generally low, consider the following factors when using spot instances:

  1. These instances should not be used if your system can’t afford even a few seconds of interruptions.
  2. These interruptions can vary from day to day and region to region.
  3. These are limited compute engine resources, so it is a possibility that they are not always available.

Better ROI or not – How does pricing work?

Some spot markets are bid-based: you request a spot instance by setting the maximum price you are willing to pay, and the instance runs only while that price is above the current spot price and capacity is free. GCP Spot VMs do not work this way. There is no bidding; Google sets the Spot price, which, as noted in the introduction, changes no more than once a month, and a Spot VM starts whenever Compute Engine has spare capacity.

The gamble with GCP is therefore about availability rather than price. Paying more does not buy you fewer interruptions, and the resources may be unavailable just when you need them. Having said that, Spot VMs can reduce cost by 60 to 91% when compared to on-demand instances.
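To see what the 60 to 91% range means for a bill, here is a small worked example. The on-demand rate of $1.00 per hour and the 720-hour month are made-up figures chosen only to keep the arithmetic easy to follow; real prices depend on machine type and region.

```python
def spot_cost(on_demand_hourly, hours, discount):
    """Cost of running on a Spot VM, given a discount as a fraction (0.60 = 60%)."""
    if not 0 <= discount < 1:
        raise ValueError("discount must be in the range [0, 1)")
    return on_demand_hourly * hours * (1 - discount)


if __name__ == "__main__":
    on_demand_hourly = 1.00  # hypothetical on-demand price in USD per hour
    hours = 720              # one VM running for a 30-day month

    full_price = on_demand_hourly * hours
    print(f"On-demand:         ${full_price:.2f}")
    for discount in (0.60, 0.91):
        cost = spot_cost(on_demand_hourly, hours, discount)
        print(f"Spot at {discount:.0%} off:   ${cost:.2f}")
```

With these assumed figures, the monthly bill for one VM falls from $720.00 to $288.00 at the minimum discount and to $64.80 at the maximum. This does not account for any work that has to be repeated after a pre-emption.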

Whether a spot instance translates into a better ROI is really defined by the workload that needs to run on those VMs. If the workload can tolerate interruptions, unpredictability, and unavailability, then it is definitely a win. In scenarios like the batch processing jobs and CI/CD operations mentioned earlier, spot instances can save real cost compared to the other available options. But if the workload cannot tolerate these downsides, interruptions will have a direct impact on the business and hence cause a loss in revenue.

Best practices for using spot instances

  1. To keep a plan B handy in case Spot VMs are unavailable, use a combination of Spot VMs and standard Compute Engine VMs.
  2. The IP addresses used by Spot VMs might change after the VMs are recreated, so workloads should not depend on a node keeping the same address or name.
  3. Use node taints and tolerations to make sure that critical pods are not scheduled onto node pools that use Spot VMs.
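The last practice can be sketched as follows. The example builds a pod manifest, as a plain Python dictionary, that opts in to Spot nodes through a node selector and a toleration. It assumes that Spot nodes carry the label `cloud.google.com/gke-spot: "true"` and that you have added a matching `NoSchedule` taint to the Spot node pool yourself; the helper names `spot_pod_spec` and `tolerates` are hypothetical, and `tolerates` is a simplified model of the scheduler rule that only handles exact matches.

```python
SPOT_KEY = "cloud.google.com/gke-spot"

# Taint assumed to have been applied to the Spot node pool by the operator.
SPOT_TAINT = {"key": SPOT_KEY, "value": "true", "effect": "NoSchedule"}


def spot_pod_spec(name, image):
    """Build a pod manifest that is allowed to run on tainted Spot nodes."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Ask for Spot nodes explicitly.
            "nodeSelector": {SPOT_KEY: "true"},
            # Tolerate the taint that keeps other pods off those nodes.
            "tolerations": [
                {
                    "key": SPOT_KEY,
                    "operator": "Equal",
                    "value": "true",
                    "effect": "NoSchedule",
                }
            ],
            "containers": [{"name": name, "image": image}],
        },
    }


def tolerates(pod, taint):
    """Simplified check: does the pod carry an exact-match toleration?"""
    for toleration in pod["spec"].get("tolerations", []):
        if (
            toleration.get("key") == taint["key"]
            and toleration.get("value") == taint["value"]
            and toleration.get("effect") == taint["effect"]
        ):
            return True
    return False
```

A critical pod is simply one created without this toleration: under the assumed taint, it cannot be scheduled onto the Spot node pool and stays on standard nodes.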

Conclusion

GCP Spot VMs are a great way to reduce cloud bills but can be challenging to manage at the same time. In this blog, we saw how they are an effective means of saving cost and how they come with limitations of their own. Tools like CloudEnsure currently offer recommendations on EC2 spot instances for different cloud service providers. After analyzing the utilization trend of a cloud account, the tool tells you exactly which instances to purchase to save the most on cloud billing for multiple resources, based on regions.

Spot VMs are definitely a lucrative investment that may or may not deliver the expected monetary returns, depending on how much of the accompanying risk you are willing to take.
