Google Professional Data Engineer – Managed Instance Groups and Load Balancing part 1
- Managed and Unmanaged Instance Groups
In this section we’ll discuss managed instance groups and load balancing. A managed instance group is a pool of identical VMs, all generated from the same instance template, and this group of VMs can be scaled automatically to meet increasing traffic requirements. Load balancing involves distributing requests across instances so that requests go to those instances which have the capacity to handle them, which is why managed instance groups and load balancing typically go together.
The Google Cloud Platform offers a number of different ways of load balancing. Load balancing can be external or internal, that is, open to external traffic or serving only traffic within the cloud platform, and it can be global or regional. HTTPS load balancing is a global external load balancer which deals with traffic from the internet. We’ll study this in a lot of detail and see the basic components which make up the HTTPS load balancer: the target proxy, the URL map, the back end service and the back ends. We’ll also look at the other load balancers that Google offers. We’ll see use cases and architecture diagrams for HTTPS load balancing, SSL proxy and TCP proxy load balancing, and network and internal load balancing, and at the end we’ll look at how autoscaling works.
Instance groups, that is, groups of VMs on the Google Cloud Platform, can be of two types: managed and unmanaged. We’ll start off by taking a close look at managed instance groups, but let’s define instance groups first. An instance group is basically any group of machines which have been created and are managed together. A real world project or network typically contains a huge number of VMs; if you had to individually manage and configure them, that would be a major pain, which is why we have instance groups to manage these machines together. Of the two kinds of instance groups that the Google Cloud Platform supports, the first is the managed instance group, which is what we’ll study in detail here.
All the VMs that are present in this group are identical and have been created from the same image. The second is the unmanaged instance group. These are groups of dissimilar instances. The VMs are not identical, they haven’t been created from the same image. Managed instance groups tend to be far more interesting because they offer auto scaling and rolling updates. So let’s study that first in some detail. Managed instance groups have VMs which have been created using an instance template and this results in the creation of identical instances. The instance template points to an image which can be a custom image or a public image which is used to create identical VMs.
All VMs in this group can be managed together using this instance template. A change to the instance group applies to all instances in the group. An instance template defines a bunch of properties for a VM.
It defines the machine type, the image that is used to create the instance, the zone where the instance is going to live, and a number of other properties. An instance template is basically a way for you to save the instance configuration in order to use it later to create new instances or groups of instances. An instance template is a global resource. It’s not bound to a specific zone nor to a specific region. That means once you create an instance template you can use it to create VMs anywhere in the world. The instance template can reference zonal resources though, so you can specify that you want a persistent disk of a particular kind to be attached to your VM once it’s created.
A persistent disk is a zonal resource. Once you have an instance template which references a zonal resource such as a persistent disk, the instance template can then be used only within that zone. So if you try to instantiate a VM outside of that zone when the template references a zonal resource, that VM creation will fail. An instance template is used by a managed instance group in order to create instances which are identical to each other. This is what allows a managed instance group to automatically scale the number of instances in the group when the traffic flowing to the group increases. Managed instance groups are typically used along with load balancing: load balancing tries to distribute traffic across instances, and managed instance groups can scale to handle increased load.
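Returning to templates: as a rough sketch of what creating one looks like with the gcloud CLI. The template name, machine type, image family and network tag below are illustrative placeholders, not values from this course.

```
# Create a reusable (global) instance template.
# All names and machine/image choices here are illustrative.
gcloud compute instance-templates create web-template \
    --machine-type=e2-medium \
    --image-family=debian-11 \
    --image-project=debian-cloud \
    --tags=http-server
```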
The instances within a managed instance group are auto healing. If an instance stops, crashes or is deleted, the group automatically recreates the instance with the same template, so the number of instances within the group remains constant if you want it to be so. This ability of a managed instance group to identify and recreate unhealthy instances is called auto healing. There are two categories of managed instance groups. The first is the zonal instance group: all the VMs which belong to this managed instance group are part of the same zone; they are not spread across zones. Managed instance groups can also be regional, meaning the VM instances which make up the group are spread across multiple zones.
Let’s compare zonal and regional managed instance groups, see how they stack up, and see when you would prefer to use one over the other. You would prefer regional instance groups to zonal ones so that the application load can be spread across multiple zones. Two zones within a region are completely isolated from each other; they have no common points of failure, which means that a regional managed instance group is much more reliable. Regional instance groups protect against failures within a single zone. There can be times when you would want to choose a zonal managed instance group: if you want lower latency in communication between your instances and want to avoid cross zone communication, that’s when you’ll prefer a zonal managed instance group.
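To make the two categories concrete, here is a hedged sketch using the gcloud CLI; the group names, zone and region are hypothetical, and the template is the one sketched earlier.

```
# Zonal managed instance group: all VMs live in one zone.
gcloud compute instance-groups managed create web-mig-zonal \
    --template=web-template \
    --size=3 \
    --zone=us-central1-a

# Regional managed instance group: VMs spread across zones in the region.
gcloud compute instance-groups managed create web-mig-regional \
    --template=web-template \
    --size=3 \
    --region=us-central1
```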
Every managed instance group has a health check associated with it, which periodically checks the instances to see whether they are receiving traffic and whether they are healthy. The health check is used to monitor instances within the group. If one or more instances within a group become unhealthy, that is, they stop responding to requests, it’s possible that a service has failed on those instances. When they are within a managed instance group, these instances will be recreated; this process is called auto healing. The health check that we configure in a managed instance group is very similar to the health check that we’ll configure when we look at load balancing.
But the objective of the health check is different for an instance group. In a load balancer, health checks are used to determine how traffic needs to be distributed across the instances which are connected to the load balancer. LB health checks are used to determine where to send traffic. Traffic will be sent only to healthy instances. Health checks within a managed instance group are used to recreate instances when they become unhealthy. A group has a certain number of instances. If any of those instances become unhealthy, they are recreated with the same template. The health checks for the managed instance group and the load balancer are not replacements for each other.
In fact, both of these should be configured, since they serve different purposes; typically, we would configure the LB health checks as well as the instance group health checks. The auto healing feature of instance groups results in a new instance being recreated based on the template that was used to originally create the instance. This template might be different from the default instance template that exists now. When an instance is recreated, any data that was present on disk might be lost unless it was explicitly snapshotted and backed up. When you configure a health check for a managed instance group, there are a number of properties that you need to configure.
One of these is the check interval, that is, the time to wait between consecutive attempts to check instance health. You will also need to configure a timeout, that is, the length of time to wait for a response from the instance before declaring the check attempt failed. You will also configure a property called the healthy threshold. Typically, the health check works in this manner: it checks a number of times to see whether an instance is healthy, and if the number of consecutive healthy responses from the instance reaches the healthy threshold, the VM is marked as healthy. Similar to the healthy threshold, there is also the unhealthy threshold.
This threshold determines how many consecutive failed responses indicate that the VM is unhealthy.
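Putting these properties together, here is a hedged sketch of creating such a health check and attaching it to a managed instance group for auto healing; the names, port, thresholds and initial delay are all illustrative values.

```
# Illustrative HTTP health check: probe every 10s with a 5s timeout;
# 2 consecutive successes => healthy, 3 consecutive failures => unhealthy.
gcloud compute health-checks create http web-autoheal-hc \
    --port=80 \
    --check-interval=10s \
    --timeout=5s \
    --healthy-threshold=2 \
    --unhealthy-threshold=3

# Attach the health check to the group for auto healing; the initial delay
# gives newly created instances time to boot before checks count against them.
gcloud compute instance-groups managed set-autohealing web-mig-zonal \
    --health-check=web-autoheal-hc \
    --initial-delay=300 \
    --zone=us-central1-a
```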
The instance group that you’ll typically prefer to configure is the managed instance group; it is what is recommended because of its auto scaling and automatic update capabilities. But unmanaged instance groups exist and you might find occasion to use them, so let’s understand how they are set up. Unmanaged instance groups are groups of dissimilar instances that you can add to and remove from the group. The machine types need not be the same, and the image that is configured on each VM need not be the same. These are just VMs that form a logical unit for some reason that you have; they need not be similar in any way because they are a disparate group of instances. Unmanaged instance groups cannot offer auto scaling, and they do not offer rolling updates or instance templates. It’s typically not recommended that you use unmanaged instance groups; you’ll use them only when you need to apply load balancing to preexisting configurations. Load balancing can be used with unmanaged instance groups as well: if you have a preexisting configuration with disparate instances, you can apply load balancing to those instances using unmanaged instance groups.
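As a brief sketch of that setup (the group and instance names are hypothetical), an unmanaged instance group is created empty and preexisting VMs are then added to it:

```
# Create an empty unmanaged instance group, then add preexisting,
# possibly dissimilar VMs so a load balancer can target them together.
gcloud compute instance-groups unmanaged create legacy-group \
    --zone=us-central1-a
gcloud compute instance-groups unmanaged add-instances legacy-group \
    --instances=legacy-vm-1,legacy-vm-2 \
    --zone=us-central1-a
```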
- Types of Load Balancing
Anytime you set up a product or an application that receives user requests or any other requests, basically traffic of any kind, a load balancer is a must. A load balancer allows you to distribute traffic across multiple instances so that your traffic is always directed towards an instance which has the capacity to give you a response. The load balancer that you see depicted on screen is an external load balancer because it receives traffic directly from users. You can also have internal load balancers which receive traffic from other instances on your network. The load balancer is where you really see the advantage of using managed services on a cloud platform.
It provides load balancing and auto scaling for groups of instances, so the number of instances being used to serve your traffic grows as your traffic grows; you can scale your application to support very heavy traffic. That is when you scale up. If you’re experiencing a light period, your instances can be scaled down so that you don’t have to pay for resources that you don’t use. Because the load balancer is typically used in conjunction with a managed instance group, it has the ability to detect and remove unhealthy VMs; healthy replacement VMs are recreated and automatically re-added to the load balancer. The load balancer will check the instances to which it is connected and route the traffic to the VM that is closest to the user.
This VM should have sufficient capacity to handle the incoming traffic and provide a response. The load balancer on the Google Cloud Platform is a fully managed service; there is no physical device that you need to configure, and it’s redundant and highly available. There are several different ways to categorize the load balancers that are offered by the Google Cloud Platform. Let’s look at this in a hierarchical form so we understand where the different kinds of load balancers sit and what their characteristics are. One categorization of load balancers is into external and internal load balancers. External load balancers deal with traffic from the internet; they receive traffic from the outside world. Internal load balancers only deal with traffic from within their network.
On the Google Cloud Platform, external load balancers once again come in two flavors. They can be global load balancers, which means the VM instances to which the load balancer distributes traffic can be across multiple regions; they can exist anywhere in the world. External load balancers can also be regional load balancers, where the VM instances connected to the load balancer are within a single region. They may be spread across multiple zones in the region, but they are within one region. The only kind of internal load balancer that Google offers is a regional load balancer; here the instances are within one region. Google does not offer a global internal load balancer. There are three different kinds of external global load balancers.
There is the TCP proxy, the SSL proxy and the HTTPS load balancer. All of these deal with external traffic and they are global, in that their VM instances can be spread across multiple regions. The one regional load balancer that deals with external internet traffic is the network load balancer. If you pay close attention to the kinds of load balancers that Google offers (HTTP(S), SSL, TCP, network), these should sound very familiar and should immediately ring a bell with you. This is because they correspond to layers in the OSI network stack: the application layer corresponds to HTTP or HTTPS traffic, the session layer to SSL traffic, the transport layer to TCP traffic, and the network layer to network traffic.
Let’s say you were in a situation where you are completely free to choose the kind of load balancer that you want to use. The general rule of thumb is that the load balancer should operate at the highest layer possible; that is, if you can have load balancing occur at the HTTP(S) level, that is the one you should choose. The higher the layer, the more information and intelligence the load balancer has, and the better the load balancing will be. You can imagine that the application layer encapsulates the information in the lower layers of the OSI network stack, which is why you would prefer application layer load balancing wherever possible. That was for load balancing.
Let’s take a look at the health checks that we can configure for the different kinds of load balancing. We start off by looking at HTTP and HTTPS health checks. These are the highest fidelity health checks because they don’t just verify that the instances are healthy, they actually verify whether the web server on those instances is up and whether it is serving traffic. In case your traffic is not HTTP traffic but is encrypted via SSL, that’s when you would choose SSL health checks. You go further down the layers: if your traffic is not HTTP(S) or SSL traffic, then you would configure a TCP health check. Once again, as in the case of load balancing, you should prefer health checks at the highest possible OSI layer.
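As a hedged illustration, each of these protocols has its own health check type in the gcloud CLI; the names and ports below are placeholders:

```
# Highest fidelity: verifies the web server itself responds to requests.
gcloud compute health-checks create http lb-http-hc --port=80

# For SSL-encrypted, non-HTTP traffic.
gcloud compute health-checks create ssl lb-ssl-hc --port=443

# Lowest level: only verifies a TCP connection can be established.
gcloud compute health-checks create tcp lb-tcp-hc --port=443
```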
- Overview of HTTP(S) Load Balancing
We’ve learned in earlier lectures that load balancers on the Google Cloud Platform can be divided into a number of different categories. We start off by looking at the very first of these, an external global load balancer: the HTTP(S) load balancer. This is the load balancer that you’ll typically configure for client requests or web traffic for your production application. It’s an external load balancer, so it typically serves traffic which comes from the internet, from your customers and your clients. It’s a global load balancer, which means it can distribute traffic across instances located in regions across the world.
And if you think about load balancing as it maps to the OSI network layer stack, you can see HTTP(S) is at the application layer; it is at the highest level. That means when it’s routing packets it has the most information possible about how and where the packet has to be routed. HTTP(S) load balancing is the smartest of all load balancers. Here is a basic block diagram of all the components which make up HTTP(S) load balancing on the Google Cloud Platform. It’s a global external load balancing service: external because it serves traffic from the internet, and global because it distributes the traffic across instances around the world. Another phrase to use is that it’s a multi-region load balancer.
This application level load balancer is the smartest of all load balancers that GCP offers; it has the most information by which it can make its load balancing decisions. It can distribute traffic amongst groups of instances based on proximity to the user. That is, the load balancer will try to send traffic to those VM instances which are located closest to where the user comes from: if the user is in Asia, the load balancer will try to find a VM instance closest to Asia, maybe in Asia itself. This load balancer is also capable of distributing traffic based on the requested URL. So if the requested URL is for static content, it can send traffic to one group of instances.
If the requested URL is for, say, the home page of an ecommerce site, then it will send it to a different group of instances, and maybe product pages of the ecommerce site are hosted on yet another group of instances, so it can split the traffic based on the requested URL. This load balancer is also capable of distributing traffic taking both of these rules into account together: it will find the group of instances that responds to the requested URL and then find the VM amongst that group that is the closest to the user. We’ll start off by looking at an overview of how traffic flows in from the internet, passes through the load balancer and is split across the instances which are connected to it.
We’ll then look at each of these components in detail. Traffic which comes in from the internet is first sent to a global forwarding rule. A forwarding rule is nothing but a rule that you can configure on your cloud platform which points to the proxy where traffic should be directed when it’s received. For an HTTP(S) load balancer, the global forwarding rule points to a target proxy, an HTTP(S) proxy, and it is this proxy which decides where this particular traffic is going to be sent. The target proxy, in order to make the decision of where to direct the incoming traffic, uses something called a URL map.
The URL map is basically a mapping that you can configure which maps the URL host name and the path to specific back end services which are capable of responding to the requests. Traffic will be directed only to those instances that are capable of responding to requests for that URL. The back end service is split up into one or more backends and every back end is made up of one or more instance groups. These instance groups can be managed or unmanaged. Every back end is capable of responding to a particular kind of request, so the back end service will direct each request to the appropriate back end based on a number of different factors.
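The front-end chain described above might be wired up roughly as follows; all resource names are hypothetical, and web-backend-service and static-backend-service stand in for back end services (a sketch of creating one appears a little further on):

```
# URL map: routes requests to backend services based on host and path.
gcloud compute url-maps create web-map \
    --default-service=web-backend-service

# Illustrative path rule: send /static/* to a separate backend service.
gcloud compute url-maps add-path-matcher web-map \
    --path-matcher-name=pm-static \
    --new-hosts='*' \
    --default-service=web-backend-service \
    --path-rules="/static/*=static-backend-service"

# Target proxy: consults the URL map for each incoming request.
gcloud compute target-http-proxies create web-proxy \
    --url-map=web-map

# Global forwarding rule: the entry point that hands traffic to the proxy.
gcloud compute forwarding-rules create web-rule \
    --global \
    --target-http-proxy=web-proxy \
    --ports=80
```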
These factors are serving capacity (how much capacity is available on the back end to service this request?), instance health (are there enough healthy instances in this back end to service this request?) and finally proximity to the user (the back end service will try to direct the traffic to the zone that is closest to the user). Every back end service is configured with its own health check, which determines whether the instances are healthy enough to receive traffic from the load balancer. This can be an HTTP health check or an HTTPS health check; in the case of an HTTPS health check, the request is encrypted. Every back end is made up of one or more instance groups and the distribution of requests across these instance groups can happen based on a number of factors.
The load balancer will check the capacity of the individual back ends, and this capacity can be determined using either CPU utilization or requests per second per instance. CPU utilization is a way to measure the processing that occurs on these back ends. If the CPU utilization for a particular back end is very high, it means that it’s heavily loaded in terms of processing capability, and the HTTP(S) load balancer might choose to send the traffic to another back end. Requests per second per instance is a measure of the traffic that is flowing into that back end; if this goes beyond a threshold, then the load balancer will try to distribute traffic to another back end. Back ends are made up of instance groups, and these instance groups can be either managed instance groups or unmanaged instance groups.
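A hedged sketch of configuring a back end service and adding a back end with each balancing mode; the names reuse the earlier illustrative health check and instance group:

```
# Back end service with an HTTP health check.
gcloud compute backend-services create web-backend-service \
    --protocol=HTTP \
    --health-checks=lb-http-hc \
    --global

# Add a managed instance group as a back end, capped by CPU utilization.
gcloud compute backend-services add-backend web-backend-service \
    --instance-group=web-mig-zonal \
    --instance-group-zone=us-central1-a \
    --balancing-mode=UTILIZATION \
    --max-utilization=0.8 \
    --global

# Alternatively, cap by requests per second per instance:
#   --balancing-mode=RATE --max-rate-per-instance=100
```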
Here is where it is advantageous to have managed instance groups. If you remember, in earlier lectures we’d spoken about the fact that managed instance groups have the ability to scale: if the traffic that is directed to them increases, they can automatically create instances from the same instance template and basically expand to meet the incoming traffic. The autoscaler of the managed instance group has to be explicitly configured to scale as the traffic scales, and this scaling is based on the parameters of CPU utilization or requests per second per instance.
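A minimal sketch of enabling the autoscaler on the earlier illustrative group; the replica limits, CPU target and cool-down period are placeholder values:

```
# Scale between 3 and 10 instances, targeting 60% average CPU utilization.
gcloud compute instance-groups managed set-autoscaling web-mig-zonal \
    --min-num-replicas=3 \
    --max-num-replicas=10 \
    --target-cpu-utilization=0.6 \
    --cool-down-period=90 \
    --zone=us-central1-a
```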
If this load balancing is for encrypted traffic, that is, it is an HTTPS load balancer, then the target proxy needs to have a signed certificate in order to terminate the SSL connection. So the external encrypted connection which comes into the load balancer is terminated at the target proxy, and the target proxy establishes new connections with the back ends in order to serve traffic. The connections that the target proxy establishes with its back ends can once again be SSL or non-SSL connections; if they are SSL connections, then the VM instances receiving this traffic should also have SSL certificates installed. All of the VM instances that receive traffic from the load balancer are grouped together into instance groups. These instance groups make up back ends, and these back ends are serviced by the back end service, which receives traffic from the target proxy based on the URL map.
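As an illustrative sketch, terminating SSL at the proxy involves uploading a certificate and using an HTTPS target proxy; the certificate files and names here are hypothetical:

```
# Upload a signed certificate and key, then front the same URL map
# with an HTTPS target proxy that terminates SSL.
gcloud compute ssl-certificates create web-cert \
    --certificate=cert.pem \
    --private-key=key.pem
gcloud compute target-https-proxies create web-https-proxy \
    --url-map=web-map \
    --ssl-certificates=web-cert
```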
These VM instances are part of a network, and they are governed by the routes and the firewall rules that are configured for that network. So in order for the load balancer traffic to reach these VMs, we have to explicitly configure firewall rules to enable this. The same is true for health check traffic: firewall rules have to be configured to allow traffic from both the load balancer and the health checker to reach the VM instances that make up this load balancer’s back ends. This is very important because health checks can fail if the firewall rules are not properly configured.
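A sketch of such a firewall rule follows; the network, port and target tag are illustrative, while 130.211.0.0/22 and 35.191.0.0/16 are Google’s documented health check and load balancer source ranges:

```
# Allow health checker and load balancer traffic to reach the backends.
# The source ranges are Google's published ranges for these probes.
gcloud compute firewall-rules create allow-lb-health-checks \
    --network=default \
    --direction=INGRESS \
    --action=allow \
    --rules=tcp:80 \
    --source-ranges=130.211.0.0/22,35.191.0.0/16 \
    --target-tags=http-server
```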
The back end service also tries to establish session affinity for client connections. Session affinity tries to ensure that requests which come in from the same client are always routed to the same VM instance. Session affinity can be established in two ways. Using the client IP address, the IP address, port and protocol together determine which instance the connection will be established with. Session affinity can also be established using a cookie: with the response to the first request, the load balancer sends a cookie to the client browser, and this cookie is then used in subsequent requests to route the traffic to the same VM instance.
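A brief sketch of enabling cookie-based session affinity on the earlier illustrative back end service:

```
# Use a generated cookie to pin a client to the same backend instance.
# CLIENT_IP is the alternative, IP-based affinity setting.
gcloud compute backend-services update web-backend-service \
    --session-affinity=GENERATED_COOKIE \
    --global
```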