Google Professional Data Engineer – VPCs and Interconnecting Networks part 3
- Routes
We’ve spoken about traffic moving through a network and how it requires routes to be configured on the network before packets can go from source to destination. We’ll now talk about routes. A route is nothing but a mapping: a mapping of an IP range to a destination. Routes are what tell the VPC network where to send packets destined for a particular IP address. Routes basically allow a packet to get from its source to its destination. Let’s say traffic emanates from a particular virtual machine and it has some packets destined for some destination IP address. Where do you send the packets? Look up the route and see where they have to go. For example, you may say that all traffic that is Internet bound, which is not within the network, goes to a proxy server first, and the proxy server then passes it out to the wider world.
When a network is first created on the Google Cloud Platform, there are two kinds of routes which are automatically added by default. The first default route applies to traffic that goes from within the VPC to the external world. All traffic which is bound for the external Internet is sent to the network’s default internet gateway, from where it is routed out to the external world. Traffic bound for the Internet is typically addressed using an external IP address. The second kind of route that is added by default facilitates communication within the virtual private cloud. We’ve already spoken about how different instances on a VPC can speak to each other using internal IP addresses. This is facilitated by these subnet routes, one for each subnet in the network.
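To make this concrete, here is a rough sketch of what those two kinds of default routes look like, written as Python dictionaries that mirror the shape of the Compute Engine REST API’s Route resource. The project name, route names, and subnet range shown here are illustrative placeholders, not the exact values GCP generates.

```python
# Sketch of the two kinds of routes a new network gets by default.
# Field names mirror the Compute Engine REST API Route resource;
# project, names and ranges are placeholders.

default_internet_route = {
    "name": "default-route-to-internet",
    "network": "projects/my-project/global/networks/default",
    "destRange": "0.0.0.0/0",   # anything not matched by a more specific route
    "nextHopGateway": "projects/my-project/global/gateways/default-internet-gateway",
    "priority": 1000,
}

# One route like this exists per subnet, so instances can reach each other
# on internal IP addresses; the next hop is the VPC network itself.
subnet_route_us_central1 = {
    "name": "default-route-subnet-us-central1",
    "network": "projects/my-project/global/networks/default",
    "destRange": "10.128.0.0/20",   # the subnet's internal CIDR range
    "nextHopNetwork": "projects/my-project/global/networks/default",
}
```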
Instances on a VPC can send packets directly to each other using internal IP addresses. Routes are a necessary but not a sufficient condition for a packet to get to its destination. Just because a route exists to a destination does not mean that the packet will actually get there. The packet has to be explicitly allowed to go there, and this is done using firewall rules. Firewall rules have to be explicitly configured to enable packets to move through a network or to move outside a network on GCP. When you go to the page which allows you to create a network, that network will automatically set up the default route for Internet traffic. It will tell you where Internet-bound traffic is sent so that it can go out into the external world.
It will also set up the second kind of default route: one route for every subnet that is created. This route to a subnet basically ensures that traffic from the rest of the network can reach that particular subnet. When you actually configure a route, there are multiple components which make it up. The first is a user-friendly name that is used to identify the route. The second is the network: the name of the network to which this route applies, since routes are on a per-network basis. Next, when you configure a route, you will specify the destination range, the destination IP addresses that this route applies to. Now, you might say that a particular route applies to all instances which fall in the destination IP address range, or you might want to target specific instances.
This you can do by setting up instance tags. Remember, a route specifies a range of IP addresses as its destination. Let’s say there are five instances which fall within this range, but you want the route to apply to just one specific instance. You’ll set up a tag associated with that instance, and you can specify that tag within a route so that traffic is directed to just that VM. Every route also has a priority associated with it. This priority is used to break ties. So let’s say traffic is bound for some destination and there are multiple routes to that destination. The route with the highest priority is what will be chosen (in GCP, a lower numeric priority value means higher priority). In addition to all the properties that we just saw, a route has to specify exactly one next hop. It can specify a next hop instance, which is a fully qualified URL for the instance to which the packet has to be sent in order to reach the destination IP range.
Instead of a next hop instance, it can also specify a next hop IP address, a next hop network (a URL of the network which is the next hop on the way to the destination), a next hop gateway, or a next hop VPN tunnel. In addition to the properties that we saw on the previous slide, only one of these next hop properties needs to be specified. Routes are materialized in instance routing tables, and every route in a VPC might map to zero or more instances; you don’t know up front how many instances actually lie in the destination IP address range of a route. Now, routes apply to a particular instance if the tag of the route and the tag of that instance match.
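Putting these components together, here is a minimal sketch of creating a custom route through the Compute Engine v1 API using the google-api-python-client library. The project, network, zone, instance, and tag names are all hypothetical placeholders, and the call assumes Application Default Credentials are available.

```python
# Minimal sketch: creating a custom route with the components described above,
# via the Compute Engine v1 API (google-api-python-client). Assumes
# Application Default Credentials; all names are hypothetical.
from googleapiclient import discovery

compute = discovery.build("compute", "v1")

route_body = {
    "name": "route-through-proxy",                  # user-friendly name
    "network": "projects/my-project/global/networks/default",
    "destRange": "203.0.113.0/24",                  # destination IP range
    "priority": 700,                                # lower number = higher priority
    "tags": ["use-proxy"],                          # only instances with this tag use the route
    # Exactly one next hop: an instance, an IP address, a network,
    # a gateway, or a VPN tunnel. Here, a next hop instance.
    "nextHopInstance": "projects/my-project/zones/us-central1-a/instances/proxy-vm",
}

compute.routes().insert(project="my-project", body=route_body).execute()
```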
So if you want to be more specific about where your traffic is directed, say you want it to go to just one instance or a subset of instances, you’ll specify a tag associated with the route, and that tag has to match the instance tag for traffic to be delivered. If no tag is specified, then the route applies to all instances in the network. All the routes that you configure form something called a routes collection. Let’s visualize a network that holds multiple VMs. The network block diagram is as you can see on screen. Now, this network is capable of receiving traffic from the Internet. So any traffic which comes in from the Internet is bound for a particular destination, either one VM or multiple VMs within this network.
Now, imagine that this network has a massively scalable virtual router at the very center. This virtual router connects to every instance within the network and has knowledge of all instances and how to get to them. Traffic from the Internet first goes to this router, which looks up the route to the traffic’s destination. Because all virtual machines are connected to this massively scalable router which sits at the center of the network, the router can handle every packet and pass it on to the next hop. Now, the next hop might be any of the virtual machines that are directly connected to this router.
Configuring routes is useful because routes allow you to implement more advanced networking functions in your virtual machines, such as setting up many-to-one NATs (network address translation allows multiple hosts to be mapped to one public IP), and setting up routes is an essential prerequisite for this. Routes are also essential if you want to use transparent proxies. You know that proxies can be of two types; transparent proxies basically receive traffic and forward it on without making any changes to it. So if you want all traffic within your network to be routed to a single VM before it goes out to the external Internet, that’s a transparent proxy, and you need routes in order to enable this.
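As a sketch of the many-to-one NAT or transparent proxy idea, the route below would send all Internet-bound traffic from instances carrying a chosen tag through a single, hypothetical gateway VM rather than the default internet gateway. The VM itself would also need IP forwarding enabled and suitable firewall rules, which aren’t shown here.

```python
# Sketch: send all Internet-bound traffic from tagged instances through a
# single NAT/proxy VM instead of the default internet gateway.
nat_route = {
    "name": "nat-route",
    "network": "projects/my-project/global/networks/default",
    "destRange": "0.0.0.0/0",
    "priority": 800,                   # beats the default internet route (priority 1000)
    "tags": ["no-external-ip"],        # only these instances are routed via the NAT VM
    "nextHopInstance": "projects/my-project/zones/us-central1-a/instances/nat-gateway-vm",
}
```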
- Firewall Rules
For traffic to get from source to destination within your network, you need to have a route as well as firewall rules to enable movement of the packets. Firewall rules are meant to protect your virtual machine instances from unapproved connections. These connections can be inbound connections, called ingress connections, as well as outbound connections, or egress connections. Now, you can create firewall rules to either allow or deny specific connections, and the way you identify the connections that you want to allow or deny is based on a combination of IP address, port, and protocol. Let’s say that you have a VPC and you have multiple subnets configured within it.
These subnets, and the resources which exist on these subnets, can communicate with each other via internal IP addresses. The internal DNS on this network knows the host names as well as the internal IP addresses, and routes exist from each instance to every other instance in this network. The route alone is not sufficient, though. In order for a packet to reach its destination, you also need to configure firewall rules for packets to traverse that route. A network is a trusted resource: all resources on a network trust each other and communicate via internal IP addresses. But firewall rules are needed even to enable communication between instances in the same network.
And equally importantly, if you want traffic from the Internet to access the VM instances on your network, you need to enable firewall rules for these as well. Firewall rules are made up of a bunch of different components. The first is the action. Is it an allow rule or a deny rule? Does it block traffic or allow traffic on a particular route? Next is the direction. Is it ingress or egress? Now, the direction of traffic is with reference to a particular VM instance. So if traffic is coming into a VM instance, that’s called ingress traffic. And if traffic is moving out of a VM instance, that’s called egress traffic. In the case of ingress traffic, firewall rules can specify the source IP addresses that can send traffic to a particular VM instance.
In the case of egress traffic, firewall rules can specify what destination IPs can be reached from a particular VM. You can also configure firewall rules which only allow traffic for a particular protocol and port. For example, a rule that allows only TCP traffic on port 443 will permit HTTPS connections but not UDP traffic or TCP traffic on other ports. If you want firewall rules to apply only to a subset of machines on a particular network, you can specify instance names. By default, all firewall rules are assigned to all instances, but you can assign certain rules to certain instances alone.
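Here is a minimal sketch of a firewall rule that combines these components: an action (allow), a direction (ingress), a protocol and port, and a set of source IP ranges. The field names follow the Compute Engine Firewall resource; the rule name, network, and address range are placeholders.

```python
# Sketch of a simple firewall rule: allow ingress TCP traffic on port 443,
# but only from one source range. Names and ranges are placeholders.
allow_https_from_office = {
    "name": "allow-https-from-office",
    "network": "projects/my-project/global/networks/default",
    "direction": "INGRESS",                               # traffic coming into the VMs
    "allowed": [{"IPProtocol": "tcp", "ports": ["443"]}], # protocol and port
    "sourceRanges": ["198.51.100.0/24"],                  # who may connect
    "priority": 1000,
}
```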
You can restrict the instances to which a rule is applied by specifying instance names, an instance tag, or a service account. It’s possible that you have two firewall rules set up which conflict with each other. You can have one rule which denies certain connections from all instances, but a second rule to allow connections from a certain subset of instances. Now, how do these rules live together? By assigning priorities to each of these rules. The rule with the higher priority is the one that applies (here, too, a lower numeric priority value means higher priority). GCP firewall rules are stateful: if a connection is allowed, all subsequent traffic in that flow is also allowed, in both directions.
So you can’t configure a rule which says allow traffic on a particular port in one direction but deny the return traffic; if a connection is allowed, all the traffic in that flow is allowed in both directions. The default behavior of a firewall rule is that every rule is assigned to every instance in the network. Firewall rules are associated with a particular network, so every rule applies to all instances unless you specifically restrict it to a subset of instances. If you have a rule which says that you allow inbound connections for a particular protocol, then every instance in the network will receive connections for that protocol, let’s say HTTP or HTTPS.
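As a sketch of how priorities resolve a conflict between two rules, the pair below has a broad deny for SSH from everywhere and a narrower allow, at higher priority (a lower number), that re-opens SSH from a single trusted address. All names and ranges are placeholders.

```python
# Sketch: two overlapping rules resolved by priority (lower number wins).
# The broad rule denies SSH from everywhere; the narrower rule, at higher
# priority, re-opens SSH from one trusted address.
deny_ssh_everywhere = {
    "name": "deny-ssh-all",
    "network": "projects/my-project/global/networks/default",
    "direction": "INGRESS",
    "denied": [{"IPProtocol": "tcp", "ports": ["22"]}],
    "sourceRanges": ["0.0.0.0/0"],
    "priority": 1000,
}

allow_ssh_from_bastion = {
    "name": "allow-ssh-from-bastion",
    "network": "projects/my-project/global/networks/default",
    "direction": "INGRESS",
    "allowed": [{"IPProtocol": "tcp", "ports": ["22"]}],
    "sourceRanges": ["203.0.113.10/32"],   # the trusted bastion host only
    "priority": 900,                       # evaluated ahead of the deny rule
}
```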
If you want to constrain your firewall rule and have it apply to only a subset of instances within your network, let’s say all instances on a particular subnet or all instances which are running a particular service, you can do so using tags or service accounts. Tags are associated with specific instances, and you can have firewall rules which say things like: allow traffic from those instances which have the source tag "backend". That is an allow firewall rule which applies to ingress traffic; we only accept traffic from instances which carry the backend tag. If you don’t want to use instance tags, you can also use service accounts in order to restrict where your firewall rules apply.
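The sketch below shows the same kind of restriction expressed both ways: one rule accepts ingress traffic only from instances carrying the source tag backend, the other only from instances running as a hypothetical service account. The names, the port, and the service account are placeholders.

```python
# Sketch: the same restriction expressed two ways. The first rule accepts
# ingress traffic only from instances carrying the source tag "backend";
# the second only from instances running as a hypothetical service account.
allow_from_backend_tag = {
    "name": "allow-from-backend-tag",
    "network": "projects/my-project/global/networks/default",
    "direction": "INGRESS",
    "allowed": [{"IPProtocol": "tcp", "ports": ["8080"]}],
    "sourceTags": ["backend"],
    "priority": 1000,
}

allow_from_backend_sa = {
    "name": "allow-from-backend-sa",
    "network": "projects/my-project/global/networks/default",
    "direction": "INGRESS",
    "allowed": [{"IPProtocol": "tcp", "ports": ["8080"]}],
    "sourceServiceAccounts": ["backend-app@my-project.iam.gserviceaccount.com"],
    "priority": 1000,
}
```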
You can simply say: deny traffic to all instances which are running as the service account bla@appspot.gserviceaccount.com. Instances within this network which run as that service account will not receive traffic. Now, between tags and service accounts, which would you use in order to restrict which instances the firewall rule applies to? Both service accounts and tags work equally well, and both are used to control which instances of a network a firewall rule refers to. A tag is simply a string that you use to identify a resource. So here in this diagram, only those VMs which have the specified tag receive traffic from the Internet. In exactly the same way, you could say that only those resources which run as a particular service account receive traffic.
Here that service account is p123-compute@developer.gserviceaccount.com, and only VMs running as it will receive traffic from the Internet. Both of these are valid firewall rules, and they work equally well. When would you choose one over the other? All things being equal, you should prefer service accounts to tags in order to restrict your firewall rules. This is because a service account represents an actual identity: it’s the identity that the applications on that instance run as. Service accounts are not simply a grouping mechanism; they also carry the security permissions that the application has. Tags, though, are only meant to logically group resources, either for billing or for applying firewall rules. Any instance on your network can run as exactly one service account.
However, the same instance can have any number of tags; I think the limit is about 256 tags or something of that order. Because any instance can have so many tags, when you use tags to restrict your firewall rules it’s harder to figure out which instances a tag applies to. A service account is much more heavyweight and much more secure, because actually creating and assigning a service account to a particular instance is restricted by Identity and Access Management permissions. You require permission to start an instance with a particular service account, and this permission has to be explicitly given by a project admin. Tags, on the other hand, can be changed by any user who can edit an instance.
So it’s much safer to use service accounts as opposed to tags in order to restrict your firewall rules. Service accounts are also more heavyweight in that to actually change a service account, you need to stop and restart the instance, whereas changing tags is just a metadata update, a much lighter-weight operation. Given all the advantages that a service account has to offer, if you want to restrict your firewall rules, prefer service accounts to tags. Here are some more characteristics of firewall rules: in a VPC network only IPv4 addresses are supported, not IPv6. Firewall rules are network-specific; you can’t share firewall rules across networks, so when you set up a firewall rule, it applies to just one network.
The one exception here is when you have a Shared VPC: any firewall rules that you set up apply across the Shared VPC, since a Shared VPC is a single network, after all. In any firewall rule, whether it’s an allow or a deny rule, an ingress or an egress rule, if you use tags to restrict the instances to which the rule applies, then you cannot use service accounts, and if you’re using service accounts, then you can’t use tags; the usage of service accounts and tags is mutually exclusive within a single rule. When you set up a network, there are some firewall rules that are implied. That is, you won’t find them explicitly listed on the firewall rules page, but they nevertheless do apply. The first of these rules is a default allow egress rule.
Any traffic from within the network can go out to the Internet; this is allowed by default, and it covers all egress connections. This implied firewall rule has a priority of 65535, the lowest possible priority, so any other firewall rule that you create will have a higher priority than this. The second implied rule, which is not explicitly listed but does apply, is a default deny ingress rule. So your network is completely isolated from the outside world when it is first created: no traffic can enter your network, since all ingress connections are denied. This rule also has a priority of 65535.
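If the two implied rules were written out explicitly, they would look roughly like the sketch below. They are not listed anywhere in the console, but the network behaves as if they exist, at the lowest possible priority.

```python
# Sketch of the two implied rules, written out as if they were explicit.
# Both sit at priority 65535, the lowest possible.
implied_allow_egress = {
    "name": "implied-allow-egress",
    "direction": "EGRESS",
    "allowed": [{"IPProtocol": "all"}],
    "destinationRanges": ["0.0.0.0/0"],
    "priority": 65535,
}

implied_deny_ingress = {
    "name": "implied-deny-ingress",
    "direction": "INGRESS",
    "denied": [{"IPProtocol": "all"}],
    "sourceRanges": ["0.0.0.0/0"],
    "priority": 65535,
}
```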
Once again, if you remember, we’ve spoken earlier of the default network that is set up automatically in your Google Cloud Platform project. This default network is an auto mode network, which means it has a subnet for every region created automatically. In addition to these subnets, it also has some default firewall rules set up. The implied rules exist for all networks created within your Google Cloud Platform project; the following are the additional rules that your default network has. The first rule allows all VM instances on the network to communicate with each other, no matter what subnet they reside on. This is the default-allow-internal rule. It allows ingress network connections of any protocol and port between the VM instances on the network.
This is why, when you set up your VMs, all of them can communicate with each other: they are on the default network, and the default network has this firewall rule set up by default. The default network also comes with the default-allow-ssh rule. This allows ingress TCP connections from any source to any instance on port 22. It basically says that SSH connections are allowed from anywhere; you can SSH to any instance in the network. This is why, when you set up VMs, they go to the default network and you can always SSH to them: it’s because of this firewall rule. Another rule is the default-allow-icmp rule. This allows ingress ICMP traffic from any source to any instance on the network.
ICMP stands for Internet Control Message Protocol, and it’s basically an error-reporting protocol. Network devices like routers use this protocol to generate error messages to be sent back to the source IP address when there are network problems. And finally, the last default firewall rule that’s set up for the default network is default-allow-rdp. This allows ingress Remote Desktop Protocol traffic over TCP to port 3389, so that you can connect to your instance using Remote Desktop.
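In abbreviated form, those four pre-populated rules on the default network look roughly like the sketch below; note that they sit at priority 65534, just above the implied rules, and the internal rule covers the default network’s own internal ranges.

```python
# Abbreviated sketch of the default network's pre-populated ingress rules.
default_network_rules = [
    {"name": "default-allow-internal",
     "allowed": [{"IPProtocol": "tcp", "ports": ["0-65535"]},
                 {"IPProtocol": "udp", "ports": ["0-65535"]},
                 {"IPProtocol": "icmp"}],
     "sourceRanges": ["10.128.0.0/9"],     # the default network's internal ranges
     "priority": 65534},
    {"name": "default-allow-ssh",
     "allowed": [{"IPProtocol": "tcp", "ports": ["22"]}],
     "sourceRanges": ["0.0.0.0/0"],
     "priority": 65534},
    {"name": "default-allow-icmp",
     "allowed": [{"IPProtocol": "icmp"}],
     "sourceRanges": ["0.0.0.0/0"],
     "priority": 65534},
    {"name": "default-allow-rdp",
     "allowed": [{"IPProtocol": "tcp", "ports": ["3389"]}],
     "sourceRanges": ["0.0.0.0/0"],
     "priority": 65534},
]
```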
In this example, you can see that you have a GCP virtual network, that is, a VPC. There is a single VM within this VPC, but you can extend the example to a number of VMs as well. Notice how the egress firewall rule applies to that particular VM, and that determines whether the VM can communicate with an external host. If your network has more than one VM, you also require an ingress firewall rule to determine whether these VM instances can talk to each other; these are set up by default within your default network. When you apply egress firewall rules, you specify the destination CIDR ranges to which these rules apply. You can also specify the protocol for the traffic and the ports on which the traffic can be sent, and you can restrict which VM instances the egress rule applies to by using specific tags or service accounts. So if it’s an allow rule, the matching egress connections are permitted; if it’s a deny rule, matching egress connections from that particular VM instance are blocked. Egress connections will be allowed or denied based on the destination CIDR ranges,
the protocol with which the packets are being sent, the port to which they are being sent, and the tags or service accounts named in the rule; it depends on how the firewall rule has been configured. Similarly, within a network, you can control the kinds of sources that can send traffic to your VMs. You do this by specifying ingress firewall rules. Here, the ingress firewall rule determines whether your particular VM can receive traffic from an external host on the Internet. Ingress firewall rules can also be used to control whether your VM receives ingress traffic from another VM situated on the same network. You can configure your firewall rule to say that instances on this network can only receive traffic from certain source CIDR ranges, only for certain protocols such as TCP, and only on certain ports, such as those used for HTTP or HTTPS.
If you name specific instance tags or service accounts as sources in your firewall specification, instances can only receive traffic that originated from an instance which carries that particular source tag or runs under that particular service account. Once again, ingress rules can be allow or deny. So far I’ve only spoken in terms of allow rules: if everything matches, the CIDR range, protocol, port, and the source tag or service account, allow that particular connection. But you can also configure a deny rule, which says that if the traffic comes from a source running under a particular service account, don’t allow it to reach this VM instance, or if it’s of a particular protocol, don’t allow it to reach this VM instance.
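As a sketch of such a deny rule, the example below blocks all traffic that originates from VMs running as one hypothetical service account from reaching VMs running as another; remember that tags and service accounts can’t be mixed in a single rule, so both ends are identified here by service account.

```python
# Sketch of a deny ingress rule keyed on the sender's identity: traffic
# originating from VMs running as one (hypothetical) service account is
# blocked from reaching VMs running as another, for every protocol.
deny_untrusted_to_db = {
    "name": "deny-untrusted-to-db",
    "network": "projects/my-project/global/networks/default",
    "direction": "INGRESS",
    "denied": [{"IPProtocol": "all"}],
    "sourceServiceAccounts": ["untrusted-app@my-project.iam.gserviceaccount.com"],
    "targetServiceAccounts": ["db-service@my-project.iam.gserviceaccount.com"],
    "priority": 800,
}
```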