Google Associate Cloud Engineer – Block and File Storage in Google Cloud Platform – GCP part 1

Step 01 – Exploring Block and File Storage in GCP

Welcome back. Welcome to this new section on storage. In this section, we’ll be looking at the different storage options that are present in Google Cloud. We’ll be talking about block storage, file storage, and object storage options in this section. I’m sure you’re all excited. Let’s get started right away with block storage and file storage. What is the difference between block storage and file storage? Think about this. Your laptop or computer has a hard disk attached. What is the type of storage of your hard disk? It is block storage. Now, let’s say you have created a file share. You want to share a set of files with your colleagues. What is the type of storage you are making use of? You are making use of file storage.

Now, what are the differences between them? Let’s get started with block storage. A good example of block storage is the hard disk attached to your computer. Typically, one block storage device can be attached to one virtual server or virtual machine at a time. So your block storage device, a hard disk or something like that, can typically be attached to only one virtual server at a time. However, there are a few exceptions. You can attach read-only block devices to multiple virtual servers. So if you have a read-only block device, you can attach it to multiple virtual servers that read from it. And one of the recent developments is that certain cloud providers are also exploring multi-writer disks.

So, in summary, in the traditional sense, one block storage device can be attached to one virtual machine. An important thing to remember is that you can connect multiple different block storage devices to one virtual server. So virtual server A has one attached. Virtual server B has two attached. You can have a number of hard disks attached to your laptop or your computer. Similarly, you can have a number of block storage devices attached to your virtual servers or virtual machines. These block storage devices can be used as direct-attached storage, which is similar to your hard disk. Or you can use them over a network, something like a storage area network (SAN), where a high-speed network connects a pool of storage devices. SANs are made use of by high-performance databases like Oracle and Microsoft SQL Server.
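The attachment rules described above can be sketched as a small model. This is only an illustration: the `BlockDevice` and `VirtualServer` classes and the `attach` function are hypothetical names, not part of any cloud SDK, and the sketch ignores the multi-writer disks mentioned earlier.

```python
from dataclasses import dataclass, field


@dataclass
class BlockDevice:
    """A hypothetical block storage device (illustration only)."""
    name: str
    read_only: bool = False
    attached_to: set = field(default_factory=set)


@dataclass
class VirtualServer:
    """A hypothetical virtual server that can hold several devices."""
    name: str
    devices: list = field(default_factory=list)


def attach(device: BlockDevice, server: VirtualServer) -> None:
    """Attach a device, enforcing the traditional one-server rule.

    A read-write device may be attached to one server at a time.
    A read-only device may be attached to several servers.
    """
    already_elsewhere = bool(device.attached_to - {server.name})
    if already_elsewhere and not device.read_only:
        raise ValueError(f"{device.name} is read-write and already attached")
    device.attached_to.add(server.name)
    if device not in server.devices:
        server.devices.append(device)


server_a = VirtualServer("server-a")
server_b = VirtualServer("server-b")
attach(BlockDevice("boot-a"), server_a)
attach(BlockDevice("data-a"), server_a)   # several devices on one server
shared = BlockDevice("reference-data", read_only=True)
attach(shared, server_a)
attach(shared, server_b)                  # read-only: several servers allowed
print(len(server_a.devices), sorted(shared.attached_to))
```

Attaching a read-write device to a second server raises `ValueError` in this model, which mirrors the "one device, one virtual machine" rule of thumb.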

Such databases make use of the SAN. Now, let’s talk about file storage. For example, a media workflow might need huge shared storage for supporting processes like video editing. So you have a number of people working on these files and you’d want to put them in shared storage. Enterprise users need a quick way of sharing files in a secure and organized way. That’s when we go for file storage. These file shares are shared by several virtual servers. You can see that we have file storage devices in here, and each of the virtual servers or virtual machines is connected to a specific file storage device. You have two virtual servers connected to this one and two virtual servers connected to that one. Now we understand what block storage and file storage are in general.

Let’s now get specific to Google Cloud Platform. What are the options that are present in Google Cloud Platform for block storage? The first option is persistent disks. This is block storage on the network. Any virtual machine that we create is on a specific host, and you can connect this virtual machine to block storage which is present somewhere else on the network. That’s the concept of a persistent disk. There are two types of persistent disks that you can create: zonal or regional. The difference between them is in how the data from a persistent disk is replicated across zones. With zonal, the data is replicated only in one zone. However, with regional, data is replicated in multiple zones. If you want high durability for your data, you’d go for regional persistent disks. The other type of block storage device is the local SSD.

Local SSDs are present on the same host as your virtual machine. So these are local to the host where your virtual machine is present. Now let’s get to file storage. The file storage option in GCP is Filestore. It is a high-performance file storage service. Let’s quickly see where you can actually look at these options. When we create a virtual machine, that’s when you can attach block storage devices to it. You can attach either persistent disks or local SSDs. Where can you attach them? The first place is the boot disk. The boot disk is the place from which your operating system is loaded. And you can see that a 10 GB balanced persistent disk is being attached to your VM. So this is the first block storage device, which is attached by default to any virtual machine.

This is a boot disk. And over here, what we are doing is using a persistent disk as the boot disk. Now, other than that, if you want to manage the disks which are attached, you can go to Management, Security, Disks, Networking, and Sole Tenancy. This is where you can go in and attach disks to your virtual machine. So you can go over to Disks if you want any additional disks. You can say Add new disk and say, I want a new persistent disk of 500 GB attached. And over here you can choose the type of the persistent disk. You can choose the schedule on which backups are taken. You can also configure the source. Should it be blank? Should you copy from an image? Or do you want to copy from a backup? We’ll talk about all the options in depth a little later.
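The same "add a 500 GB persistent disk" step can also be done from the command line. The sketch below only builds and prints the `gcloud` commands rather than running them. The VM name, disk name, and zone are hypothetical placeholders, and the flags reflect my understanding of the gcloud CLI, so check them against the current documentation before use.

```python
import shlex


def add_disk_commands(vm, disk, size_gb, disk_type, zone):
    """Return two gcloud commands: create a blank disk, then attach it."""
    create = [
        "gcloud", "compute", "disks", "create", disk,
        f"--size={size_gb}GB",
        f"--type={disk_type}",
        f"--zone={zone}",
    ]
    attach = [
        "gcloud", "compute", "instances", "attach-disk", vm,
        f"--disk={disk}",
        f"--zone={zone}",
    ]
    return [create, attach]


# Hypothetical names and zone, chosen only for illustration.
for cmd in add_disk_commands("my-vm", "data-disk", 500, "pd-balanced", "us-central1-a"):
    print(shlex.join(cmd))
```

Building the commands as lists keeps each argument separate, so they could be passed to `subprocess.run` unchanged if you decide to execute them.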

For now, the important thing to understand is that this is where you can add new persistent disks to your virtual machine. Now, let’s say you want to add local SSDs. Where can you configure local SSDs? A local SSD can be added in a similar way. You can say Add new disk, and over here you can scroll down and choose the right type. We will talk about the different types of persistent disks a little later. For now, the important thing is that you can also choose Local SSD scratch disk. So this is where you can attach a local SSD scratch disk. Not all machine types support local SSDs, but if you want to add local SSDs, this is where you can do that. Now, where can you actually create a Filestore? I can just type in Filestore and go to Filestore. This opens up the Cloud Filestore API page.

Before we are able to use Filestore, we need to enable the API. So what we can do is go here and say Enable. Cloud Filestore is not really part of Compute Engine. The block storage devices are part of the Compute Engine APIs. However, the Filestore APIs are separate, and that’s why we need to enable them first. So what we are doing now is enabling the Filestore API. As you can see in here, the Filestore API is used to create and manage cloud file servers. Once the API is enabled, you can go in and create a new Filestore instance. As it says in here, an instance is a fully managed network-attached storage system you can use with your Google Compute Engine or Kubernetes Engine instances.

So you can connect to your Filestore from Compute Engine and Kubernetes Engine instances. If you want, you can go in and say Create instance. I’m not going to create a Filestore right now, but you can see what is involved in creating one. You specify the name for the Filestore. You can specify whether you want the high-performance one or the basic one. And you can also choose the storage type, whether you want to use an HDD, which is a hard disk drive, or an SSD, a solid state drive. If you want high performance, you’d obviously go for SSD. You can also configure how much capacity you want on your file share. And you can choose where you want to store your data: which region and which zone.
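For reference, enabling the API and creating an instance can also be expressed as `gcloud` commands. The sketch below only builds and prints them. The instance name, zone, share name, capacity, and network are hypothetical, and the `BASIC_HDD` tier name and flag spellings are my assumptions about the CLI, so verify them against the current Filestore documentation.

```python
import shlex


def filestore_commands(name, zone, tier, share_name, capacity_tb, network="default"):
    """Return gcloud commands to enable the Filestore API and create an instance."""
    enable_api = ["gcloud", "services", "enable", "file.googleapis.com"]
    create = [
        "gcloud", "filestore", "instances", "create", name,
        f"--zone={zone}",
        f"--tier={tier}",  # assumed tier name, e.g. BASIC_HDD or BASIC_SSD
        f"--file-share=name={share_name},capacity={capacity_tb}TB",
        f"--network=name={network}",
    ]
    return [enable_api, create]


# Hypothetical values, chosen only for illustration.
for cmd in filestore_commands("team-share", "us-central1-a", "BASIC_HDD", "vol1", 1):
    print(shlex.join(cmd))
```

The parameters map onto the console fields described above: name, tier and storage type, file share capacity, and location.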

Once you create a Filestore, you need to SSH into your VM instances and attach the Filestore to a specific VM instance. In this step, we got introduced to the different block storage and file storage services in GCP. There are two important block storage options: persistent disks and local SSDs. Persistent disks are network block storage. Local SSDs are local block storage. They are on the same host as the virtual machine. If you want high performance, you’d go for local SSDs. If you want high durability, you’d go for persistent disks. The other type of storage is file storage, and the file storage option in GCP is Filestore. I’m sure you’re having a wonderful time and I’ll see you in the next step.

Step 02 – Exploring Block Storage in GCP – Local SSDs

Welcome back. Let’s dig deeper into block storage in this step. We saw that there are two popular types of block storage that can be attached to VM instances: local SSDs and persistent disks. Local SSDs are physically attached to the host of the VM instance. On a single host you can have multiple VM instances, and on the same host we also have the local SSDs. So these are physically attached to the host. Typically, local SSDs are used to hold temporary data, and the lifecycle of a local SSD is tied to that of a VM instance. Persistent disks, however, are network storage. They might be present on a different host, and because of that they are more durable, and the lifecycle of a persistent disk is not tied to a VM instance.

So you can actually disconnect a persistent disk from a VM and attach it to another VM. Let’s take a deeper look into local SSDs. Local SSDs are physically attached to the host of the VM instance. Because they are on the same host, you get high IOPS and very low latency. However, it is important to remember that this is ephemeral storage. This is temporary data. Data persists only while the instance is running. If maintenance is going to be performed on the hardware or the software of your virtual machine and you want the data to survive that, the only option that you have is to enable live migration. When you are using a local SSD, the data is automatically encrypted, but you cannot configure the encryption keys. The encryption keys are Google-managed.

You cannot configure them. The lifecycle of a local SSD, as we discussed earlier, is tied to the VM instance. An important thing to remember is that only some machine types support local SSDs. Local SSDs support SCSI and NVMe interfaces. These are the interfaces you can use to connect your local SSD to a virtual machine. A couple of important things to remember: first, if you are using the NVMe interface, make sure that you’re using NVMe-enabled images, and if you want the best performance with SCSI, go for multi-queue SCSI images. Another important thing to remember is that if you want better performance from your local SSD, go for bigger SSDs. Larger local SSDs, which have more storage, give you better performance.

The other thing you can do is to add more vCPUs to the VM where the local SSD is attached. Even that gives you better performance. So the performance of your local SSD depends on the size of your local SSD and the number of vCPUs of the VM. As we discussed earlier, you can attach a local SSD when you are creating a new VM instance. You can go over to Disks, and this is where you can say Add new disk. Over here you can choose Local SSD scratch disk. This is where you can choose what type of interface you want to make use of, SCSI or NVMe. You can also decide how many SSDs you want to attach. You can see an estimate of the total performance in here. You can also see the encryption type.

You can see that the encryption type is Google-managed. You cannot change the encryption type. Let’s quickly look at the advantages and disadvantages of local SSDs. The advantage is very fast I/O. They provide I/O speeds 10 to 100x those of persistent disks. So they provide you with higher throughput and lower latency. This is ideal for use cases that need high IOPS while storing temporary information: caches, temporary data, scratch files. These are ideal use cases for local SSDs. The disadvantages are that it’s ephemeral storage, with lower durability, lower availability, and lower flexibility compared to PDs. You cannot detach it and attach it to another VM instance. In this quick step, we looked at local SSDs. I’ll see you in the next step.
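To recap the console walkthrough from this step in command form, the sketch below builds a `gcloud` command that creates a VM with one or more local SSDs. It is a minimal sketch under stated assumptions: the VM name, zone, and machine type are hypothetical, the machine type must be one that actually supports local SSDs, and the repeated `--local-ssd` flag reflects my understanding of the CLI.

```python
import shlex


def vm_with_local_ssds(name, zone, machine_type, ssd_count, interface="NVME"):
    """Build a gcloud command that creates a VM with local SSD scratch disks."""
    if interface not in ("NVME", "SCSI"):
        raise ValueError("interface must be NVME or SCSI")
    if ssd_count < 1:
        raise ValueError("ssd_count must be at least 1")
    cmd = [
        "gcloud", "compute", "instances", "create", name,
        f"--zone={zone}",
        f"--machine-type={machine_type}",
    ]
    # One --local-ssd flag per disk; more disks means more capacity.
    cmd += [f"--local-ssd=interface={interface}"] * ssd_count
    return cmd


# Hypothetical values, chosen only for illustration.
print(shlex.join(vm_with_local_ssds("cache-vm", "us-central1-a", "n2-standard-8", 2)))
```

There is no flag for encryption keys here, which matches the point above that local SSD encryption is Google-managed.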

Step 03 – Exploring Block Storage in GCP – Persistent Disks

Welcome back. In this step, let’s look at persistent disks. Persistent disks are network block storage that is attached to your VM instance. It is provisioned capacity. You need to configure how much storage you want. It is very flexible. You can increase the size whenever you need to. So if you want to increase it from 100 GB to a larger size, you can do that while it is attached to a VM instance. The performance of a persistent disk, like that of a local SSD, scales with size. The more storage you provision, the higher the performance that it provides. If you want higher performance, you can either resize the persistent disk or attach multiple persistent disks to the same VM. The lifecycle of a persistent disk is independent from that of a VM instance. You can detach it from one VM instance

and attach it to a different VM instance. And we already talked about the regional and zonal options. Zonal persistent disks are replicated in a single zone. However, regional persistent disks are replicated in two zones in the same region. An important thing to remember is that regional persistent disks are costly. They are 2x the cost of zonal persistent disks. A typical use case for a persistent disk is as durable storage attached to your virtual machine. Or let’s say you want to run a custom database of your own. Then you can have a virtual machine, attach multiple persistent disks to it, and store the data for the database on the persistent disks. In this quick step, we looked at persistent disks. I’ll see you in the next step.
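The three operations from this step (resizing a disk, moving it between VMs, and creating a regional disk) can be sketched as `gcloud` commands. The functions below only build the commands. All names, sizes, zones, and regions are hypothetical, and the flags are my understanding of the CLI rather than a verified reference.

```python
import shlex


def resize_command(disk, new_size_gb, zone):
    """Grow a zonal persistent disk; this works while it is attached."""
    return ["gcloud", "compute", "disks", "resize", disk,
            f"--size={new_size_gb}GB", f"--zone={zone}"]


def move_disk_commands(disk, from_vm, to_vm, zone):
    """Detach a disk from one VM and attach it to another in the same zone."""
    detach = ["gcloud", "compute", "instances", "detach-disk", from_vm,
              f"--disk={disk}", f"--zone={zone}"]
    attach = ["gcloud", "compute", "instances", "attach-disk", to_vm,
              f"--disk={disk}", f"--zone={zone}"]
    return [detach, attach]


def regional_disk_command(disk, size_gb, region, replica_zones):
    """Create a regional disk replicated across exactly two zones."""
    if len(replica_zones) != 2:
        raise ValueError("a regional persistent disk uses two replica zones")
    return ["gcloud", "compute", "disks", "create", disk,
            f"--size={size_gb}GB", f"--region={region}",
            f"--replica-zones={','.join(replica_zones)}"]


# Hypothetical values, chosen only for illustration.
print(shlex.join(resize_command("db-data", 200, "us-central1-a")))
for cmd in move_disk_commands("db-data", "vm-old", "vm-new", "us-central1-a"):
    print(shlex.join(cmd))
print(shlex.join(regional_disk_command(
    "db-data-ha", 200, "us-central1", ["us-central1-a", "us-central1-b"])))
```

Keeping the two-zone check in `regional_disk_command` makes the "replicated in two zones" rule explicit in the sketch.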

Step 04 – Comparing Persistent Disks vs Local SSDs

Welcome back. In this quick step, let’s compare persistent disks with local SSDs. The first feature we will be looking at is attachment to the VM instance. Persistent disks are on the network. They are attached as a network drive. Local SSDs are physically attached. The lifecycle of a persistent disk is separate from that of a VM instance. You can disconnect a persistent disk and it can exist even without a virtual machine. At a later point in time, you can attach it to another VM instance. For local SSDs, the lifecycle is tied to that of the VM instance. You cannot have a local SSD alive without a VM instance attached to it.

You cannot detach a local SSD from a VM instance. As far as IOPS is concerned, local SSDs provide you with 10 to 100x of what is provided by persistent disks. Persistent disks support snapshots. However, local SSDs do not. These snapshots are like backups. Whenever you want permanent storage that is attached to your VM, you would go for a persistent disk. However, if you are fine with ephemeral storage, you can go for a local SSD. In this step, we looked at the differences between a persistent disk and a local SSD. I’ll see you in the next step.
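The comparison from this step can be captured as a small lookup structure. This is just a way of organizing the points made above. The class, field, and function names are hypothetical and do not correspond to any Google Cloud API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockStorageOption:
    """One row of the persistent disk vs local SSD comparison."""
    name: str
    attachment: str
    lifecycle_tied_to_vm: bool
    detachable: bool
    supports_snapshots: bool


PERSISTENT_DISK = BlockStorageOption(
    name="Persistent disk",
    attachment="network drive",
    lifecycle_tied_to_vm=False,
    detachable=True,
    supports_snapshots=True,
)

LOCAL_SSD = BlockStorageOption(
    name="Local SSD",
    attachment="physically attached to the host",
    lifecycle_tied_to_vm=True,
    detachable=False,
    supports_snapshots=False,
)


def choose_block_storage(data_must_outlive_vm: bool) -> BlockStorageOption:
    """Pick permanent storage when data must survive; otherwise ephemeral."""
    return PERSISTENT_DISK if data_must_outlive_vm else LOCAL_SSD


print(choose_block_storage(data_must_outlive_vm=True).name)
print(choose_block_storage(data_must_outlive_vm=False).name)
```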

Step 05 – Exploring Persistent Disk Types

Welcome back. In this step, let’s look at the different types of persistent disks: Standard, Balanced, and SSD. Earlier, when we were creating our VM instance and attaching a disk to it, we saw that there are three different types present in here: Balanced, SSD, and Standard. What are those? What is the difference between them? That’s what we’ll be discussing in this step. So, what is the underlying storage for Standard, Balanced, and SSD? The underlying storage for Standard is a hard disk drive (HDD). For Balanced it is a solid state drive (SSD), and for SSD it is also a solid state drive. So Balanced and SSD make use of solid state drives, whereas Standard makes use of hard disk drives. Technically, these are referred to as pd-standard, pd-balanced, and pd-ssd. PD stands for persistent disk. When it comes to performance, we need to look at two different kinds of performance.

One is sequential IOPS. Typical big data workloads or typical batch programs are interested in sequential IOPS. They don’t do random reads. They read the disk in a sequential way, and that’s the reason why sequential IOPS is important. As far as sequential IOPS is concerned, Standard provides you with a good amount of sequential IOPS. Balanced also provides you with a good amount of sequential IOPS. SSD, on the other hand, provides really good performance when it comes to sequential IOPS. The other type is random IOPS. For example, if you have a web application, or typically whenever you have transactional apps, you are interested in random I/O operations. Standard, the hard disk drive option, is very, very bad at random IOPS.

Balanced is good and SSD is really good. An important aspect is the cost. Standard is really, really cheap, and Balanced is cheaper than SSD. SSDs are really expensive. Now, let’s look at the use cases. If you want cost-efficient big data, you can go for Standard persistent disks. This is because they provide you with good sequential IOPS at a low cost. If you want a balance between cost and performance for transactional applications, you can go for Balanced. These are not really expensive, but they give you a good amount of performance. If you want the absolute best performance, either for big data workloads or for transactional apps, you can go for SSDs. In this step, we looked at the different types of persistent disks that are present in GCP: Standard, Balanced, and SSD.
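The use-case guidance from this step can be written down as a simple decision helper. It encodes only the rules of thumb given above. The function name and the workload labels `"big-data"` and `"transactional"` are hypothetical, while the returned strings are the technical disk type names.

```python
# Qualitative summary of the three types, as described in this step.
DISK_TYPES = {
    "pd-standard": {"storage": "HDD", "sequential": "good", "random": "bad", "cost": "lowest"},
    "pd-balanced": {"storage": "SSD", "sequential": "good", "random": "good", "cost": "medium"},
    "pd-ssd": {"storage": "SSD", "sequential": "very good", "random": "very good", "cost": "highest"},
}


def recommend_disk_type(workload: str, need_best_performance: bool = False) -> str:
    """Map a workload to a persistent disk type using the step's guidance."""
    if workload not in ("big-data", "transactional"):
        raise ValueError("workload must be 'big-data' or 'transactional'")
    if need_best_performance:
        return "pd-ssd"        # best performance for either workload
    if workload == "big-data":
        return "pd-standard"   # good sequential IOPS at low cost
    return "pd-balanced"       # balance of cost and random IOPS


for workload in ("big-data", "transactional"):
    choice = recommend_disk_type(workload)
    print(workload, "->", choice, DISK_TYPES[choice])
```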
