Amazon AWS SysOps – EC2 Storage and Data Management – EBS and EFS
- Section Intro
Now that we know all about automation and deployment, it is time for us to get back to EC2 and learn how to store data on EC2 instances. You know what I mean: EFS and EBS. Now, you may already know EFS and EBS from a basics exam you’ve done before. But in this section we’ll see them in depth, and we’ll see them from a SysOps perspective. What does that mean? Well, it means looking at subjects that are very common during the exam: for example, performance, troubleshooting, operations, and monitoring, including CloudWatch integrations.
- EBS Intro
Okay, so now we’re going to talk about EBS volumes. You may already know what they are, but just bear with me, you may learn a thing or two. And in this section we’ll actually do a deep dive on a few concepts. So the idea is that if you have an EC2 machine and you terminate it for whatever reason, it will lose its root volume by default. And so that means that all the data on it will be lost. And unexpected terminations might happen from time to time; AWS may tell you, or it may not. And so you need a way to store your data safely somewhere. Okay? You don’t want your main data to be on your root volume, you want it to be on an attached volume. And that attached volume is going to be an EBS volume. It’s a network drive that you can attach to your instances while they run, and you can persist your data on it.
So you can place whatever data you want on your EBS volume. For example, if you have a database, it’d be really smart to place the database data on your EBS volume. So you already know this, but it’s a network drive. It’s not a physical drive. So think of a USB stick, but over the network. It will use the network to communicate with your instance, so there might be a bit of latency. And it can be detached from an EC2 instance and attached to another one very quickly, as long as they’re in the same AZ. That’s because it’s a network drive.
So because it’s just mapped over the network, it’s locked to an AZ. By default, a volume created in us-east-1a cannot be attached to an instance in us-east-1b. Now, you can play around and move a volume across AZs, and we’ll see this in the hands-on in this section. But first we need to create a snapshot of our EBS volume, and we’ll see this. Finally, when we create an EBS volume, we need to provide a provisioned capacity. So we need to say how many gigabytes we want and how many IOPS we want. And you are going to get billed for all the capacity that you provision, not what you use. So if you provision a large disk but you only use one gigabyte, you’re still going to get billed for the full provisioned size. That’s super important to understand. Now, this is a newer capability, and we’ll see this in the hands-on as well: over time, we can increase the capacity of the drive as we go along.
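To make the billing point concrete, here is a minimal Python sketch. The function name and the price per GB-month are hypothetical placeholders, not real AWS figures; the only thing it illustrates is that the used space never enters the calculation.

```python
def monthly_storage_cost(provisioned_gb, used_gb, price_per_gb_month):
    """Return the monthly storage charge for one EBS volume.

    used_gb is accepted only to make the point: it does not affect the bill.
    price_per_gb_month is a hypothetical figure, not a real AWS price.
    """
    if used_gb > provisioned_gb:
        raise ValueError("cannot use more space than was provisioned")
    # Billing is driven by what was provisioned, not by what is used.
    return provisioned_gb * price_per_gb_month


if __name__ == "__main__":
    # Same provisioned size, very different usage: the charge is identical.
    print(monthly_storage_cost(500, 1, 0.10))
    print(monthly_storage_cost(500, 500, 0.10))
```

This is also why starting small and growing the volume later tends to be the cheaper habit.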
So it’s very smart to start small and then increase the size as we start using more data on our EBS volumes. So quickly, visually: us-east-1a will have three EC2 instances. The first instance has a 10 GB EBS volume. The second one has two EBS volumes, of 100 GB and 50 GB. The third one has nothing. And then in us-east-1b, another instance will have an EBS volume of 50 GB. Just an example. But what we can see here is that our EBS volumes are scoped to a specific AZ. I cannot attach my volumes from us-east-1a to an instance in us-east-1b, it just does not work. Now, we’ll do a volume deep dive in the next lectures. But right now, you just need to know that EBS volumes come in four types. gp2, which is a general purpose SSD, which is quite balanced: good price, good performance. io1, which is the highest-performance SSD, for when we need super low latency or high throughput workloads, so usually running a critical database on io1. Then st1, which is going to be a low-cost HDD volume.
So it’s going to be more about getting more throughput and running big data workloads. And sc1, which is a cold HDD volume, which is also for big data but usually for less frequently accessed workloads. So each volume will have its own characteristics in size, throughput, and IOPS, which is I/O operations per second. And when you’re in doubt, always look at the AWS documentation, it changes all the time. They always improve the EBS volumes, and so my slides may be outdated very quickly. And by the way, when you create an instance, you can only choose gp2 and io1 to be used as boot volumes. All right, now we’re ready. So let’s go to the next lecture, in which we’ll be creating an EC2 instance and attaching a few volumes. So see you in the next lecture.
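As a quick recap of the four types from this lecture, here is a small Python sketch. The dictionary layout and the helper name are made up for illustration; the contents only restate what was said above (SSD versus HDD, and which types can be boot volumes at the time of this lecture).

```python
# Summary of the four EBS volume types as described in this lecture.
# The structure is illustrative only; always check the AWS documentation,
# since the offering changes over time.
VOLUME_TYPES = {
    "gp2": {"media": "SSD", "focus": "general purpose", "bootable": True},
    "io1": {"media": "SSD", "focus": "provisioned IOPS", "bootable": True},
    "st1": {"media": "HDD", "focus": "throughput optimized", "bootable": False},
    "sc1": {"media": "HDD", "focus": "cold storage", "bootable": False},
}


def boot_volume_types():
    """Return the names of the volume types usable as boot volumes."""
    return sorted(name for name, info in VOLUME_TYPES.items() if info["bootable"])


if __name__ == "__main__":
    print(boot_volume_types())
```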
- EBS Intro Hands On
Okay, so in my management console, I’m going to go to EC2, and in EC2 I’m going to go to Instances and create an instance. So I will launch an instance and I will choose Amazon Linux 2. That’s perfect. I’ll select this one, t2.micro, which is free tier eligible. For the instance details, I will leave everything as is. This is perfect. Click on Next: Add Storage. And here we can see now, and we’ve been skipping this all along, that our root volume is going to be /dev/xvda, it’s 8 GB in size, and it’s a gp2 volume. But I could choose io1 if I wanted to. And here we get some information about IOPS and throughput. We’ll get more details on this later.
And there’s a tick box saying Delete on Termination. So when our instance gets terminated, the volume will get deleted as well. Now let’s play and add a new volume. We’re going to add an EBS volume. Device is where you want to attach it, so for the point of attachment we’ll just use /dev/sdb. And we could use a snapshot to restore from, but for now we don’t know what snapshots are, so we’ll just forget about it. We’ll say 2 GB in size, and it’s going to be a gp2 volume. And I can say encrypted or not encrypted, but right now I will say not encrypted. I could tick the Delete on Termination box, but I don’t want to, so I’ll leave Delete on Termination unticked. As you can see, when you’re a free tier eligible customer, you can get up to 30 GB of EBS volume for free.
So it’s kind of neat. All right, Next: Add Tags. I’m fine with this. I will maybe add a tag and say Name, EBS Demo, just to have one tag. And then click on Next: Configure Security Group. I’ll just select an existing one, the SSH one we have, which is great, because it allows me to SSH into the instance. Review and launch. And then I will just launch it. In terms of key pair, I’ll choose my course key pair and click on Launch Instances. So now what has happened is that we have created an EC2 instance and it has two EBS volumes attached to it. So if we look at the bottom right, right here, the root device is /dev/xvda and the block devices are /dev/xvda and /dev/sdb.
So here is my second EBS volume that was attached. If we go to Volumes on the bottom left, we can see that we have two volumes right here. They inherited the Name tag, EBS Demo. The first one is the root volume, it’s 8 GB in size. And the second one is the volume that I created, 2 GB in size. And they’re both in the same availability zone, eu-west-1b, which is the exact same AZ where my instance is. And so here we are starting to get a good idea of how things work. Now we can SSH into this instance, just to have a look at a few things. So I will run the SSH command with the IP of my machine. Here I am, I am in. And so let’s see what we have. There’s a command you have to use called lsblk.
And this shows you basically all the attached drives. So we have xvda, which is our 8 GB EBS volume, mounted on /. So that’s our root volume. And then we have xvdb, which is 2 GB, and that’s our added EBS volume. And right now it’s not formatted and it’s not mounted anywhere, so we can’t access it. So I’ll just show you how to do this. For this, no secret, we have to go to the Amazon documentation, because this is the kind of thing that you don’t remember by heart. And it shows you how to do it: making an EBS volume available for use on Linux. So let’s go and scroll down. lsblk is what we run to see all the disks available. So we do see that xvdb has been attached. Then we scroll down and we do sudo file -s on the device. So sudo file -s /dev/xvdb, and this gives us data.
So when we get data, that means that there is no file system on the device and we must create one, so we go to step four. In case there was a file system, we would get this kind of output right here. So let’s go to step four to format this volume, and we’ll create an ext4 file system on it. For this we’ll do sudo mkfs -t ext4 and then the device name. So let’s do this right now: sudo mkfs -t, and the type is ext4. So ext4 is just a way of formatting the volume. And then /dev/xvdb. Okay, excellent, it’s done. So you get a lot of output, and now we need to mount our volume on a directory, maybe a data folder. So we’ll do sudo mkdir and then /data.
So we have created a data folder, and then we run sudo mount with the device name, which is /dev/xvdb, onto the /data folder. Done. So now let’s have a look at how this works. If we do lsblk now, we can see that our xvdb drive has been mounted on the /data folder. And so if I go to the data folder, so /data with a forward slash, and then I just sudo touch hello.txt here, I have a hello.txt file being created on my drive. So pretty cool. And we can just say sudo nano hello.txt to edit that file. I’ll just say hello world, and I’ll exit, save, yes. Okay. So as we can see now, in my hello.txt file there is hello world. And that file lives directly on my second EBS volume, which is quite cool. So optionally, you can mount this EBS volume on every system reboot. And for this we need to add an entry to /etc/fstab. So we’ll first back up /etc/fstab. I’ll just copy this command right here. Okay, now we’re going to open /etc/fstab and add some information. So for this I’ll do sudo nano /etc/fstab. And here basically you need to add a line.
So we need to add the device name, the mount point, the file system type, and some mount options. So let’s go ahead and do it: /dev/xvdb. Then the mount point is going to be /data, and the file system type is going to be ext4. And then the mount options are something we can just get from here. So we’ll just copy this: defaults,nofail 0 2. It’s a bit simpler. And we’re done. Exit and save. Okay, now we’re done. And now basically, on reboot of our instance, our disk is going to be automatically mounted on /data. So we can verify this by just cat-ing this file and looking at the fact that yes, all looks good. All right, so next, what we can do as we scroll down is to verify that our volume has been formatted. So let’s do this. We can do sudo file -s /dev/xvdb.
And this now says that our xvdb EBS volume is now ext4 filesystem data. Quite cool, quite interesting. And yeah, we’re pretty okay with this. Now we can just unmount /data. So we’ll do sudo umount, so u-m-o-u-n-t, /data. And because I’m in the wrong directory, let’s do it again. sudo umount /data. Now my drive has been unmounted. So if I do lsblk now, we can see that my drive is unmounted. And now if I do sudo mount -a, we’re going to test our fstab file. It worked. And then lsblk: yes, the mount point is now /data. So that’s it. It’s just a little bit of stuff to do, but I think it’s quite a cool way of seeing how to attach an EBS volume and format it. It’s something that you should know as a SysOps. It’s probably not something they will ask you in the exam, but it’s always good to know these things. They help your understanding. So that’s it for the intro to EBS. I will see you in the next lecture.
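To recap the hands-on in one place, here is a small Python sketch that only builds the command strings and the fstab line from this lecture; it does not run anything. The function names are hypothetical, and the defaults (/dev/xvdb, /data, ext4) are simply the values used in the demo. Keep in mind that mkfs wipes whatever is on the device, so it only belongs on a volume where file -s reported plain data.

```python
def ebs_setup_commands(device="/dev/xvdb", mount_point="/data", fs_type="ext4"):
    """Return the shell commands from the walkthrough, in order, as strings.

    Nothing is executed here; the list is meant to be read, not run blindly.
    """
    return [
        "lsblk",                               # list attached block devices
        f"sudo file -s {device}",              # 'data' means no file system yet
        f"sudo mkfs -t {fs_type} {device}",    # format (destroys existing data)
        f"sudo mkdir {mount_point}",           # create the mount point
        f"sudo mount {device} {mount_point}",  # mount the volume
    ]


def fstab_entry(device="/dev/xvdb", mount_point="/data", fs_type="ext4"):
    """Build the /etc/fstab line: device, mount point, type, options, dump, pass."""
    return f"{device} {mount_point} {fs_type} defaults,nofail 0 2"


if __name__ == "__main__":
    for command in ebs_setup_commands():
        print(command)
    print(fstab_entry())
```

After adding the line to /etc/fstab, running sudo umount on the mount point and then sudo mount -a, as in the demo, is a reasonable way to test the entry before any reboot.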
- EBS Volume Types Deep Dive
All right, so now let’s do a deep dive into the different volume types for EBS. So the first one is gp2, and it’s the recommended one for most workloads. It’s the least expensive SSD, and you can use it as a system boot volume with really good performance. You can use it for virtual desktops, low latency interactive apps, maybe development and test environments, so you can think of a lot of use cases. The size can range from 1 GB to 16 TB, so a lot, and a small gp2 volume can burst IOPS up to 3,000. Now, we’ll have a long lecture on burst, so you don’t need to remember this right now, but it’s important to know that a gp2 volume is like a T2 instance: there’s a burst aspect to it.
The max IOPS you can ever get on gp2 right now, as of today, is 16,000. And because you get 3 IOPS per gigabyte, and that’s a rule for gp2, that means that when you have a gp2 volume of 5,334 GB, you are already at the max IOPS. So any size increase after this will not increase the number of IOPS. So let’s go have a look in the console. So here I am in the console, and what I’m going to do is go to Volumes. And in Volumes I’m going to create a new volume. And here we have a dedicated page for creating volumes. So I’m going to create a gp2. And so let’s have a look. Yes, the min size is 1 GB, the max size is 16 TB, and I can choose a size in gigabytes. So here I say 100, but I could say 50. And as you can see, as I decrease my size, my IOPS decreases. So if I have 10, I have 100 IOPS. If I have 50, I have 150 IOPS. If I have 400, I have 1,200 IOPS. And so the idea is that you get 3 IOPS per gigabyte, with a minimum of 100 IOPS. So even if you set 1 GB, you still get 100 IOPS, and you can burst to 3,000.
Okay? So here I have 1 GB, 100 IOPS, and a burst of 3,000 IOPS. If I go to, say, 1,000 GB, now I get 3 IOPS per gigabyte and I have 3,000. So there’s no more burst. The burst only happens when you’re below 1,000 GB. So 999 would still give us a little bit of burst, up to 3,000. And I can increase the size all the way to 16,384. This is the maximum size of my volume. And as you can see right here, I reach the max IOPS, which is 16,000, because my volume is bigger than 5,333 GB, so I already have the max IOPS of 16,000. Pretty cool. Now for the AZ, I have to choose one, so again a volume is locked to a specific AZ. Throughput in megabytes per second shows as not applicable for the gp2 type. And then we could create the volume from a snapshot and encrypt it. But for now I just wanted to show you how the size impacts the IOPS. Then we have the io1 volume.
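The gp2 rule from the console demo can be written down as a short Python sketch. The constant and function names are made up for illustration; the numbers (3 IOPS per GB, a floor of 100, a ceiling of 16,000, burst to 3,000 below 1,000 GB) are the ones quoted in this lecture and may have changed since, so treat them as assumptions to check against the AWS documentation.

```python
GP2_IOPS_PER_GB = 3       # baseline rule quoted in the lecture
GP2_MIN_IOPS = 100        # floor, even for a 1 GB volume
GP2_MAX_IOPS = 16_000     # ceiling as of the lecture
GP2_BURST_IOPS = 3_000    # burst target for small volumes
GP2_MIN_SIZE_GB = 1
GP2_MAX_SIZE_GB = 16_384


def gp2_baseline_iops(size_gb):
    """Baseline IOPS for a gp2 volume of the given size."""
    if not GP2_MIN_SIZE_GB <= size_gb <= GP2_MAX_SIZE_GB:
        raise ValueError("gp2 size must be between 1 and 16,384 GB")
    return min(max(size_gb * GP2_IOPS_PER_GB, GP2_MIN_IOPS), GP2_MAX_IOPS)


def gp2_can_burst(size_gb):
    """Burst only matters while the baseline is below the burst target."""
    return gp2_baseline_iops(size_gb) < GP2_BURST_IOPS


if __name__ == "__main__":
    for size in (1, 10, 50, 400, 999, 1_000, 5_334, 16_384):
        print(size, gp2_baseline_iops(size), gp2_can_burst(size))
```

Working through the sizes from the demo: 10 GB stays at the 100 floor, 50 GB gives 150, 400 GB gives 1,200, 1,000 GB gives 3,000 with no burst left, and anything from 5,334 GB upward is capped at 16,000.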
And so this is going to be for critical business applications that require sustained IOPS performance. So when you need IOPS and you need them to be very consistent, and usually if you need more than 16,000 IOPS per volume, which is the gp2 limit, then you would have to go to io1. So you would use this for large database workloads such as MongoDB, Cassandra, Microsoft SQL Server, MySQL, PostgreSQL, Oracle. Basically, when you have a critical database, I would always recommend using io1. The size can range from 4 GB to 16 TB, and the IOPS is provisioned, so it’s called provisioned IOPS, or PIOPS. The minimum you can have is 100. The maximum is 64,000, but only for a very specific type of instance called Nitro, which is out of scope for this course. Otherwise the maximum is 32,000, so almost every other instance has a maximum of 32,000 IOPS. Okay. And then the maximum ratio of provisioned IOPS to requested volume size is 50 to 1. So for each gigabyte you can request 50 IOPS. Now let’s go have a look at how this is in the console. So now I’m going to choose io1 as my volume type, and here I get inputs for both size and IOPS. So let’s see, I can say a size of 1,000 if I wanted to, and this wouldn’t change anything. And then IOPS.
I can go from 100 IOPS up to 64,000, but I’ll just put 32,000, because that’s going to be the max for my instance. I’m not going to create that volume, by the way. But the cool thing to see is that you cannot have more than 50 times as many IOPS as the size in gigabytes. So if I say I want 10 GB, the max IOPS, it automatically sets 500. If I say 600 because I really want it, it says no, the maximum ratio permitted between IOPS and volume size is 50 to 1. So you can’t just have a very small volume with a large amount of IOPS. You still need to increase the size as you increase the IOPS. That’s the idea. The rest of the options are exactly the same, and so I invite you to just play around with these settings to see how they work.
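The 50 to 1 rule and the per-instance caps can be sketched the same way in Python. Again the names are hypothetical, and the limits (4 GB to 16,384 GB, 100 IOPS minimum, 32,000 maximum or 64,000 on Nitro instances) are just the figures quoted in this lecture, not a definitive statement of current AWS limits.

```python
IO1_MAX_RATIO = 50            # max provisioned IOPS per GB of volume size
IO1_MIN_IOPS = 100
IO1_MAX_IOPS = 32_000         # most instances, as of the lecture
IO1_MAX_IOPS_NITRO = 64_000   # Nitro instances only
IO1_MIN_SIZE_GB = 4
IO1_MAX_SIZE_GB = 16_384


def io1_max_iops(size_gb, nitro=False):
    """Largest IOPS value that can be provisioned for this volume size."""
    cap = IO1_MAX_IOPS_NITRO if nitro else IO1_MAX_IOPS
    return min(size_gb * IO1_MAX_RATIO, cap)


def io1_request_is_valid(size_gb, iops, nitro=False):
    """Check a (size, IOPS) pair against the limits quoted in the lecture."""
    if not IO1_MIN_SIZE_GB <= size_gb <= IO1_MAX_SIZE_GB:
        return False
    return IO1_MIN_IOPS <= iops <= io1_max_iops(size_gb, nitro)


if __name__ == "__main__":
    print(io1_max_iops(10))               # the 10 GB case from the demo
    print(io1_request_is_valid(10, 600))  # the request the console refused
```

So to provision the full 32,000 IOPS under this rule, the volume needs to be at least 640 GB, since 640 times 50 is 32,000.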
Now, a bit less popular in the exam, but still good to know, is st1, which is for streaming workloads that need fast throughput at a low price. So we’re talking big data, data warehouses, log processing, Apache Kafka. This cannot be a boot volume, because it’s for a very specific kind of workload. The size ranges from a minimum of 500 GB to 16 TB. You get a max IOPS of 500. And the idea is that you can still get a lot of throughput, up to 500 MB per second, and that has a burst as well. So if we go here and go to st1, we see that the minimum size is 500 GB and the max is 16 TB. And so I could increase to whatever I want. And if I say 4,000, basically the throughput in megabytes per second increases. So we get 40 MB per second per terabyte. And as I increase my size, so if I go to 10,000, I increase my throughput. So here I get 391 MB per second, with a burst up to 500.
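The st1 baseline throughput follows the same pattern, and a short Python sketch shows where the 391 comes from. The names are made up, and one assumption is stated loudly: the sizes in the console are treated as GiB, so one terabyte is 1,024 of them. That assumption is what makes 10,000 come out at roughly 390.6, which matches the 391 shown in the demo once rounded.

```python
ST1_MBPS_PER_TB = 40        # baseline rule quoted in the lecture
ST1_MAX_MBPS = 500          # maximum throughput for st1
ST1_MIN_SIZE_GB = 500
ST1_MAX_SIZE_GB = 16_384
GB_PER_TB = 1_024           # assumption: console sizes are binary units


def st1_baseline_mbps(size_gb):
    """Baseline throughput in MB per second for an st1 volume."""
    if not ST1_MIN_SIZE_GB <= size_gb <= ST1_MAX_SIZE_GB:
        raise ValueError("st1 size must be between 500 and 16,384 GB")
    return min(ST1_MBPS_PER_TB * size_gb / GB_PER_TB, ST1_MAX_MBPS)


if __name__ == "__main__":
    for size in (500, 4_000, 10_000, 16_384):
        print(size, round(st1_baseline_mbps(size)))
```

Under these numbers the baseline hits the 500 cap at 12,800 GB, so any volume larger than that gets no extra throughput.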
And if I say a really big volume, say this much, I’ll just get 500 all the way. So this is the maximum throughput, 500 MB per second. So, still cool to see. Not something you should know by heart for your exam. But st1 is the throughput optimized HDD, and it will give you basically a lot of throughput in terms of megabytes per second. Finally, you’re going to have sc1, and this is going to be for storage of data that is infrequently accessed but still needs high throughput. So this is for scenarios where you need the storage cost to be quite low, and this cannot be a boot volume either. Still the same size requirements as st1. The max IOPS is even lower, it’s 250. And then what we get is a max throughput of 250 MB per second.
And this one also has a burst capability. So it’s like a less good st1, but obviously cheaper. So again, we could play here and go to sc1. And here we can see that if we set the size to 2,000, then we get a throughput of 24 MB per second baseline and 157 burst, and you can just play around with the size. I’m not going to spend too much time doing this, but have a play. So that’s it for the EBS volume types. Now, what you really need to remember is that as soon as you reach an I/O limit on gp2 volumes, then io1 is going to be the answer at the exam. Because you can go over 16,000 IOPS: you can go all the way to 32,000 IOPS, or even 64,000 IOPS now, using Nitro instances. So remember the different volume types, super important. I hope you liked it, and I will see you in the next lecture.