Amazon AWS SysOps – EC2 Storage and Data Management – EBS and EFS Part 2
- EBS Volume Burst
Okay, so there is this concept of burst that I’ve mentioned before on gp2 volumes, and now we’re going to formalize it. If your gp2 volume is smaller than 1,000 GB, its baseline is less than 3,000 IOPS, because gp2 gives you 3 IOPS per gigabyte, and so it can burst. Bursting means the volume can reach 3,000 IOPS even though it is smaller than 1,000 GB. It’s a similar concept to T2 instances and their CPU, which can get better CPU performance for a limited amount of time because they can burst. So it’s the same thing for gp2 volumes: the volume accumulates burst credits over time, so it will have good performance when we need it.
So even with a very small 100 GB volume, which gets 300 IOPS, it is possible for a brief amount of time to get the same performance as a 3,000 IOPS volume, so it is still very good when we need it to be. And obviously, the bigger the volume, the faster we fill our burst credit balance. Now you may be asking yourself: what happens if I empty my I/O credit balance? T2 instances became really slow when their credits ran out. But an EBS volume simply gives you the exact IOPS you paid for. If you paid for 100 IOPS, that is what you get; if you paid for more, you get more, but you never get less than what you paid for. And if your credit balance is at zero all the time, then you absolutely need to increase the gp2 volume size or switch to io1, because otherwise your volume will be underperforming.
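To make the burst arithmetic concrete, here is a minimal sketch of the gp2 credit model. The 3 IOPS per gigabyte and the 3,000 IOPS burst ceiling come from the lecture; the 100 IOPS floor and the 5.4 million credit bucket are taken from the AWS documentation rather than from this lecture, so treat them as assumptions. The function names are made up for illustration, and the upper per-volume IOPS cap is ignored.

```python
# Rough model of gp2 burst behaviour.
# Assumed constants (AWS documentation, not stated in the lecture):
# the 100 IOPS floor and the 5.4 million credit bucket.
BASELINE_IOPS_PER_GIB = 3      # gp2 earns 3 IOPS per GiB
MIN_BASELINE_IOPS = 100        # floor for very small volumes (assumption)
BURST_IOPS = 3000              # ceiling while credits last
CREDIT_BUCKET = 5_400_000      # I/O credits in a full bucket (assumption)


def baseline_iops(size_gib: int) -> int:
    """IOPS the volume sustains once the burst balance is empty."""
    return max(MIN_BASELINE_IOPS, BASELINE_IOPS_PER_GIB * size_gib)


def can_burst(size_gib: int) -> bool:
    """Only volumes whose baseline is under 3,000 IOPS rely on credits."""
    return baseline_iops(size_gib) < BURST_IOPS


def burst_seconds(size_gib: int) -> float:
    """Seconds a full bucket lasts at a constant 3,000 IOPS.

    Credits drain at the burst rate and refill at the baseline rate,
    so the net drain is the difference between the two.
    """
    if not can_burst(size_gib):
        return float("inf")
    return CREDIT_BUCKET / (BURST_IOPS - baseline_iops(size_gib))


for size in (8, 100, 500, 1000):
    print(f"{size:>5} GiB: baseline {baseline_iops(size)} IOPS, "
          f"full bucket lasts {burst_seconds(size):.0f} s at 3,000 IOPS")
```

Under these assumptions, the 100 GB volume from the example can hold 3,000 IOPS for 2,000 seconds, a bit over half an hour, before it drops back to its 300 IOPS baseline.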
To monitor the I/O credit balance, we can use CloudWatch. That’s quite interesting to know, and the SysOps exam will ask you about it. If your gp2 volume is larger than 1,000 GB, you can just forget about the concept of burst, because there is no such thing for volumes greater than 1,000 GB: their baseline is already 3,000 IOPS or more. And finally, burst also applies to st1 and sc1, where it increases throughput. Just something fun to know. All right, let’s go see what it looks like in the console. So here are our two EBS volumes: one is 8 GB, one is 2 GB.
If I click on the 8 GB one and go to Monitoring, we get some interesting stuff from CloudWatch. If we scroll down, we get the burst balance percentage, which basically represents your credits. Right now the burst balance is constant at 100%. That means I have plenty of room to get high IOPS if I need to, up to 3,000 IOPS, even though the IOPS provisioned for this volume is 100. We get basically the exact same graph for the other volume. And why is it staying at 100%? Because we are not doing much with our EC2 instance: we’re not running a database, we’re not running any I/O workload. But if I were to write a big file, I should see this burst balance go down. So let’s have a look. Here’s just a random blog that shows how to measure I/O. What we have to do is install fio, so I’ll run sudo yum install fio and answer yes to install.
Okay, now fio is getting installed, and with it we can test random read/write performance. We’ll just execute this command, which writes to a test file. I could stay in this directory, but actually I’ll go to my home directory. Okay, here we go. I’m going to run this command right here, so I’ll paste it and press Enter. This command is going to create a big file for me, a 4 GB file, and since this disk is 8 GB, that’s going to be fine. So it’s basically going to do a lot of I/O operations for me. And the cool thing is that in CloudWatch, I should be able to see it after five minutes. Now we can see the job progressing with an ETA of five minutes, so I’ll just wait five minutes to see what happens. Right now, this shows that there are a lot of reads and writes happening on our disk, which is good. Okay, excellent, my I/O test is done. Now if I go back to my EC2 management console, I hope to see a little bit of movement in the volume burst balance.
So let’s go into Monitoring and refresh. I tried my best, and as we can see, there’s a small downtick here: about 5% of the burst balance was used. So you have room to run a lot of I/O on your volume before you see the burst balance deplete much. I hope that’s cool, and I hope you understand why we did this. Maybe I’ll wait a little bit to see if the graph goes down more.
But overall, my point is that as soon as you start using your volume and doing I/O, you’re going to see the CloudWatch metrics go through the roof: for example, the read bandwidth, the write bandwidth, the read throughput, and the write throughput. And this will automatically, as you saw, make your burst balance deplete even more. If you hit zero, your volume becomes very basic and you just get what you paid for. So I hope you liked it. I hope that makes sense for you around what burst balance is and how it works. And I will see you in the next lecture.
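The burst balance graph from the console can also be read programmatically. The following is a small sketch, not the method used in the lecture: it assumes the `BurstBalance` metric in the `AWS/EBS` CloudWatch namespace and a boto3-style CloudWatch client passed in by the caller. The helper names and the volume ID are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def burst_balance_request(volume_id: str, hours: int = 1) -> dict:
    """Build keyword arguments for CloudWatch get_metric_statistics."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/EBS",
        "MetricName": "BurstBalance",
        "Dimensions": [{"Name": "VolumeId", "Value": volume_id}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 300,                 # one datapoint per 5 minutes
        "Statistics": ["Minimum"],
    }


def lowest_burst_balance(cloudwatch, volume_id: str, hours: int = 1):
    """Return the lowest burst balance percentage seen, or None if no data."""
    response = cloudwatch.get_metric_statistics(
        **burst_balance_request(volume_id, hours)
    )
    values = [point["Minimum"] for point in response["Datapoints"]]
    return min(values) if values else None


# Usage (needs boto3 and AWS credentials; the volume ID is a placeholder):
#   import boto3
#   cw = boto3.client("cloudwatch")
#   print(lowest_burst_balance(cw, "vol-0123456789abcdef0"))
```

Passing the client in as an argument keeps the helper easy to try out with a stub. A value that sits at or near zero for long stretches is the signal, described above, to grow the volume or move to io1.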
- EBS Computing Throughput
Right, so just a quick lecture on computing throughput from the IOPS. For gp2, the throughput in megabytes per second equals the volume size in gigabytes, times the IOPS per gigabyte that you get, times the I/O size in kilobytes, and you divide by 1,024 to go from kilobytes to megabytes. It’s a formula you have to remember, so let’s see how it works. For example, 300 I/O operations per second times 256 KB per operation is 75 MB per second. So how do we get there? Well, if we have a 100 GB volume, we get 100 times 3 I/O operations per second.
So it’s 300 IOPS. As for the I/O size in kilobytes, the maximum I/O size you can ever get on Amazon EBS is 256 KB. So by doing this quick math, 100 times 3 times 256 KB, we get 76,800 KB per second, which is 75 MB per second. The throughput limit on gp2 is 250 MB per second. That means that once your volume is larger than, and let me get this right, 334 GB, growing it won’t increase your throughput any further. So the max throughput you get on gp2 if you use large I/Os is 250 MB per second. For io1, it is the exact same formula: you take the number of IOPS you’ve provisioned and multiply it by your I/O size in kilobytes. The limit for io1 is 256 KB for each provisioned IOPS, and you just do the math.
So if you have 1,000 provisioned IOPS times 256 KB, that gives you 250 MB per second. I just did the math in my head. The limit you get when you do this math is 500 MB per second, and that is the maximum at 32,000 IOPS. Or, if you have 64,000 IOPS, you can go all the way up to 1,000 MB per second, but that is a bit trickier to get and you get a much smaller I/O size. So look at the documentation; it will give you much more detail. But the idea is that you need to be able to multiply the number of IOPS you have by the I/O size in kilobytes, and that gives you the throughput in megabytes per second once you convert the units. And that’s what the exam would expect of you. All right, that’s it. I hope you enjoyed it and I will see you in the next lecture.
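The formula from this lecture fits in a few lines. This is only a sketch of the arithmetic: the function name is made up, and the caps are passed in as parameters using the figures quoted above (250 MB/s for gp2, 500 MB/s for io1 at up to 32,000 IOPS).

```python
def throughput_mib_s(iops: int, io_size_kib: int = 256,
                     cap_mib_s: float = 250.0) -> float:
    """Throughput in MiB/s for a given IOPS figure and I/O size.

    IOPS times I/O size gives KiB per second; dividing by 1,024 converts
    to MiB per second. The result is clipped at the volume type's cap.
    """
    raw = iops * io_size_kib / 1024
    return min(raw, cap_mib_s)


# gp2: 100 GB volume -> 100 * 3 = 300 IOPS at 256 KB per I/O
print(throughput_mib_s(3 * 100))                     # 75.0
# gp2: at 334 GB the 250 MB/s cap is reached
print(throughput_mib_s(3 * 334))                     # 250.0
# io1: 1,000 provisioned IOPS against a 500 MB/s cap
print(throughput_mib_s(1000, cap_mib_s=500.0))       # 250.0
# io1: 32,000 provisioned IOPS hits the 500 MB/s cap
print(throughput_mib_s(32000, cap_mib_s=500.0))      # 500.0
```

The same function also shows why smaller I/Os lower throughput: with a 16 KB I/O size, the 300 IOPS volume moves under 5 MB per second.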
- EBS Operation: Volume Resizing
Okay, so since February 2017, we can resize our EBS volumes, although we can only increase the volume size. So we can increase the size, and if you are on io1, we can increase the IOPS as well. After resizing your EBS volume, you have to repartition your drive. And after increasing the size, it’s possible for the volume to be in what’s called an optimization phase. During that phase, the volume is still usable, but you may not get the ideal performance. So let’s have a look at how this works. We take our 2 GB volume here, and what I’m going to do is right-click and modify the volume.
As you can see, I can even change the volume type: I can make it io1, sc1, or st1, whatever I want. I’ll keep it as gp2. I have 2 GB right now, so maybe I’ll make it 4, and that’s fine. If I wanted to make it 1, it would say that’s impossible: you can only increase the volume size, not decrease it. So maybe I’ll make it 5, just to make it a good number, and click on Modify. Now it asks: are you sure you want to modify this volume? It may take some time for the performance change to take full effect, and you may need to extend the OS file system. We say yes, we know how to do this. And now, as you can see after refreshing, the volume very quickly goes to in-use, modifying, and there is a percentage that you can follow.
You also get this information right here. So we have to wait a little bit. I’ll refresh, and it’s already 5 GB. That was very quick because it’s a very small volume, and now it’s in the optimizing state, at 11%. During this period of optimization, I can still use my volume, but I may not get the best performance. How quick the optimization is depends on the volume type. But what I want to show you is that if we go to our EC2 instance and run lsblk, we see that our disk is now 5 GB and mounted at the same place. So Linux automatically saw that the disk became 5 GB.
However, if I run df -h, which is a way to see the file systems and their sizes, we can see that /dev/xvdb still only has 2 GB assigned, so we would need to extend the partition and file system. You can find the instructions online. That’s not the point of the SysOps exam, unfortunately, but the idea is that you can now extend this 2 GB partition all the way to the full 5 GB. And that’s the whole idea. So cool. I hope you enjoyed it, and I hope you understand how you can resize your volumes. It is tremendously important to know how to do this as a SysOps, and to be ready for the exam you should know that you can resize your volumes without detaching them from your EC2 instance. So you can resize the volumes while your EC2 instance is still in use. And that’s perfect. So I hope you enjoyed it and I will see you in the next lecture.
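The console steps above map onto two EC2 API calls. Here is a hedged sketch, assuming a boto3-style EC2 client is passed in: `modify_volume` requests the change and `describe_volumes_modifications` reports the modifying and optimizing states seen in the demo. The helper names and the volume ID are hypothetical, and the increase-only check simply mirrors the rule stated in the lecture.

```python
def grow_volume(ec2, volume_id: str, new_size_gib: int) -> dict:
    """Request a size increase and return the modification record."""
    volume = ec2.describe_volumes(VolumeIds=[volume_id])["Volumes"][0]
    if new_size_gib <= volume["Size"]:
        # Mirrors the console behaviour: the size can only go up.
        raise ValueError(
            f"new size must be larger than the current {volume['Size']} GiB"
        )
    response = ec2.modify_volume(VolumeId=volume_id, Size=new_size_gib)
    return response["VolumeModification"]


def modification_state(ec2, volume_id: str) -> tuple:
    """Return (state, progress percent) of the latest modification."""
    result = ec2.describe_volumes_modifications(VolumeIds=[volume_id])
    latest = result["VolumesModifications"][0]
    return latest["ModificationState"], latest.get("Progress")


# Usage (needs boto3 and AWS credentials; the volume ID is a placeholder):
#   import boto3
#   ec2 = boto3.client("ec2")
#   grow_volume(ec2, "vol-0123456789abcdef0", 5)
#   print(modification_state(ec2, "vol-0123456789abcdef0"))
```

Neither call touches the file system inside the instance. On Linux that last step is typically done with tools such as `growpart` for a partition and `resize2fs` or `xfs_growfs` for the file system, depending on how the disk was formatted.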
- EBS Operation: Snapshots
Okay, so now let’s talk about EBS snapshots, because this is a very important topic. Snapshots are incremental: they only back up changed blocks. The backups use the I/O of your EBS volume, so you shouldn’t run them while your application is handling a lot of traffic; you should usually back up when things are quieter. The snapshots are stored internally in S3, but you won’t see them there directly; you just get billed for the S3 usage of your EBS snapshots. You don’t have to detach your volume to take a snapshot, but it’s recommended to do so, and you get a maximum of 100,000 snapshots per account. Good to know. You can copy your snapshots across regions, and we’ll have a look at how to do this. And we can create AMIs directly from a snapshot, so that’s very nice. EBS volumes that are restored from a snapshot, though, need to be pre-warmed.
We can use the fio command that we saw, or the dd command, to read the entire volume. That pre-warms the volume and makes sure it gets its optimal performance right away. Snapshots can also be automated using Amazon Data Lifecycle Manager. It’s a little-known option, but I’ll show it to you in a second. Okay, so let’s say I want to snapshot my 5 GB EBS demo volume. That one, if you remember, just has a Hello World text file on the data volume. So I right-click on it and choose Create Snapshot. I’ll describe it as backup of my EBS volume, or whatever you want to name it. You can add some tags to it, and then you click on Create Snapshot.
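On the pre-warming point mentioned above: the idea is simply to read every block of the restored volume once. The lecture names fio and dd for this; the sketch below does the same sequential read in Python as an illustration, not as a replacement for those tools. The function name and the device path in the usage comment are hypothetical.

```python
def prewarm(device_path: str, block_size: int = 1024 * 1024) -> int:
    """Read the whole device once, front to back, and return bytes read.

    The data is discarded, which is the same idea as reading the device
    with dd and sending the output to /dev/null.
    """
    total = 0
    # buffering=0 gives a raw file object, so each read goes to the OS.
    with open(device_path, "rb", buffering=0) as device:
        while True:
            chunk = device.read(block_size)
            if not chunk:          # empty bytes means end of device
                break
            total += len(chunk)
    return total


# Usage (run as root on the instance; the device name is a placeholder):
#   print(prewarm("/dev/xvdf"))
```

Because this reads the entire volume, it uses the volume’s I/O while it runs, so it is best done before the volume takes production traffic.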
And here we go: the create snapshot request has succeeded. What we see here is that the snapshot is starting, and it’s going to back up only the blocks that are used. So right now we have to wait, and the backup may take a little bit of time. We have the progress on the right-hand side as a percentage, so I’ll just pause until it is done. Okay, so my snapshot has now been created. From here, I can delete it, or I can create a volume, so I can create a new EBS volume straight from this snapshot.
I can create an image, so I could create a whole AMI from this snapshot if I wanted to, which is super fun. I could copy it to a different region if I wanted to, and I could also encrypt it. We’ll see this in a second. So that’s what I wanted to show you with snapshots; it’s quite easy. As you saw, I took a snapshot while my EBS volume was still in use, and it still worked, so that’s good to know. Lastly, something that’s not well known, but that I like to show because I want to teach you real-world skills and not just how to pass the exam: if you go to Lifecycle Manager, below Snapshots, you’re able to schedule and manage the creation and deletion of EBS snapshots. You’ve probably never seen this before.
So we can create a Snapshot Lifecycle Policy, and I’ll just call it my policy. Here we can target volumes by tag, for example Name equals EBS demo, and automate the snapshots of all the volumes that are named EBS demo. So you could go crazy with your tags and do whatever you want. Then you define your schedule: you say you want to create snapshots every X hours, such as 12 or 24, and when you want the snapshots to start. Basically, you should choose a time when traffic is low for your application. Then there is the retention rule.
That is the number of snapshots that will be retained. Maybe I want to retain only seven snapshots and back up every day. That means that if I choose 24 hours, as it says, snapshots are retained until they are seven days old and we keep a maximum of seven snapshots. Then we can also copy the tags from the volume if we want to, and add more tags to the snapshots as well. Finally, we just specify an IAM role to perform the snapshot action. And that’s it. The really cool thing is that with this policy, which I won’t create, you would have an automated backup solution for your EBS volumes. So it’s very easy to do, with no need for AWS Lambda.
You can just use this tool, the Snapshot Lifecycle Policy. It’s something good to know as a SysOps, to save you a lot of time and frustration. So I hope you enjoyed it. I will see you in the next lecture.
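The policy walked through in the console can also be expressed through the Data Lifecycle Manager API. The sketch below builds the policy details for the example above (volumes tagged Name = EBS demo, a snapshot every 24 hours, seven retained) and submits them through a boto3-style DLM client passed in by the caller. The helper names, the 03:00 start time, and the role ARN are placeholders chosen for illustration, not values from the lecture.

```python
def snapshot_policy_details(tag_key: str, tag_value: str,
                            interval_hours: int = 24,
                            start_time: str = "03:00",
                            retain_count: int = 7) -> dict:
    """Build the PolicyDetails structure for an EBS snapshot policy."""
    return {
        "ResourceTypes": ["VOLUME"],
        # Every volume carrying this tag is snapshotted by the policy.
        "TargetTags": [{"Key": tag_key, "Value": tag_value}],
        "Schedules": [{
            "Name": f"every-{interval_hours}-hours",
            "CopyTags": True,                      # copy volume tags to snapshots
            "CreateRule": {
                "Interval": interval_hours,
                "IntervalUnit": "HOURS",
                "Times": [start_time],             # UTC, pick a quiet period
            },
            "RetainRule": {"Count": retain_count}, # keep only the newest N
        }],
    }


def create_snapshot_policy(dlm, role_arn: str, details: dict) -> str:
    """Create the lifecycle policy and return its ID."""
    response = dlm.create_lifecycle_policy(
        ExecutionRoleArn=role_arn,
        Description="Automated snapshots of tagged EBS volumes",
        State="ENABLED",
        PolicyDetails=details,
    )
    return response["PolicyId"]


# Usage (needs boto3, credentials, and an existing IAM role; placeholders):
#   import boto3
#   details = snapshot_policy_details("Name", "EBS demo")
#   create_snapshot_policy(boto3.client("dlm"), "arn:aws:iam::...", details)
```

With a 24-hour interval and a retention count of seven, the policy keeps roughly a week of daily snapshots, which matches the retention described in the lecture.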