Amazon AWS SysOps – S3 Storage and Data Management – For SysOps (incl Glacier, Athena & Snowball)
- Section Intro
Welcome to the second part on S3. This part is dedicated to learning all there is to know about S3 for SysOps. Now, S3, as you can see, is very complete, and we’ll learn all the technologies from within this box. This section is quite intense, but it is so important to go over it because the exam will ask you so many questions about S3, and some of them are very tricky. So what are we going to learn? We’re going to learn about versioning, but in depth this time, MFA Delete, and default encryption.
We’ll also cover S3 access logs, cross-region replication, pre-signed URLs, CloudFront, S3 Inventory, storage tiers, lifecycle rules, S3 analytics, Glacier, Snowball, Storage Gateway, and Athena. As you can see, this section is full of learning, and it will be very complete. So take your time, feel free to revisit any lecture, and I will wait for you in the next lecture.
- S3 Versioning Advanced – for SysOps
Okay, so let’s go a little bit more advanced with S3 versioning, and I want to show you a few things that you need to know as a SysOps. Number one is around what happens to file versions when you change or encrypt a file. We know that S3 versioning will create a new version each time we change a file, and that includes when we encrypt a file. It’s super nice, because sometimes hackers want to encrypt your data and then ask you for a ransom, and they will only decrypt it after you give them some money. And so if you have S3 versioning on your bucket, then even if they encrypt a file, they can’t overwrite what has already been written; it will just create new versions of the file, so you can still recover your data because you have S3 versioning enabled. We’ll see this in the hands-on in a second.
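By the way, in the hands-on I’ll tick the versioning box in the console, but you could also enable it from the CLI. Here is a minimal sketch of that, assuming a bucket named s3-versioning-demo-stefan (adapt the name to your own bucket):

    # Enable versioning on an existing bucket (bucket name is just an example)
    aws s3api put-bucket-versioning \
      --bucket s3-versioning-demo-stefan \
      --versioning-configuration Status=Enabled

    # Check the versioning status of the bucket
    aws s3api get-bucket-versioning --bucket s3-versioning-demo-stefan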
The other thing is that if you delete a file in your S3 bucket, as you know, it just adds a delete marker onto the versioning of that file. And so if you try to delete the bucket altogether, you need to first remove all the file versions within it. That’s something that isn’t really well known, and I want to show you this in the hands-on as well. So let’s get started. Okay, so I’m going to create a bucket right here and I’ll just call it s3-versioning-demo-stefan. Here we go. So I’ll just have this bucket, and the region is going to be Ireland. Then I’ll click on Next. I want to keep all versions of an object in the same bucket, so I tick versioning, and the rest I will leave as is. I’ll click on Next and then I will just leave everything here as is as well.
Click on Next and then create my bucket. So here we go, my bucket s3-versioning-demo-stefan is created. And in here I will go ahead and upload some files. So I’m going to click on Upload, Add files, and then I’m going to add, for example, my beach picture. Click on Next, click on Next, everything looks good, click on Next and then Upload. And here we go, my beach picture has been uploaded. As we can see here, it has a last-modified date, and if I click on it, the encryption is set to None. So what if I want to encrypt that file? Well, I click on Properties, and for that one file I will click on Encryption and say, okay, I would like AES-256 encryption, which is S3 server-side encryption, to encrypt my data.
So I’ll purposely encrypt it, click on Save, and go back to my overview. And if I click back on beach.jpg, we can now see that the encryption is AES-256. But what we may not realize is that, because we have versioning on, we have actually created a new, encrypted version of our file. The past version still says encryption None, but the new version says encryption AES-256. So encrypting a file in a versioned bucket makes a new version appear. This is something you have to realize. Now, if I go and click on my file, right-click, delete it, and confirm the delete, as you can see my file is gone and my bucket seems to be empty. But if I switch Versions to Show, I can see that my file is actually still here; a delete marker was just added on top of it. So the thing that’s super interesting to see now is what I’m going to show you in the CLI. I’m going to do aws s3 ls, and this lists all my buckets.
So I’ll do aws s3 ls s3:// and then the name of my bucket, which is s3-versioning-demo-stefan. And as you can see, when I do this command, it returns nothing; nothing seems to be in my bucket. So something I may want to do is remove the bucket: aws s3 rb, then s3:// and the name of the bucket.
So I want to remove that bucket, and it says remove bucket failed: an error occurred, BucketNotEmpty. So the idea is that even though ls returned nothing at all, when you do remove bucket it doesn’t work; it says the bucket is not empty. The trick is that ls, even though it’s supposed to list the objects, only lists the files that are not deleted; it doesn’t show the older versions or the delete markers.
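To make that visible from the CLI, here is a rough sketch of what I’m about to do in the console: list all the versions and delete markers, delete them one by one by version ID, and only then remove the bucket. The bucket name, object key, and version ID below are just placeholders from my setup:

    # Show everything ls hides: all object versions and delete markers
    aws s3api list-object-versions --bucket s3-versioning-demo-stefan

    # Permanently delete one specific version (or delete marker) by its version ID
    aws s3api delete-object \
      --bucket s3-versioning-demo-stefan \
      --key beach.jpg \
      --version-id "EXAMPLE_VERSION_ID"

    # Once every version and delete marker is gone, removing the bucket succeeds
    aws s3 rb s3://s3-versioning-demo-stefan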
So, to make it very clear: when I set Versions to Hide, I get the same “bucket is empty” outcome as ls. But if I set it to Show, it turns out there is actually still some stuff in there. So what I want to do is delete all these versions. I go to Actions, then Delete, so that every version ID of my files gets deleted, and click on Delete. And now what I should be able to do is remove my bucket.
And now it succeeded: it says remove_bucket: s3-versioning-demo-stefan. So what I want to show you here is that even though ls doesn’t return any file, that doesn’t mean there are no files in your bucket. It may mean that a file has been versioned and deleted, and that there’s a delete marker in there. Once you understand this, I think you really understand how versioning works. I hope that makes sense, and I will see you in the next lecture.
- S3 MFA Delete
All right, we are going to talk about MFA Delete in depth. So MFA Delete is about using MFA, multi-factor authentication, and that will force our users to generate a code on a device (it could be your mobile phone or your hardware key) to do important operations on S3. To use MFA Delete we have to first enable versioning on an S3 bucket, but you already know this. And when we will need MFA is to permanently delete an object version and to suspend versioning on the bucket. So these are the most important destructive actions that we’ll need MFA for.
But if we just enable versioning, list deleted versions, or delete a version by just adding a delete marker, this is fine; we don’t need MFA for that. The one important thing to know is that MFA Delete can be enabled or disabled only by the bucket owner, which is the root account. So even if you have an administrator account, you cannot enable MFA Delete. You’ll have to use the root account, and on top of it, because it’s really not easy, we can only configure MFA Delete using the CLI for now. So it’s really hard to set up, but I’ll show you how to do it, and for this you need to use root credentials. There is no way of doing it in the console right now; it has to be done through the CLI. So let’s go ahead and walk through this. You don’t have to do the hands-on with me; you can just watch me, because it’s really clunky and painful. But the idea to understand is that only the root account can enable and disable MFA Delete, and that you’ll need MFA only to permanently delete an object version or suspend versioning on the bucket. So let’s get started with the hands-on. Okay, so let me create a bucket, and that bucket will be called mfa-demo-stefan.
And I will click on Next for the options. I have to enable versioning, otherwise I cannot use MFA Delete. So I’ll enable versioning, click on Next, click on Next, and then click on Create Bucket. Okay, so my bucket has been created and, as you can see, I can upload a file. So I’ll quickly upload a coffee file, and then that file I can safely delete. And now it’s been deleted, but we know already from versioning that this just created a delete marker. And I could go ahead and delete that delete marker, and nothing would complain.
And now my file would actually just come back. So now we’re going to enable MFA Delete and see how we cannot do these things anymore. The first thing you have to do is go to the IAM management console, because you need to gather a few things. To get there, you go to your account (you have to be logged in with the root account) and you click on My Security Credentials, and after a prompt you get to that screen. The first thing to do there is make sure that you have an MFA device linked to your account. For me, I do.
And I will have to use that MFA device’s serial number later, so I have to keep that in mind. The other thing I have to do is generate access keys. These are root access keys, and I do not recommend at all that you generate those. But for MFA Delete we have to use root access keys, so I’ll go ahead and create new ones, and I’m going to show them to you.
But it’s fine, because I’m going to delete them right away at the end of this lecture. The idea is that these are keys you should normally never generate or keep; if you do generate them, it’s only to enable MFA Delete, and then you’re done. So to enable MFA Delete we have to go ahead and run a few commands. The first one configures our CLI: I’ll configure my CLI to use a profile, and I’ll call it root-datacumulus, after my company. So I’ll basically create a profile on the CLI for this. I’ll copy my access key, and I’ll copy my secret access key entirely; make sure you copy the entire thing.
Here we go. And the default region name is going to be eu-west-1. Okay, so now my profile has been defined, and basically any time I run a command, for example aws s3 ls, and I add --profile root-datacumulus, I’m going to be using these credentials. All right, so we see we have our mfa-demo-stefan bucket in here. Now I have to go ahead and run a fairly complicated command, which I’ll just copy here. It’s aws s3api put-bucket-versioning. Then I have to give the bucket name; here you would have to modify the bucket name based on what you named yours if you wanted to follow this tutorial. Then for the versioning configuration we say Status=Enabled and MFADelete=Enabled. So this will enable MFA Delete.
Then you have to specify the ARN of your MFA device. This is something you get from the security credentials page; one second, I have to get it from my MFA section here. So I’ll copy that serial number. And then you have to specify an MFA code, and this is where it gets tricky: you actually have to open your MFA application, so I’m opening it right now, and you find your code on it. I have to type mine quickly, 001712, press Enter, and now it’s done: MFA Delete is enabled. Next, back in the AWS console, here is my coffee.png file. I’m going to delete it, and as you can see, this works, it was deleted. But that’s fine, because it wasn’t a permanent delete, it just added a delete marker. Now, if I want to go ahead and actually delete that delete marker, or delete whatever version of the file I want, I’ll do Actions, Delete, Delete.
Nothing happens. That’s because I’m not using MFA, and because I’m not using MFA, it’s not able to delete the files. To use MFA you would have to go through the CLI as well, similar to what we did; through the console you can’t even do this. So the idea is that people cannot permanently delete any files right here, because they’re protected by MFA Delete. Same thing if I try to suspend versioning and click on Save: it complains that MFA is required when MFA Delete is enabled. So we’ve enabled it and we can’t suspend versioning anymore. The idea is that now it’s impossible to delete a file permanently, thanks to MFA Delete. If I wanted to disable it, I would go back to my command and just say MFADelete=Disabled.
And then I just have to find the right code on my MFA device right now. Let’s see what the code is: it is 6744-7075. And now it worked. Okay, so now I’m going to go to, for example, one of my files, and I can click and delete it. Click on Delete, and yes, now that version ID is just gone. So now I can delete all my files right away, I don’t need MFA anymore, and I could also suspend versioning if I wanted to. So that’s it for MFA Delete.
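To recap the CLI side of what we just did, here is a minimal sketch of the commands. The profile name, bucket name, account ID, MFA device ARN, object key, version ID, and MFA code below are all placeholders from my setup, so adapt them to yours:

    # Configure a CLI profile with the root access keys (delete the keys right after)
    aws configure --profile root-datacumulus

    # Enable MFA Delete (must be run with the root credentials)
    aws s3api put-bucket-versioning \
      --bucket mfa-demo-stefan \
      --versioning-configuration Status=Enabled,MFADelete=Enabled \
      --mfa "arn:aws:iam::123456789012:mfa/root-account-mfa-device 123456" \
      --profile root-datacumulus

    # Permanently deleting a specific version now requires the MFA serial and a code
    aws s3api delete-object \
      --bucket mfa-demo-stefan \
      --key coffee.png \
      --version-id "EXAMPLE_VERSION_ID" \
      --mfa "arn:aws:iam::123456789012:mfa/root-account-mfa-device 123456" \
      --profile root-datacumulus

    # Disable MFA Delete again when you are done
    aws s3api put-bucket-versioning \
      --bucket mfa-demo-stefan \
      --versioning-configuration Status=Enabled,MFADelete=Disabled \
      --mfa "arn:aws:iam::123456789012:mfa/root-account-mfa-device 123456" \
      --profile root-datacumulus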
As you can see, it works great, but we need to use the CLI and it’s a bit hard to do. And absolutely, after you’ve done this, please, please delete these root credentials right away, so that no one can use your root account credentials. It is really bad to have root account access keys lying around, but we had to use them for this lesson, so delete them right away. Now even if you saw mine, I’m fine, you won’t be able to use them anyway. So that’s it for this lecture. I hope you enjoyed it and that you understand better how MFA Delete works. It’s a bit hard to use, I’ll be honest, but it’s good to see it once. And I will see you in the next lecture.
- S3 Default Encryption
Okay, so just a very short lecture on S3 default encryption and bucket policies. A while ago, maybe two or three years back, the old way to enforce default encryption was to use a bucket policy that would refuse any HTTP request without the proper headers. So for example, you would say: I’m denying any PUT request as long as there is no header saying that there is going to be server-side encryption with AES-256. And then you would add another statement saying: I’m going to deny anything that is unencrypted. So basically you’re saying, okay, I want to force encryption by denying requests that don’t have the right headers, which will indeed force encryption.
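Just so you can picture it, here is roughly what that old-style policy looks like if you apply it from the CLI. The bucket name encrypted-stefan is just my example, and this is a sketch of the common pattern, not the only way to write it:

    # Deny uploads that do not request SSE-S3 (AES256) encryption (example bucket name)
    aws s3api put-bucket-policy --bucket encrypted-stefan --policy '{
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "DenyIncorrectEncryptionHeader",
          "Effect": "Deny",
          "Principal": "*",
          "Action": "s3:PutObject",
          "Resource": "arn:aws:s3:::encrypted-stefan/*",
          "Condition": {
            "StringNotEquals": { "s3:x-amz-server-side-encryption": "AES256" }
          }
        },
        {
          "Sid": "DenyUnencryptedObjectUploads",
          "Effect": "Deny",
          "Principal": "*",
          "Action": "s3:PutObject",
          "Resource": "arn:aws:s3:::encrypted-stefan/*",
          "Condition": {
            "Null": { "s3:x-amz-server-side-encryption": "true" }
          }
        }
      ]
    }'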
But the new way is way easier: you just have a default encryption option in S3, you tick it, and you’re done. The one thing you need to know, though, is that bucket policies will be evaluated before the default encryption setting. So let’s quickly have a look at this in the console so it makes more sense to you. If I go ahead and create a bucket, I’ll call this bucket encrypted-stefan, and I click on Next. As you can see in there, there is a default encryption setting, so you can automatically encrypt objects when they’re stored in S3.
For this you would say either: I want to use SSE-S3 encryption (AES-256), or you can use KMS and select a key of your choosing. So for example, if I want to use AES-256, I select it, click on Next, Next, and Create Bucket. My bucket is now created with these settings. If I click on my bucket and go to Properties, in the bottom left I should see that default encryption is indeed set to AES-256.
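If you prefer the CLI, this is roughly the equivalent of ticking that default encryption box (the bucket name is just my example):

    # Set SSE-S3 (AES256) as the default encryption for new objects in the bucket
    aws s3api put-bucket-encryption \
      --bucket encrypted-stefan \
      --server-side-encryption-configuration \
      '{"Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}]}'

    # Verify the default encryption configuration
    aws s3api get-bucket-encryption --bucket encrypted-stefan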
In the console properties, it also gives me a little warning saying that S3 will evaluate and apply bucket policies before applying the default bucket encryption setting. That means that if you did have a bucket policy (we don’t have one right now), it would be evaluated before the object gets encrypted. So remember, the bucket policy is the old way and default encryption is the new way, but keep in mind that the exam may sometimes show you a bucket policy.
You just need to read that policy carefully and make sure it checks for the right headers. Otherwise, the easy answer to the question “how do I enable default encryption?” is to just use the default encryption setting in S3. Now let’s test it out and make sure it works. We’ll upload a coffee file, and at upload time it’s unencrypted.
But what happens is that, yes, the encryption automatically becomes AES-256, thanks to the default encryption setting. Let me show you again: if we add a file now, we add the beach picture, and I click on Next just to show you exactly what I mean. Click on Next and you see the encryption says None here. Yet when I upload it, the beach file will come out with encryption AES-256, thanks to the default encryption, obviously. So that’s about it, that’s all you need to understand about default encryption. Just remember to clean up your files and your buckets when you’re done, and I will see you in the next lecture.
- S3 Access Logs
Okay, so now let’s talk about Amazon S3 access logs. Say that, for audit purposes, you want to log all access to your S3 buckets. That means that any request made to Amazon S3, from any account, authorized or denied, should be logged into another S3 bucket, so you can analyze it later, for example using a data analysis tool or something we’ll see in this section called Amazon Athena. So here is the idea with the diagram: we make requests to a bucket, and that bucket has logging enabled into another bucket, a logging bucket. Once we’ve enabled S3 access logs, all the requests will be logged into the logging bucket. Very easy, very simple.
The log format is defined at the link here, so if you’re interested in how to read these logs, just click on it. Okay, now there’s something you need to know about these logging buckets. It’s pretty intuitive, but you need to hear it once: never, ever set your logging bucket to be the same bucket you are monitoring. If you set the logging bucket and the monitored bucket to be exactly the same, you will create a logging loop and your bucket will grow in size exponentially. It’s very simple to picture: say we have a bucket that happens to be our application bucket and also the bucket that is going to receive all the logs.
Therefore, whenever a user puts an object, the bucket is going to log into itself, which creates a new object, which gets logged as yet another new object, and so on: an infinite logging loop, and that’s how your bucket grows in size exponentially. So my advice to you is: do not try this at home. This little mistake will end up in a huge AWS bill. Always separate your application bucket and your logging bucket. Now let’s go hands-on to see how this works. I’m first going to create a bucket, and I’ll call it stefan-s3-access-logs; this is where all my S3 access logs will go. Click on Next, keep the rest as is, and create the bucket. So now I have my S3 access logs bucket, excellent. Then I’m going to create another bucket, and I’ll just call it my-sample-bucket-monitored-stefan. All right, click on Next, Next, and Create Bucket. Now my bucket is created, and what I’m going to do is go to Properties and turn on Server Access Logging. So I’ll enable logging, I have to choose a target bucket, and I’ll choose my stefan-s3-access-logs bucket. I can also enter a target prefix, but I won’t do this right now. Click on Save, and here we go.
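For reference, here is a rough CLI equivalent of what I just did in the console. The bucket names and prefix are just my examples, and note that when you do this outside the console you may also need to grant the S3 log delivery service permission to write into the target bucket (the console normally handles that for you):

    # Turn on server access logging for the monitored bucket,
    # sending logs to a separate logging bucket under an optional prefix
    aws s3api put-bucket-logging \
      --bucket my-sample-bucket-monitored-stefan \
      --bucket-logging-status '{
        "LoggingEnabled": {
          "TargetBucket": "stefan-s3-access-logs",
          "TargetPrefix": "monitored/"
        }
      }'

    # Check the logging configuration
    aws s3api get-bucket-logging --bucket my-sample-bucket-monitored-stefan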
Anything that’s done to my bucket from now on will be logged. So I’ll enable versioning, then maybe go to Overview and upload a file, say my coffee file, and then delete that coffee file. Maybe I’ll upload a beach file too, and that beach file I’m going to encrypt: I’ll change the encryption to AES-256 and click on Change. I’ll also show the versions. Here we go, I’ve done a lot of things, and all of these actions should be logged directly into the other S3 bucket.
So if I go to Amazon S3 and open stefan-s3-access-logs, as you can see there’s nothing right now, but I’ll just wait a while and come back, maybe after lunch, and I should see something in there. So hang in there. Okay, it took about two hours, but a lot of logs have now arrived in the S3 bucket that holds my S3 access logs. Let’s take one of these logs and download it to see what it’s about. I’ll click on it and open it as a text file, and what I can see is that a request was made: a PUT object on beach.jpg.
And this is the IP the request was made from, the time at which it was made, some security token involved in the request, some information around the signature, and the response code: it was 200, meaning it worked. Then there’s the URL that was requested and some information about the browser. So we get a lot of these text files, and they’re pretty structured, which means we can use Athena in the next lectures to analyze this kind of file and make sense of it, and maybe look at, for example, denied requests. So that’s it for S3 access logs. Just remember that they can be enabled for a specific bucket and land in another bucket, that you can use a prefix if you want all these logs under a certain prefix, and that it can take a few hours for them to arrive in the logging bucket. All right, I hope you liked it, and I will see you in the next lecture.