Amazon AWS DevOps Engineer Professional – Incident and Event Response (Domain 5) & HA, Fault T… part 8
- DynamoDB – Review Part II
Okay, so say that we want to have a list of all the changes that happen to our table — updates, deletes, new writes and so on. Not reads, actually: just writes, updates and deletes. For this, we can enable streams. To enable streaming, we just click on Manage Stream and say, okay, we want to create a DynamoDB stream with new and old images. That means that whenever an item is created, updated or deleted, I'll get both the new value and the old value for it, which is great. I'll enable this, and here we go, we have a stream. And using this stream, what we're able to do is hook it up to a Lambda function and have that Lambda function read whatever is going on in the stream.
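Just as a reference point, here is a minimal boto3 sketch of what enabling the stream amounts to — the table name "demo" matches the demo table used in this lecture, and the region is an assumption:

```python
import boto3

dynamodb = boto3.client("dynamodb", region_name="eu-west-1")  # assumed region

# Enable a stream on an existing table, capturing both the old and new item images
# for every write, update and delete.
dynamodb.update_table(
    TableName="demo",
    StreamSpecification={
        "StreamEnabled": True,
        "StreamViewType": "NEW_AND_OLD_IMAGES",
    },
)

# The stream ARN is what the Lambda trigger (event source mapping) will point at later.
stream_arn = dynamodb.describe_table(TableName="demo")["Table"]["LatestStreamArn"]
print(stream_arn)
```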
So for this, very easy: you can create a Lambda function. So let's go into Lambda — oops, let's go into Lambda — and create a quick function for this. Back in Lambda, I'm going to create a function, and I'm going to use a blueprint, so I'll type "dynamodb" and pick the dynamodb-process-stream-python blueprint. Here we go, we'll take this one, configure it, and I'll call it LambdaDynamodbStream. We'll create a new role from the AWS policy templates; the role name will be DynamodbRoleForLambda, and for the policy template itself I'll just type in DynamoDB — the "Test harness permissions" template. I think this should do it; we'll see. And the trigger is going to be on the demo table, with a batch size of 100.
That means we can get up to 100 items at a time. The batch window is zero, meaning records are delivered as soon as possible, and the starting position — where we want to start reading from — is Latest. What this should remind you of is that a DynamoDB stream is, under the hood, a Kinesis stream, and that will be super important to remember. Then we need to enable the trigger and have the proper permissions, so hopefully this is good. The Lambda function code will just print the events into the log. So let's create this function. The Lambda function was created, but it seems that it cannot have a trigger from DynamoDB — that's because it is missing some permissions. So let's go into IAM and find this IAM role right here.
And we're going to add DynamoDB permissions very quickly. I'm just going to search for DynamoDB in here and give the role full access, just to demonstrate that it works. Okay, so back in here, now I'm able to add a trigger, and this trigger is going to be from a DynamoDB table — our demo table. Everything else stays the same. We'll click on Add, and now hopefully this should work. Right now it still doesn't work, so let's just wait a little bit — maybe it takes some time for the new permissions to propagate. And after trying one more time, everything has worked. So now we have DynamoDB triggering our Lambda function.
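If you wanted to wire up the same trigger programmatically instead of through the console, a rough sketch would look like this — the function name is the one from this demo, and the execution role also needs the stream-read permissions (dynamodb:GetRecords, GetShardIterator, DescribeStream, ListStreams):

```python
import boto3

region = "eu-west-1"  # assumption: same region as the demo table

# Look up the stream ARN of the demo table, then point a Lambda event source mapping
# at it, mirroring the console settings: batch size 100, start from the latest records.
stream_arn = boto3.client("dynamodb", region_name=region).describe_table(
    TableName="demo"
)["Table"]["LatestStreamArn"]

boto3.client("lambda", region_name=region).create_event_source_mapping(
    EventSourceArn=stream_arn,
    FunctionName="LambdaDynamodbStream",  # function name used in this demo
    StartingPosition="LATEST",
    BatchSize=100,
    Enabled=True,
)
```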
And what that Lambda function will do is log whatever it receives from DynamoDB into the logs. So let's just try it out. Let's go to DynamoDB. I'm going to create an item — it's going to be b, c and d — and click on Save. Then I'm going to take this one and edit it, try to change the key value and click on Save — and it says you cannot modify the unique keys. That's right, that's absolutely correct. So I'm just going to change one other attribute instead, and here we go. So we've just made a new item and an update, and now maybe I'll delete this one. So I'll delete it.
Here we go. So now if we go to Lambda, go to Monitoring and click on "View logs in CloudWatch Logs", hopefully we should be seeing the logs of our Lambda function in here. Perfect. We're logging all the events that come from the DynamoDB stream, so we can see that this one was an INSERT. Then we can scroll down and see a MODIFY — this is when I updated the record — and we can see the old record and the new record under NewImage and OldImage. And finally there is a REMOVE that happened as well. So everything has been logged, and this DynamoDB stream is working perfectly. So why do we need a DynamoDB stream? Well, we need one so that we can react in real time to changes, maybe with a Lambda function — for example to send email notifications, and so on.
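For reference, the logging that the blueprint-style function does boils down to roughly this sketch (the actual blueprint code differs slightly):

```python
import json

def lambda_handler(event, context):
    # Each invocation receives a batch of DynamoDB stream records (up to the batch size).
    for record in event["Records"]:
        event_name = record["eventName"]      # INSERT, MODIFY or REMOVE
        images = record["dynamodb"]
        new_image = images.get("NewImage")    # present for INSERT and MODIFY
        old_image = images.get("OldImage")    # present for MODIFY and REMOVE
        print(f"{event_name}: new={json.dumps(new_image)} old={json.dumps(old_image)}")
    return f"Successfully processed {len(event['Records'])} records."
```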
Or we can do other things with it, which is kind of cool. What we can do, for example, is go ahead and enable global tables. To enable a global table, you need to make sure that the table is empty and that DynamoDB Streams are enabled. We have enabled DynamoDB Streams, but we need to delete this element first to make it work. With global tables, we are able to create a replica of this table in another region. So let's add the region — I'll choose EU (London). The region is ready, I'll click on Continue to proceed, and it's going to create a replica table. And now my table is going to be truly global. What that means is that when I make a change in one region, it is going to be replicated to the other region, and vice versa.
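By the way, the same replica setup can be sketched with the original (2017-version) CreateGlobalTable API — this is just a hedged sketch, and it assumes an identically named, empty table with new-and-old-images streams already exists in every region listed:

```python
import boto3

dynamodb = boto3.client("dynamodb", region_name="eu-west-1")

# The console creates the replica table for you; with this API, the table must already
# exist (empty, streams enabled with NEW_AND_OLD_IMAGES) in each region before the call.
dynamodb.create_global_table(
    GlobalTableName="demo",
    ReplicationGroup=[
        {"RegionName": "eu-west-1"},  # Ireland – the original table
        {"RegionName": "eu-west-2"},  # London – the replica added in this demo
    ],
)
```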
Replication like this can really help when you want to reduce latency for users in other regions. Now I get an error: "Failed to assume this role. After role creation, please wait a few moments and retry." So let's wait again — it seems the DynamoDB APIs are not with me today, but this is fine, we'll just retry until it works. And after a few retries, it has worked. So now I can click on "Go to table", and this is going to open my table in the other region, so we can test that it's working. In the items right now there is nothing, and in the items over here there is nothing either. So let's just create an item — a, b and c — and save it. This was done in Ireland, and now I'm going to London and refreshing, and very, very soon I should start seeing this item.
We just need to wait for the replication to happen, which can take a little bit of time. And here we go — now we can see that my abc is here, but we also get some extra attributes: whether or not the item is being deleted, when it was updated, and where it was updated from. So the "when" is here and the "where" is here — this update was done from eu-west-1. And if I create an item here — b, c and d — and save it, this item has been created in eu-west-2. So what should happen is that if I go back here and refresh this table, I now see my second item, as we can see, and this bcd was created from eu-west-2. So with global tables in DynamoDB we have two-way replication.
We have replication from eu-west-1 into eu-west-2, and from eu-west-2 back into eu-west-1, so these are truly global tables for us. Okay, so what else can I show you? Oh yes, there's one last thing I need to show you, obviously. We have enabled DynamoDB Streams — if you go in here, we have enabled the stream, and that stream is what allowed us to have global tables. Okay? But we also saw that we can have a Lambda function reading from that stream. So what I could do is create a second Lambda function reading from that stream, and there would be two Lambda functions. And this is fine, because under the hood that stream is a Kinesis stream.
And with Kinesis, you can have one megabyte per second of writes and two megabytes per second of reads per shard, and it is the same with DynamoDB Streams. But if I went ahead and created a third Lambda function on the same DynamoDB stream, then things would not work. Why? Because we would have three Lambda functions reading from the same shard, and so we may get throttling issues. This is documented in "Capturing Table Activity with DynamoDB Streams": if you look for throttling, it says that no more than two processes at most should be reading from the same stream shard at the same time — having more than two readers per shard can result in throttling.
So this is why we cannot have three Lambda functions on the same DynamoDB stream — otherwise we may get some throttling. The way to have three Lambda functions effectively react to the same DynamoDB stream is to have only one Lambda function read from the DynamoDB stream; that Lambda function can write to an SNS topic, and from there we can have as many Lambda functions as we want subscribed to that very same SNS topic. That is definitely a good pattern. Okay. And then the last feature I want to show you is TTL, or Time to Live. For this, let's create an item — I'll call it def — and I'm going to add a new attribute that I'll call ttl.
Okay? And this attribute actually needs to be a number, so let me remove this one and add it again as a number called ttl. And I need to put an epoch timestamp in it. So if I refresh this page, epochconverter.com, I can get the current epoch time. What I can do is choose a date — for example, 1 minute from now; it was :38, now it's :39 — and say, okay, give me the epoch timestamp, which is right here. I'll paste this in, and now we're telling DynamoDB that it should expire this item after this timestamp, so 1 minute from now. To make this happen, we go to Overview and we actually edit the table settings. So where is it? Here it is: the Time to Live
attribute — you click on Manage TTL and specify the name of the TTL attribute; ttl is a good name for it. And because we have DynamoDB Streams enabled, whenever this item gets deleted it will also produce a delete event in the DynamoDB stream. We can run a preview on this and ask, what will expire within 1 hour from now? And it says this item should expire by then — perfect. Click on Continue and so on, and now we have a TTL. What that means is that in about a minute or two, DynamoDB will automatically remove this item, place a delete in my DynamoDB stream, my Lambda function will receive that delete, and, by the way, this change will also be replicated to my table in the other region.
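As a minimal sketch of the same idea in code — the table name matches this demo, but the key attribute name here is a hypothetical placeholder — you enable TTL once on the table and then write items carrying an epoch-seconds expiry attribute:

```python
import time
import boto3

dynamodb = boto3.client("dynamodb", region_name="eu-west-1")

# One-time setup: tell DynamoDB which attribute holds the expiry timestamp.
dynamodb.update_time_to_live(
    TableName="demo",
    TimeToLiveSpecification={"Enabled": True, "AttributeName": "ttl"},
)

# Write an item that should expire roughly one minute from now. TTL values are epoch
# time in seconds; expired items are deleted in the background (it can take a while
# after the timestamp), and the deletion shows up as a REMOVE event in the stream.
expiry = int(time.time()) + 60
dynamodb.put_item(
    TableName="demo",
    Item={
        "user_id": {"S": "def"},     # hypothetical partition key name for the demo table
        "ttl": {"N": str(expiry)},
    },
)
```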
So all these features can work together really, really well, and that can be quite handy. Okay, so those are all the features of DynamoDB I wanted to show you, and I think you should definitely be aware of all of them. To summarize everything we saw: we saw how to choose keys, and for this there was a blog post. We saw how to choose LSIs and GSIs — LSIs can only be created at table creation time, whereas GSIs can be created over time. We've seen capacity calculations: read capacity units, write capacity units and so on. You need to look at the calculation formulas, and you should know them.
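As a quick worked example of those formulas — one RCU gives you one strongly consistent read per second of an item up to 4 KB (or two eventually consistent reads), and one WCU gives you one write per second of an item up to 1 KB — here is a small sketch of the arithmetic, with made-up numbers for illustration:

```python
import math

item_size_kb = 6          # example item size
reads_per_second = 10
writes_per_second = 10

# Reads are billed in 4 KB blocks; strongly consistent reads need 1 RCU per block,
# eventually consistent reads need half as many.
rcu_strong = reads_per_second * math.ceil(item_size_kb / 4)   # 10 * 2 = 20 RCU
rcu_eventual = math.ceil(rcu_strong / 2)                      # 10 RCU

# Writes are billed in 1 KB blocks.
wcu = writes_per_second * math.ceil(item_size_kb / 1)         # 10 * 6 = 60 WCU

print(rcu_strong, rcu_eventual, wcu)
```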
If you type "RCU WCU formula", you'll get them directly from the DynamoDB documentation, along with how to compute them. Then there's DynamoDB DAX. DAX allows us to set up a cache in front of our DynamoDB table and make sure we can cache the hot items. Then we've seen streams, and how we can get a stream of all the events happening to our DynamoDB table and feed it to a Lambda function. And on that stream we can have at most two processes reading at the same time — so at most two Lambda functions, maybe.
We've seen how streams enable global tables. Global tables allow replication across multiple regions, and we've seen that this replication is active-active: whenever I make a change in one region, it is replicated directly over to the other region, and vice versa. We've seen TTL, Time to Live, to expire items after a certain amount of time. And that's all we've seen. So this was just a quick overview of DynamoDB, but to me you should know all of this — it was a quick refresher. I will see you in the next lecture, where we'll talk about DynamoDB patterns. All right, see you in the next lecture.
- DynamoDB – Patterns
So, just two very common patterns for DynamoDB that can come up at the exam. The first one is building a metadata index for S3. Say, for example, we write a lot of objects into Amazon S3, and we want a Lambda function to react to these S3 writes using S3 events. The Lambda function will be triggered, and what it will do is write the object metadata into a DynamoDB table. That DynamoDB table will contain the file name, the date, the size of the file, who created it, who used it, and so on. This little serverless workflow allows us to have the metadata of our S3 objects directly in a DynamoDB table. Why would you do this? Well, because we can create an API on top of this object metadata.
And thanks to this API, we can answer questions such as: can I find an object by a specific date? Can I look at the total storage used by a specific customer? Can I list all the objects with certain attributes? Or can I find all the objects uploaded within a date range? Those are the kinds of things you can build on top of the DynamoDB table, and it gives you a metadata index capability that Amazon S3 by itself doesn't provide out of the box. Okay, so this is quite a common pattern. The other pattern we've seen is to use DynamoDB with Elasticsearch. DynamoDB has an API to retrieve an item by its key, and then, by integrating DynamoDB Streams, a Lambda function and Amazon Elasticsearch, we are able to build an API to search items.
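Here is a minimal sketch of that first pattern, assuming a Lambda function subscribed to S3 ObjectCreated events and a DynamoDB metadata table whose name and key schema are hypothetical:

```python
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("s3-object-metadata")   # hypothetical metadata table

def lambda_handler(event, context):
    # Triggered by S3 ObjectCreated events; each record describes one uploaded object.
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        obj = record["s3"]["object"]
        table.put_item(
            Item={
                "bucket": bucket,                    # partition key (hypothetical schema)
                "key": obj["key"],                   # sort key
                "size_bytes": obj.get("size", 0),
                "uploaded_at": record["eventTime"],  # lets us query by date range later
            }
        )
    return {"processed": len(event["Records"])}
```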
Okay, so these are just two very common patterns for DynamoDB. Obviously there are many others, but going into the exam, these two are the ones you should remember. Okay, so that's it for this lecture. Finally, make sure to delete all the DynamoDB tables that have been created — you just right-click on the table, sorry, click on Delete table — and delete all the CloudWatch alarms and so on. That should also delete the global tables, but just make sure that this has been done.
So if you go into DynamoDB — this one — just make sure to also delete the global table, just in case. All right, that's it for this lecture on DynamoDB, and I will see you in the next lecture.
- S3 – Review
So, just a quick lecture on S3 to make sure we are on the same page. S3 is again something you should master already, so hopefully you don't need to learn anything new. We've seen that there are events for S3 buckets — we have events in here, and we are able to add events for all of these things. These S3 events can send a notification either to an SNS topic, an SQS queue or a Lambda function. And we've seen in this course that S3 events are different from CloudWatch Events. S3 events are configured at the S3 level and you don't need to enable anything specific — you just say which events you want, for example Put, Post and Copy — and then the targets will be triggered automatically.
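As a small boto3 sketch of that kind of configuration (the bucket name and Lambda ARN are hypothetical, and the Lambda function must already grant s3.amazonaws.com permission to invoke it):

```python
import boto3

s3 = boto3.client("s3")

# Send a notification to a Lambda function for every object created under the
# "uploads/" prefix. SNS topics and SQS queues can be targeted the same way with
# TopicConfigurations / QueueConfigurations.
s3.put_bucket_notification_configuration(
    Bucket="my-demo-bucket",  # hypothetical bucket
    NotificationConfiguration={
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": "arn:aws:lambda:eu-west-1:123456789012:function:on-upload",
                "Events": ["s3:ObjectCreated:*"],   # covers Put, Post, Copy, multipart
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "prefix", "Value": "uploads/"}]}
                },
            }
        ]
    },
)
```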
If you use CloudWatch Events instead, then you need to make sure you enable CloudTrail for your S3 bucket before that integration works. Okay? Then we have bucket policies. We've seen bucket policies at length in this course, and we've seen that everything can be included in here to allow other AWS services to access this bucket — for example, CloudTrail here has access to GetBucketAcl. This is the kind of thing we do to allow other AWS services to write to our S3 buckets. Or we can also have IAM permissions and an IAM role to allow access to the bucket. The bucket policy and the IAM role go hand in hand.
And the union of the bucket policy and the IAM role is what defines the access rules for your S3 bucket. Okay. Next we have cross-region replication. It is definitely possible to replicate this entire bucket into another bucket. To do so, we go to Management, and under Management we can define Replication and add a rule. That rule can set the source to be the entire bucket, or a specific prefix or tags, and then we can say whether or not we want to replicate the objects encrypted with KMS. We'll say: entire bucket, and we won't replicate the KMS-encrypted objects. Then the destination bucket: you have to choose a bucket either in this account — you can create a new bucket — or in another account.
We'll say this account, and I'll call the new bucket something like devops-course-replica, and I'll just create it. So I'll put this here — here's the bucket name. Now for the region: the really cool thing is that you can now have Same-Region Replication. This is very new, whereas Cross-Region Replication has been around forever, so the exam should only test you on Cross-Region Replication for now — but maybe one day it will test you on Same-Region Replication too. So, for example, say I want to replicate my data into Sydney. Then we choose whether or not we want to change the storage class for the replicated objects — this is fine — and whether or not we want to change the object ownership.
So yes, the ownership could change, but right now we'll leave this unchecked. We'll click on Next, and then we need to select an IAM role for this — we'll create a new role — and the rule name, which I'll call "Replication to Sydney"; it's going to be enabled, and done. So we have created a new bucket and a replication rule to copy the data all the way over to Sydney, and this is an asynchronous type of replication that happens for us (there's a small boto3 sketch of this replication and lifecycle setup at the end of this lecture). Okay, next we have lifecycle policies. If you go to Lifecycle, you can add a lifecycle rule — I'll just call it "Rule for Buckets" — and you can set a prefix, but right now I'm not going to set a prefix.
Then what I'll say is: you can define transitions for current versions and previous versions of your objects. I'll add a transition and say, okay, you need to transition into Glacier maybe 90 days after creation, and I'll acknowledge that this rule may increase costs and so on. So this is something you can do, and thanks to these transitions we can move objects into different tiers — for example Standard-IA for Infrequent Access, Intelligent-Tiering, One Zone-IA and so on. The reason we do this is to save cost and make sure we have the optimal pricing. This time I'm going to transition to Intelligent-Tiering after 90 days.
Click on Next. We could also expire objects, saying: 455 days after object creation, please delete it, and so on, and also clean up incomplete multipart uploads after seven days. Okay, why not — click on Next and Save, and we have created our first lifecycle rule. You can have many different lifecycle rules on specific folders within your bucket and so on. Okay, we've also seen that Glacier is definitely a way to archive objects. So if I take this object — I'm not sure what it is — I'll right-click and say, okay, the storage class is going to be Glacier: I click on "Change storage class" and say, you go into Glacier.
And so that means this data is going to be archived, and that means retrieving it can take minutes to hours. Okay? The reason I would do this is because maybe I want this data to be archived and I don't want to pay for it as much as I would if it stayed in the Standard storage class. And when you put data into Glacier, it is automatically encrypted as well — that's something you should know. Okay, next we have S3 encryption, and we've seen S3 encryption at length. So if we go into the encryption itself — oops, let's cancel this, click on the object again and click on Encryption — we can use no encryption, or we can use AES-256, which uses Amazon S3 server-side encryption (SSE-S3) to encrypt your data.
We can use KMS and select our own KMS key — for example, my KMS key. Or we could have client-side encryption, which means that you, as the client, encrypt the data yourself. Or we can have SSE-C, where you provide the key to AWS to encrypt the data but AWS does not store the key — and this can only be done using the SDK. So the two most popular types of encryption you'll use, obviously, are AES-256 for SSE-S3 and AWS-KMS for KMS encryption. Okay? And then we also have access logs. So let's go back into this and go to Management — sorry, Properties — where we can turn on server access logging and enable access logs.
We can choose a target bucket and so on — just a random one — and this says: whenever someone retrieves an object or does anything on my bucket, I want to log all of it into another bucket. That log can then be queried, maybe using Athena, to visualize the data and make sure no one is making bad requests, or whatever we want. Okay, and this is server access logging. So, overall, just a quick overview of everything in S3 — but I think you know S3 by now, and you should know everything that I've said. If you don't, please investigate, or look again at the developer course, where you have all this information at length. All right, that's it for this lecture. I will see you in the next lecture.
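And, as promised earlier, here is a hedged boto3 sketch of the replication and lifecycle configuration walked through in this lecture — bucket names, the role ARN and the exact settings are assumptions to adapt to your own setup, and the source bucket needs versioning enabled for replication to work:

```python
import boto3

s3 = boto3.client("s3")

SOURCE_BUCKET = "devops-course-source"      # hypothetical; versioning must be enabled
REPLICA_BUCKET = "devops-course-replica"    # hypothetical bucket in ap-southeast-2 (Sydney)
REPLICATION_ROLE = "arn:aws:iam::123456789012:role/replication-to-sydney"  # hypothetical

# Replication rule: asynchronously copy the whole bucket to the replica bucket.
s3.put_bucket_replication(
    Bucket=SOURCE_BUCKET,
    ReplicationConfiguration={
        "Role": REPLICATION_ROLE,
        "Rules": [
            {
                "ID": "ReplicationToSydney",
                "Status": "Enabled",
                "Prefix": "",  # entire bucket
                "Destination": {"Bucket": f"arn:aws:s3:::{REPLICA_BUCKET}"},
            }
        ],
    },
)

# Lifecycle rule: Intelligent-Tiering after 90 days, expire after 455 days,
# and clean up incomplete multipart uploads after 7 days.
s3.put_bucket_lifecycle_configuration(
    Bucket=SOURCE_BUCKET,
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "RuleForBuckets",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},
                "Transitions": [{"Days": 90, "StorageClass": "INTELLIGENT_TIERING"}],
                "Expiration": {"Days": 455},
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
            }
        ],
    },
)
```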