Amazon AWS SysOps – CloudFormation for SysOps Part 3
- CloudFormation cfn-signal failures troubleshooting
One very important exam question you may get is that the wait condition did not receive the required number of signals from an Amazon EC2 instance, and you need to be able to troubleshoot this. So number one, it may be because the AMI you’re using does not have the CloudFormation helper scripts installed, so you need to make sure that it has them. Obviously, if it doesn’t have the helper scripts, you go to the documentation online and you figure out how to download them to your instance. Then you need to verify that the cfn-init and cfn-signal commands ran successfully on the instance. And as we’ve seen, there are logs such as /var/log/cloud-init.log or /var/log/cfn-init.log to help us debug the instance launch. Very important. Then we can retrieve the logs by logging directly into our instance using SSH. But as we’ll see in this lecture, we won’t be able to do so unless we disable rollback on failure, because otherwise the instance gets deleted when the stack fails.
So we’ll see this in the next lecture, but in this lecture we’ll see how things would not work, which will be quite cool. And then finally, a very tricky one is that your instance must have a connection to the Internet. So if it’s in a VPC, our instance should have Internet connectivity through a NAT device if it’s in a private subnet, or through an Internet gateway if it’s in a public subnet. We’ll see this in detail when we do the VPC section. But the idea is that if our instance cannot talk to the Internet, it cannot talk to the CloudFormation service.
And if it cannot talk to the CloudFormation service, well, there is no way for the CloudFormation service to receive a signal, and things won’t work. So for example, if you want to know whether you have Internet access from the instance, you can run a curl command against aws.amazon.com. Very simple. So let’s go have a look at how we can trigger a failure for the wait condition.
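The curl check described above can also be scripted. The following is a minimal Python sketch and not something shown in the lecture: the function name `can_reach` and the 5-second timeout are assumptions, and it only tests whether an HTTP(S) request gets any response at all.

```python
import urllib.error
import urllib.request


def can_reach(url, timeout=5.0):
    """Return True if an HTTP(S) request to `url` gets any response at all."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, even if with an error status, so the network path works.
        return True
    except OSError:
        # URLError, DNS failures, refused connections, and timeouts all end up here.
        return False


# Example usage on the instance (performs a real network request):
# print(can_reach("https://aws.amazon.com"))
```

If this returns False from inside a private subnet, the NAT device or the route to it is the first thing to check.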
All right, so in the cfn-signal failure YAML file, it looks like the exact same thing as before, but what makes it fail is that my command, hello, will say boom and then exit with error code 1. When you do exit 1, that is a bad exit code. The good exit code is 0, and 1 is a bad one. So this will basically trigger a cfn-init failure, and so we will signal a failure to CloudFormation. So let’s go see what happens in that case. We go to our stacks, we create a stack, and we upload our failure template. So it’s number six.
Here we go. And click on Next. Now I’ll just call it CFN signal failure example, and the SSH key will still be the same. Then Next, and all the way to the bottom, Create stack. All right, so now we are going to have to wait just a little bit, and I’ll pause the video for the EC2 instance to get created and to report a failure. As we can see, my instance is now in CREATE_IN_PROGRESS. So if I go to my EC2 Management Console, I should be able to see that, yes, my instance is running. So now cfn-init should be getting called, and we should wait for it to happen. And my sample wait condition is again still in progress, so it’s waiting for a signal, basically. So now I’ll just refresh and wait a little bit.
My instance is now created, so cfn-init should be running within a minute. And here we go. We get a "Failed to receive 1 resource signal within the specified duration". So I think that was even a timeout for this one, not even cfn-signal pinging it. So we failed to receive a signal within the specified duration. So it’s CREATE_FAILED for the sample wait condition, and as a result, my entire stack, including the instance, is going to roll back. Rollback means that my instance is going to be deleted, and so will my sample wait condition. So now if I go to my EC2 Management Console, my instance is shutting down.
And so the cool thing is that, yes indeed, if cfn-signal doesn’t work, or if cfn-init doesn’t work, or whatever, then obviously the wait condition fails, and we’re happy, and the stack rolls back. But now the little problem is that I can’t debug on the instance why it failed because, as you can see, my instance is shutting down. So that’s a bit of a problem for me, isn’t it? Because I can’t SSH into it. We’ll see in the next lecture how to do this. But the cool thing is that now we get a sample wait condition that fails, and in this case we already know why no resource signal was received: there was an error in our script. So I will see you in the next lecture.
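The exit-code behavior behind this failure can be reproduced locally. The following is a small Python sketch under stated assumptions: the helper name `run_shell` is hypothetical, and the command string only mimics the shape of the demo command (print something, then exit 1), since the template itself is not reproduced here.

```python
import subprocess


def run_shell(command):
    """Run `command` through sh and return (exit_code, stdout)."""
    result = subprocess.run(
        ["sh", "-c", command],
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout


# Same shape as the failing command from the demo: print something, then exit 1.
exit_code, output = run_shell("echo boom; exit 1")
failed = exit_code != 0  # any non-zero exit code counts as a failure

print(output.strip(), exit_code, failed)
```

A command that ends with exit 0 would report success instead, which is the difference between the working template and the failing one.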
- CloudFormation Rollbacks
So let’s quickly talk about CloudFormation rollbacks. It’s very important to know how they work in case they appear at the exam. So if the stack creation fails, so if you upload a template and the stack creation fails, by default everything will roll back. That means everything will get deleted. And we can look at the log to understand what happened. But when you create the stack, you also have the option to disable the rollback in order to troubleshoot what happened and get a bit more insight into what was created. Now say you update a stack. So it was already created and it was successful, and now you update it. If the update fails, the stack will automatically roll back to the previous known working state, which is the green state you had before the update. And you get the ability to see what happened in the log, thanks to the error messages. So this lecture is all about showing you how rollbacks work.
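The rollback option described above is an input to the stack creation request. The following is a minimal Python sketch, not the lecture’s own code: the helper `build_create_stack_args` is hypothetical, the stack name and the empty template body are placeholders, and the dictionary keys mirror the `StackName`, `TemplateBody`, and `DisableRollback` parameters of boto3’s `create_stack`. The actual API call is left out so the sketch runs without AWS credentials.

```python
def build_create_stack_args(stack_name, template_body, rollback_on_failure=True):
    """Build keyword arguments shaped like those of boto3's create_stack call."""
    return {
        "StackName": stack_name,
        "TemplateBody": template_body,
        # DisableRollback=True keeps failed resources around for troubleshooting.
        "DisableRollback": not rollback_on_failure,
    }


# Default behavior: a failed creation rolls everything back.
default_args = build_create_stack_args("failure-demo", "{}")

# Troubleshooting mode: keep whatever was created before the failure.
debug_args = build_create_stack_args("failure-demo", "{}", rollback_on_failure=False)

print(default_args["DisableRollback"], debug_args["DisableRollback"])
```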
But first, let’s go and create a stack. We’ll use a template that we’ll upload, and we’ll first use the number 0, just EC2, template to get started with a basis that we know works. I’ll call the stack failure demo. I will click on Next, and then Next, and finally I will create my stack. So this will create an EC2 instance, and then I will wait for the stack to reach the CREATE_COMPLETE state. So my stack has now been created, and I’m going to update it. I’m going to replace the current template, upload a new template, and this time I will choose the trigger failure YAML. So this file should trigger a failure, and I will show you why in a second.
So I will just say hello world as my security group description. Click on Next, and then click on Next, and finally click on Update stack. So first we see the change set, and as we’ll see in a second, the change set is exactly the same as before. So there is an Elastic IP, the security groups, and the EC2 instance, which should be replaced. But I’ve included something in my template that will trigger a failure. So we are in UPDATE_IN_PROGRESS, but let’s go see in the template what the problem is. So if I go to the number 2, trigger failure, YAML file and I scroll down, here in my ImageId I say the AMI is 123456, and this AMI does not exist in this region, us-east-1, and therefore it will throw a CloudFormation error. So what I want to show you is what happens in CloudFormation when this error happens.
So if we refresh this and have a look at the events, the update is in progress. The SSH security groups were created, so these were new security groups. Then CloudFormation tried to update my instance, but it failed, and it gave me the reason that the AMI ID 123456 does not exist because it was not found. And that makes sense. So therefore there was an UPDATE_ROLLBACK_IN_PROGRESS, and my EC2 instance has been rolled back.
Now if I refresh this again, we see more states, so let’s go and see what happens. So we are in UPDATE_ROLLBACK_IN_PROGRESS, and then anything that was created before, so the security groups that were created during this update, is also being rolled back. As we can see, the SSH security group and the server security group are in DELETE_IN_PROGRESS, then they’re in DELETE_COMPLETE, and then the stack is in UPDATE_ROLLBACK_COMPLETE. So when a rollback happens during an update, if there is an error anywhere, anything created during the update is removed and everything else goes back to where it was before. So it’s good to know.
Now let’s try to create a stack. This time I will create a stack with new resources, I will upload a template, and I will right away use the trigger failure template. So I will just call it failure demo two, and here I will just say hello. I will click on Next, and so this will trigger a failure right away. And if I go into the advanced options, not the rollback configuration, sorry, the stack creation options, I will see that there is a rollback on failure option, and right now it is enabled. That means that if there is a creation failure, the entire stack will be rolled back. But if we want to troubleshoot it, we should set it to disabled, and then the stack will stay as it is and we can troubleshoot based on the resources that have already been created.
So if we set it to disabled, nothing will be rolled back, but if we leave it as enabled, everything will be rolled back. So we’ll just create this one, we’ll just say enabled, click on Next, and then click on Create stack. This goes ahead and creates my failure demo two, which should right away give me an error. So I should wait a little bit. My security groups are being created, the create is in progress, but very soon I get CREATE_FAILED on my instance, and it says this AMI ID does not exist, and therefore my security groups are going to be rolled back. So let’s have a look.
My instance is being deleted, and right after, my security groups are also being deleted. So now my stack is in the ROLLBACK_COMPLETE state, and if I look at the resources, there is nothing. Everything has been deleted, and so I cannot troubleshoot. If I wanted to troubleshoot, I would have to change the setting I just showed from before and set rollback on failure to disabled. So now the cool thing we can see is that we have two stacks: one is in UPDATE_ROLLBACK_COMPLETE and one is in ROLLBACK_COMPLETE. The stack whose creation failed, the one in ROLLBACK_COMPLETE, we cannot update.
The only thing we can do is delete it. So let’s go and delete it. Whereas the one that is in UPDATE_ROLLBACK_COMPLETE is still working. So I could go ahead and update it if I had a fixed CloudFormation template, or I can go ahead and delete it, and this is what I will be doing. So that’s it for this lecture. I hope you liked it and I will see you in the next lecture.
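The difference between the two end states above can be captured in a small lookup. This is a Python sketch for study purposes only: the `allowed_actions` helper is hypothetical, and it models just the two statuses discussed in this lecture rather than the full list of CloudFormation stack statuses.

```python
# Only the two end states discussed in the lecture are modelled here.
ALLOWED_ACTIONS = {
    # Creation failed and was rolled back: the stack can only be deleted.
    "ROLLBACK_COMPLETE": ("delete",),
    # An update failed and was rolled back: the stack is still working.
    "UPDATE_ROLLBACK_COMPLETE": ("update", "delete"),
}


def allowed_actions(stack_status):
    """Return the actions available for a stack in the given status."""
    try:
        return ALLOWED_ACTIONS[stack_status]
    except KeyError:
        raise ValueError(f"status not covered by this sketch: {stack_status}")


print(allowed_actions("ROLLBACK_COMPLETE"))
print(allowed_actions("UPDATE_ROLLBACK_COMPLETE"))
```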
- CloudFormation Nested Stacks
So let’s talk about CloudFormation nested stacks, because these can come up at the exam. Nested stacks are stacks inside other stacks, hence the name nested. They basically allow you to isolate a pattern that you’re going to repeat, or maybe common components, into separate stacks and just call them from other stacks. So if you know, for example, how to configure a load balancer really well and you want to reuse that configuration all across the board for your company, then instead of copying and pasting, you would isolate that load balancer into a nested stack and reference that from within your stack.
Maybe it’s a security group. Maybe it’s an SSH security group that you want to have a specific configuration for, and you want to isolate this into a nested stack. Nested stacks are considered best practice, and we’ll see how to use them in a second. And by the way, if you want to update a nested stack, then you always update the parent, the root stack, first, and everything will get updated from there. So that’s the idea. There can be very small questions about them at the exam. So let’s go have a look at how it works in practice. Let’s create a stack, and this one will be a template file, and that will be the number 7, nested stack, YAML. Before we do this, let’s have a look at what’s inside of that file, obviously. So in this one we have an SSH key, which is the key pair we’re going to use to SSH into our instance. And now the cool thing is that we have a CloudFormation stack as a resource type.
So we have MyStack within Resources, and its type is AWS::CloudFormation::Stack, and that is enough for you to declare a nested stack. For the properties, you have to define a TemplateURL, and so I use this template URL right here, which is the LAMP single instance template. So we could follow this link. I’ll just copy this link so you can see what it is. So let’s go have a look in there. This is just a template file, and it’s in JSON form, but that doesn’t really matter. And as you can see, this has parameters.
So there’s key name, DB name, DB user, and all these things are parameters that we have to pass. Then there is a mapping that was defined, and then at the very bottom, in the resources, it will create a web server instance of type EC2 instance, and on there it will install things using cfn-init. So here we see CloudFormation init again. It will install MySQL, httpd, PHP, so a lot more than before, and then there is the files section.
In the files section, it will basically create an entire HTML file, which is quite nice. So as you can see, CloudFormation init can be used to do some pretty crazy stuff, and it goes on and on and on, and at the very end we still have services to run and some commands to configure everything. So the cool thing is that we won’t copy and paste this stack into our stack. We’re just going to call it as a nested stack. So let’s go have a look here. We’ll just use a TemplateURL, and this will automatically pull in that stack for us, and we’ll pass in the parameters: key name, DB name, DB user, DB password, DB root password, instance type, and finally where we want to SSH from. The cool thing is that we can reference an output of this stack directly in our output value. So we can, for example, get an output from a nested stack by doing MyStack.Outputs.WebsiteURL.
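The shape of the parent template described above can be sketched as a Python dictionary and printed as JSON. This is an illustration under stated assumptions, not the lecture’s file: the `TemplateURL` value is a hypothetical placeholder, the parameter spellings (`KeyName`, `DBName`, `DBUser`) and their values are assumed from the spoken names, and only a few of the parameters are shown.

```python
import json

# Hypothetical placeholder: the real template URL from the lecture is not reproduced here.
NESTED_TEMPLATE_URL = "https://example.com/lamp-single-instance.template"

template = {
    "Parameters": {
        # Key pair used to SSH into the instance created by the nested stack.
        "SSHKey": {"Type": "AWS::EC2::KeyPair::KeyName"},
    },
    "Resources": {
        "MyStack": {
            # This resource type is what makes the stack a nested stack.
            "Type": "AWS::CloudFormation::Stack",
            "Properties": {
                "TemplateURL": NESTED_TEMPLATE_URL,
                # Values passed down to the nested template's parameters
                # (names and values here are assumptions, and the list is partial).
                "Parameters": {
                    "KeyName": {"Ref": "SSHKey"},
                    "DBName": "mydb",
                    "DBUser": "user",
                },
            },
        },
    },
    "Outputs": {
        "WebsiteURL": {
            # Surface an output of the nested stack on the parent stack.
            "Value": {"Fn::GetAtt": ["MyStack", "Outputs.WebsiteURL"]},
        },
    },
}

print(json.dumps(template, indent=2))
```

The `Fn::GetAtt` on `Outputs.WebsiteURL` is the long form of the MyStack.Outputs.WebsiteURL reference mentioned above.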
So let’s go have a look, because I think when we actually use it, it will make more sense. So here we go. We’re going to click Next, I’ll call it Nested stack example, and the SSH key will be the same as before. Click on Next and go to Create stack. So here the important thing is that within capabilities, it’s now saying that because you’re using an AWS::CloudFormation::Stack, you need to acknowledge that you might create IAM resources, and also that you need this CAPABILITY_AUTO_EXPAND capability, because CloudFormation needs to expand the nested stack we just referenced. So we’ll just tick these two boxes and click on Create stack. Here we go. So now our stack is being initiated.
So the cool thing is that now this stack itself will be created, and it’s in CREATE_IN_PROGRESS. So if we go back to the stacks, we can see that even though I uploaded only one file, Nested stack example, a new stack with a new stack name is being created from within it, and that is the whole file that I showed you before, which is right here. And this stack is, as we can see, nested. It says NESTED right here, and within it we can look at the events. This is where we see the web server security group, the web server instance, everything basically being created as we speak, and we get a description. So all these things are from within the nested template.
So now the only thing we have to do is just wait for the web server instance to configure itself. All right, so it looks like this nested stack has been completely created, because the wait condition has received a success signal again. But now if we go back to my parent stack, the main one, and we go into resources, as we can see, we get a CREATE_COMPLETE.
And the events show that as soon as my nested stack has finished being created, we get a full CREATE_COMPLETE for the parent stack as well. So the really cool thing is that if we go to outputs now, we get a URL to access our underlying EC2 instance, and that came as an output directly from the nested stack. So let’s go have a look. It says welcome to the AWS CloudFormation PHP sample, and it says it connected to localhost successfully.
So that’s pretty awesome. I mean, it’s pretty basic, but it shows how nested stacks work. And then, importantly, you never ever touch the nested stack, ever. You don’t update it directly. What you do is update the parent stack. And to delete everything, we delete the parent stack. So I’ll delete the parent stack, and automatically my nested stack will get deleted as well. So that’s about it. Okay, I hope you enjoyed it and I will see you in the next lecture.
- CloudFormation ChangeSets
All right, so let’s quickly talk about change sets. When you update a stack, sometimes you want to be sure that the stack will update the way you want it to. So maybe you want to have greater confidence. For this we use change sets, and a change set will basically let us know what will happen, but it won’t tell us whether it will work or not. So it’s not a way to QA and make sure that, yes, there won’t be any failures. Change sets don’t work like this. They just help you understand that some things will change, some things will be deleted, and some things will be created. So let’s take an example from the documentation. We have our original stack, we’re going to create a change set, and then we’re going to view that change set.
So we’re going to view what gets added, changed, or deleted. Then, optionally, we can create additional change sets if we want to refine our approach. And finally, when we’re ready, we’re going to execute that change set and get our updated CloudFormation stack. So that’s the idea. We’ve actually already used change sets without knowing it, but here we’re just going to formalize this. For this example, we’ll just use the number 0, just EC2, template and then the number 1, EC2 with SG and EIP, YAML. So let’s go ahead. First we’re going to create a stack, we’re going to upload our template, and it’s going to be our number 0, just EC2, template, just like before in the very, very beginning. I’ll call it Stack CS Demo, and there are no parameters needed.
I’ll go ahead and click on Next, and we’ll go ahead and click on Create stack. So this will go ahead and create our EC2 instance for us. All right, so our stack has now been created. If I go to Change sets on the left-hand side, as you can see, there are no change sets available right now. So we can click here and create a change set directly for Stack CS Demo. Another way we could do this is to go to Stacks, click on Stack CS Demo, then Actions, then Create change set for current stack. So, two ways of doing this. Now we can either reuse the current template and just change some parameters, or replace the current template with something new, or edit it in Designer. We’ll replace the current template with something new, and we’ll upload the new one.
The new one is this one, number 1, EC2 with SG and EIP. So now we want to know what would happen if I were to apply this update to my stack. I’ll just enter a security group description, and I’ll call it FUBAR because I want to be quick. Okay, then we click on Next, and here we go. We have specified the template, and the parameter is FUBAR for the security group description. I will go to the bottom and say Create change set. We can describe it and name it if we want to. I’ll click on Create change set, and here we go. Our change set is now created. So if we go to Change sets on the left-hand side, as we can see, it says CREATE_COMPLETE, but it hasn’t been applied yet. Now let’s have a look at the inputs. The inputs are all the parameters and the things we sent before.
The template is what we’ve just uploaded, and the JSON changes represent what is going to change. The JSON changes are quite hard to read, to be honest. They’re very, very verbose. But if we go to the Changes tab and scroll to the bottom, we can see that it is going to add a server security group and add an SSH security group. It’s going to modify my EC2 instance, and it’s actually going to replace it, so Replacement is True. So we will lose our current EC2 instance and a new one will be created, and then it will add an EIP.
And so the cool thing about this is that we can see beforehand whether we are willing to make this change. Maybe I didn’t want my EC2 instance to get replaced, so maybe I don’t want to apply this. In my case, I don’t mind, so I’ll apply it. But this is why a change set is helpful. So when you’re ready, you can either delete it or execute it. I will execute it, and basically it just triggers an update like the ones we already know. So I’ll just wait for the update to complete, but no surprises here. The cool thing, though, is that if you go back to Change sets now, it will show you the last executed change set. But there is no new change set to apply, so you have to create a new one if you want to update the stack again. So that’s it. I’ll just wait a little bit and then delete that stack, but I won’t film that. I hope you understand how change sets work now, and I will see you in the next lecture.
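The verbose JSON changes mentioned above can be condensed into a readable summary before deciding whether to execute. The following is a Python sketch under stated assumptions: the `sample_changes` list is hand-written illustrative data shaped like the `Changes` field returned when describing a change set, the logical resource names are hypothetical, and `summarize_changes` is not part of any AWS tooling.

```python
def summarize_changes(changes):
    """Return one readable line per resource change, flagging replacements."""
    lines = []
    for change in changes:
        rc = change["ResourceChange"]
        line = f'{rc["Action"]} {rc["LogicalResourceId"]} ({rc["ResourceType"]})'
        if rc.get("Replacement") == "True":
            # The existing resource would be deleted and a new one created.
            line += " [replacement]"
        lines.append(line)
    return lines


# Illustrative data mirroring the demo: two security groups added,
# the instance modified with replacement, and an Elastic IP added.
sample_changes = [
    {"ResourceChange": {"Action": "Add", "LogicalResourceId": "ServerSecurityGroup",
                        "ResourceType": "AWS::EC2::SecurityGroup"}},
    {"ResourceChange": {"Action": "Add", "LogicalResourceId": "SSHSecurityGroup",
                        "ResourceType": "AWS::EC2::SecurityGroup"}},
    {"ResourceChange": {"Action": "Modify", "LogicalResourceId": "MyInstance",
                        "ResourceType": "AWS::EC2::Instance", "Replacement": "True"}},
    {"ResourceChange": {"Action": "Add", "LogicalResourceId": "MyEIP",
                        "ResourceType": "AWS::EC2::EIP"}},
]

for line in summarize_changes(sample_changes):
    print(line)
```

A summary like this only describes what would change. As noted in the lecture, it says nothing about whether the update will actually succeed.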