Amazon AWS SysOps – Databases for SysOps
- Section Intro
So we’ve seen how to store data in S3, but what if you want to store data in a database? Well, this is the section in which we’ll look at RDS, ElastiCache, and Aurora. Now, you should already have heard about these three technologies, and I hope you know at least about RDS. So this section is dedicated to going a little bit beyond what you know, to learn about all the SysOps-related questions and tricks. So the database section will cover RDS in depth. I’ve included a few basic lectures just to refresh your memory in case it’s been a while. And then we’ll go hands-on into Multi-AZ versus read replicas. We’ll look at parameter groups, we’ll look at backups and snapshots, and we’ll look at security and the APIs that you should know about. And we’ll look at performance in depth using CloudWatch and Performance Insights. This is really important because, from a SysOps perspective, the exam has a lot of questions about the RDS metrics. Then we’ll learn about Aurora in depth: not just an introduction, we’ll do a long hands-on on it. And finally, ElastiCache is still a light topic at the exam, so we’ll just do a quick refresher on it. I hope you’re excited, and I will see you in the next lecture.
- RDS Overview
Now let’s learn about the database service of AWS called RDS. RDS stands for Relational Database Service, and it’s a managed database service, so your DB is managed by AWS, and you use SQL as the query language. Basically, it allows you to create databases in the cloud that are managed by AWS, so you don’t have to do much. The databases you can create are Postgres, MySQL, MariaDB, Oracle, Microsoft SQL Server, and Aurora. Aurora, you may have heard of it: it’s AWS’s proprietary database technology. Okay, we’ll do a deep dive into what that means. So why would you use RDS versus, I don’t know, maybe deploying your own database on EC2 with EBS? Well, it’s a managed service, so you get a lot out of it. You get OS patching, you get continuous backups and the ability to restore to a specific timestamp, you get monitoring dashboards, and you get read replicas if you want to improve the read performance.
You get Multi-AZ (multi Availability Zone) setups for disaster recovery, you get maintenance windows for upgrades, and you get scaling capability if you want to scale vertically or horizontally. The only drawback is that you can’t SSH into your RDS instances. Just understand that Amazon manages them for you, so you don’t interact with them directly. So honestly, I would not think about it twice. If I have to use a database on AWS, I will use RDS. I will not set up my own database on EC2, because of all the good things we just saw. So what I want to do now is give you a bit of a deep dive into a few features, read replicas, Multi-AZ, backups, et cetera, just to get a better understanding of how things work.
So read replicas are for read scalability; that’s basically the idea. You have your application and you have your RDS DB instance, and this is how the writes and the reads happen. When you first set up RDS, this is what you have: one application, or multiple instances of your application, and one database instance, and all the reads and all the writes go there. But after a while, maybe you realize that your application needs to read a lot from your database, and so you need to increase your read scalability. So you can create read replicas, and with this feature you can create up to five of them. These read replicas can be within the same Availability Zone, cross Availability Zone, or cross region. So you can create an RDS replica instance right here, and one right here. And it turns out that the replication of data between the master instance and the replica instances is asynchronous. Asynchronous means that the reads will be eventually consistent. That means that when you write to the master, there will be a small lag, a small delay, between when the write happens on the master and when it is replicated to the replicas.
Okay? So this is something you should know: the replication is asynchronous. The replicas themselves really help with the read capacity, so your application can now scale its reads. And the replicas, if you wanted to, could be promoted to their own database. It’s a very quick and easy way to copy a database and work with the copy on its own. Now, something you should know, and it is sometimes asked in the exam: the application needs to change its connection string to leverage the read replicas. Okay? It’s very important. Out of this setup, you will get three connection strings, one for the master and two for the replicas, and so your application needs to be aware of them. So for read replicas, remember two things. There are multiple instances, where only one, the master, takes the writes, and all the others take the reads.
And the replication is asynchronous. Okay? Now, this is something that you have to contrast with Multi-AZ, and Multi-AZ is for disaster recovery. With Multi-AZ, your application still talks to the master database instance. Maybe it will be in Availability Zone A, and it’ll take all the reads and the writes. And this time the master DB will have synchronous replication to one DB instance that is called the standby, hence the letter S, and that instance will be in AZ B, okay?
Or C, or whichever you want. And so the idea is that only one DNS name will be exposed to your application, and that DNS name will fail over automatically from the master to the standby in case there are any issues. And why would you do this? Well, it’s to increase availability. So if an instance goes down, if a whole AZ goes down, or if the network, the instance, or the storage has a failure, our application can fail over from the master to the standby, and the standby will be promoted to master.
Okay? And for this, there is no manual intervention that needs to happen in your applications. It will happen seamlessly. And so the only reason you would use Multi-AZ is disaster recovery. As you can see, our application still reads and writes to only one database. And to make sure that you can fail over seamlessly, the replication is synchronous. Okay?
So this is not something you use for scaling. This is something you use for disaster recovery. So this is really the difference between read replicas, which are used for read scaling, and Multi-AZ, which is used for disaster recovery. And you can obviously use a combination of both. Now, backups. Backups are automatically enabled in RDS; they’re automated backups. There is a daily full snapshot of the database, so every night there will be a full snapshot so that we know where it stands, and then all the transaction logs are captured in real time.
So using those transaction logs, you are able to restore your database to any point in time. And that’s quite a nice feature. All these backups are retained for seven days, but you can increase that to 35 days. And that’s what is referred to as automated backups. This is just enabled for you; you don’t have to set up anything. And the really good thing about it is that you are able to roll back to, say, 2:00 p.m. yesterday very quickly and save your applications in case of problems. Now, you can also trigger manual database snapshots; they’re manually triggered by the user, and the retention is for as long as you want. So if there is a very particular state that you need to save, or you want a long-term backup for whatever reason, you would use database snapshots. Now, the security side is super important, and the exam really loves security, so you need to be aware of all the security stuff. First, there is encryption at rest, and again we see KMS being mentioned. So there’s an encryption-at-rest capability: you have to enable it using KMS, and it’ll give you AES-256 encryption. You can also use SSL certificates to encrypt data to RDS in flight.
Okay? So that’s the contrast: at rest means on disk, in flight means over the network. Now, how to enforce SSL is quite a popular question. For PostgreSQL, there is one parameter we can set in the RDS console, in parameter groups, and it’s rds.force_ssl=1. And for MySQL, you have to run a statement within the database. It’s quite a complicated statement, but what you’ll notice is that at the end it says REQUIRE SSL. So that’s how you enforce SSL: once you run the statement, the users it applies to will not be able to connect to MySQL unless it’s over an SSL connection. And so that’s quite a popular question.
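As a rough illustration of the two enforcement mechanisms just described, here is a minimal Python sketch that only builds the settings and does not talk to AWS or to a database. The helper names are hypothetical. The dictionary mirrors the shape of one entry in the `Parameters` list of the RDS `ModifyDBParameterGroup` API, and the SQL strings are the commonly documented forms (`ALTER USER ... REQUIRE SSL` on MySQL 5.7 and later, `GRANT USAGE ... REQUIRE SSL` on older versions); treat the exact statement for a given engine version as an assumption to verify.

```python
def force_ssl_parameter(enabled=True):
    """Build one parameter-group entry that enforces SSL on RDS PostgreSQL."""
    return {
        "ParameterName": "rds.force_ssl",
        "ParameterValue": "1" if enabled else "0",
        # "pending-reboot" is accepted for both static and dynamic parameters,
        # so it is the conservative choice in this sketch.
        "ApplyMethod": "pending-reboot",
    }


def mysql_require_ssl(user, host="%", legacy=False):
    """Return SQL that makes one MySQL account connect over SSL only.

    Illustration only: `user` and `host` are interpolated directly, so pass
    trusted values, never user input.
    """
    if legacy:
        # Older MySQL versions: the longer GRANT form ending in REQUIRE SSL.
        return f"GRANT USAGE ON *.* TO '{user}'@'{host}' REQUIRE SSL;"
    return f"ALTER USER '{user}'@'{host}' REQUIRE SSL;"
```

Note that both forms act on one account at a time, which is why enforcing SSL and merely connecting with SSL are separate things to check.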
Now, if you want to connect using SSL, how do you do it? Well, you provide the SSL trust certificate, which you can download from the AWS website, and then you provide the SSL options when you connect to your database. And that’s about it. It’s quite simple, but you need to know the difference between enforcing SSL and connecting using SSL. Okay? Connecting using SSL doesn’t mean that SSL is necessarily enforced. So how to enforce SSL is something you should know and make note of.
Now, for security, RDS databases are usually deployed within a private subnet so that you don’t expose them publicly. And RDS security works by leveraging security groups, as we’ll see. Basically, security groups control who, meaning which IP or which other security group, can communicate with RDS. In terms of who can manage RDS, we’ll use IAM policies. And for connecting and logging in to the RDS database, so actually performing operations within the database, we’ll use a traditional username and password, or you can also integrate it with IAM users. This is new, for MySQL or Aurora. Okay, something you should know about.
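To make the security group idea concrete, here is a small Python sketch of how an inbound rule decides whether a client may reach the database port. It is a simplified model, not the AWS implementation: the rule dictionaries and the `is_allowed` helper are hypothetical, and it only covers CIDR-based rules, whereas real security groups can also reference other security groups.

```python
import ipaddress


def is_allowed(rules, source_ip, port):
    """Return True if any inbound rule admits `source_ip` on `port`."""
    ip = ipaddress.ip_address(source_ip)
    return any(
        rule["from_port"] <= port <= rule["to_port"]
        and ip in ipaddress.ip_network(rule["cidr"])
        for rule in rules
    )


# Hypothetical rule: only hosts inside the VPC range may reach MySQL.
rules = [{"cidr": "10.0.0.0/16", "from_port": 3306, "to_port": 3306}]
```

With these rules, an EC2 instance at 10.0.4.7 can open port 3306, while an address outside the VPC range, or any other port, is rejected because no rule matches.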
Now, talking about Aurora, just the last bit you should know about RDS. Aurora is a proprietary technology from AWS; it’s not open source, so we don’t know how it works. But Aurora is compatible with Postgres and MySQL, so they’re both supported. And the idea with Aurora is that it’s cloud optimized. Amazon claims a 5x performance improvement over MySQL, or a 3x performance improvement over Postgres. And it comes with incremental storage, so the storage grows over time from 10 GB up to 64 TB. It can have 15 replicas, while MySQL has five, and replication is faster: it’s stated as sub-10-millisecond replica lag.
And also there’s failover, so it’s Multi-AZ by default. It’s natively highly available, so your application will fail over directly. And although Aurora costs a bit more than RDS, it’s said to be more efficient, and so overall your cost should decrease with Aurora. Basically, Amazon is trying to push a technology that’s going to be a bit less painful to set up, a bit more scalable, and a bit more cloud friendly. There are not that many questions about Aurora in the exam yet, but you should still know about it. And what’s the difference between RDS and Aurora? Aurora is the proprietary technology from AWS and comes with a lot of goodies, whereas if you set up Postgres or MySQL on RDS, then you have to set up the read replicas and Multi-AZ as options if you want them. Okay, so that’s all for RDS. In the next lecture, we’ll go ahead with the labs.
- RDS Hands On
Okay, now we are going to play with RDS. So let’s go to the RDS service, and in there I am going to launch a database. We have a big Launch DB instance button here because we have not done it before. And so we get the engine options. As you can see, we have six different engines available to us, and we can choose whichever we want. If you want something that is compliant with the free tier, though, you have to scroll down and tick “Only enable options eligible for RDS free tier usage”. So first of all, I want to show you a few things. Aurora is not free tier eligible, and if you click on Aurora, you can choose between MySQL 5.6 compatible, MySQL 5.7 compatible, or PostgreSQL compatible. So it’s a decision you have to make with Aurora.
But because we’re in the free tier, we have to go with one of these. The most popular one at the exam is going to be MySQL, so I’ll just create MySQL, but Postgres, I would say, is the second most popular database in there. So I’ll go with MySQL, and MySQL is free tier eligible. So I’ll click on Next. Here I can choose the license model and the DB engine version; I’ll just leave them as is. And for the RDS free tier, it says it has to be a db.t2.micro and it has to have 20 GB of storage. So we’ll just go along with this. We’ll say the DB instance class is db.t2.micro, and as we can see, because we are in the free tier, we cannot set up a Multi-AZ deployment. But if we wanted a Multi-AZ deployment, we could do it here, and this is to provide high availability. The storage type is going to be SSD, and we are going to allocate 20 GB, just because it’s what the free tier allows us to do.
Now we have to define a DB instance identifier, and it has to be unique across all the DB instances that we own in our AWS account. So rather than “my first DB”, we’ll say “my first MySQL”, because it’s a MySQL database. And then you have to set a master username, so I’ll just say Stefan, and then the password you want. You can put whatever you want as a password; just make sure that you remember what it is. So my username is Stefan and my password is “password”. Click on Next, and then I have my virtual private cloud. Here I will say it goes into my default VPC, and for the subnet group we’ll use the default one. Now we have to set the public accessibility.
And so basically the question is: are we able to connect to our RDS instance from outside our VPC? If I were doing a production database, I would say no and only allow my EC2 instances to connect to my RDS database. But because I’m doing a demo here and I want to connect from my computer to the RDS database, I will say yes. That’s very important to notice. Now, for the Availability Zone, we can state a preference for where we want it to be, but for now we have no preference. And then for the VPC security group, we’ll just create a new one, which will come with the right rules, by the way. Now the database name: we can say whatever we want. If no database name is specified, then no initial MySQL database will be created. So we’ll just call it mydb, and the port is going to be 3306, which is standard for MySQL. And the parameter group is this one, and the option group is this one.
And we’re good. As I said before, there is now a way to authenticate using IAM within the database, but for now we’ll disable it. We’ll just use the good old username and password. Encryption is super important when you deal with RDS. Currently we’re in the free tier and we use a t2.micro, which doesn’t support encryption. But if we wanted to enable encryption, this would be encryption at rest, and as it says here, it would use KMS, the Key Management Service, and this would ensure that our database data is encrypted at rest. It’s super important, it’s just one click to enable, and it’s quite a popular theme at the exam. Now, for backups: as I said, backups are enabled by default, and here we say we want a seven-day retention for backups. But as I said, we can go all the way up to 35 days, and we can also decrease it to one day, or to zero. For the backup window, you can select a specific time if you know that you have less load on your database at that time, but for now we’ll say no preference. Finally, we could enable enhanced monitoring, but for now just leave it disabled because we don’t need it, and you can also export the logs, et cetera.
So, as you can see, there are a lot of options here, and you have to go through this list. They’re quite intuitive, each of them, and they’re well described, but for now the defaults work just fine for us. Now, as I said, because RDS is a managed service, we’re able to have maintenance windows, and Amazon can apply minor version upgrades for us. So we get security patches applied by Amazon directly onto our RDS instance; we don’t have to do anything. And for maintenance, we can define a maintenance window and give a specific time when we’d like the maintenance to happen; otherwise we say no preference.
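For reference, the console choices made so far map roughly onto the parameters of the RDS `CreateDBInstance` API. The Python sketch below only builds the request dictionary, using parameter names from the boto3 `create_db_instance` call; nothing is sent to AWS. The helper name, the `my-first-mysql` spelling of the identifier, and the `gp2` storage type are assumptions for illustration.

```python
def free_tier_mysql_request(identifier, username, password, db_name,
                            backup_retention_days=7):
    """Build create_db_instance keyword arguments matching the demo settings."""
    # Automated backups can be kept between 0 and 35 days (0 disables them).
    if not 0 <= backup_retention_days <= 35:
        raise ValueError("backup retention must be between 0 and 35 days")
    return {
        "DBInstanceIdentifier": identifier,
        "Engine": "mysql",
        "DBInstanceClass": "db.t2.micro",   # free tier instance class
        "AllocatedStorage": 20,             # GB, the free tier allowance
        "StorageType": "gp2",               # assumed general purpose SSD type
        "MasterUsername": username,
        "MasterUserPassword": password,
        "DBName": db_name,                  # initial database to create
        "Port": 3306,
        "MultiAZ": False,                   # not available in the free tier
        "PubliclyAccessible": True,         # demo only; use False in production
        "StorageEncrypted": False,          # not supported on t2.micro
        "BackupRetentionPeriod": backup_retention_days,
        "AutoMinorVersionUpgrade": True,    # let AWS apply minor upgrades
    }


request = free_tier_mysql_request(
    identifier="my-first-mysql",            # assumed spelling of the demo name
    username="stefan",
    password="change-me",                   # placeholder, never hardcode secrets
    db_name="mydb",
)
```

With credentials configured, a dictionary like this could be passed as `boto3.client("rds").create_db_instance(**request)`, which is the programmatic equivalent of clicking through the wizard.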
When we’re ready, we click on Create database, and our database starts being created. As you can see, it says your instance may take a few minutes to launch. And that’s true, it usually takes quite a long time. So what I’ll do is stop the video and come back when the database has been created. So while my instance is still being created, I want to show you a little SQL program I like, which is called Sqlectron. Sqlectron is a DB client for your database, and it works on Linux, Mac, and Windows; that’s why I like it. And it gives you a GUI, a graphical user interface, to connect to your database. So I invite you to download the GUI and install it on your computer. Basically, you go to this GitHub page, and if you’re on Mac, you use the DMG; if you’re on Windows, you use the Windows package; and if you’re on Linux, you can use the sh package, or the deb or rpm.
So once you’ve downloaded and installed it, I open Sqlectron and click on Add, and in here I can add a connection so I can connect to my RDS database. I just give it a name, and since I created a MySQL database, I now need to put in the server address, the port, and so on. For this, we can go back to our RDS page, and now we can see that the instance is backing up, so it’s been created. And if we scroll all the way down, we can see a Connect block with the endpoint. The endpoint is basically the URL I need to use to connect to my database. The port is 3306, and it’s publicly accessible, so I should be able to access it from my computer. So I copy the endpoint and put it in the server address right here, and then the username is going to be Stefan and the password is going to be “password”. So it looks good. We could enable SSL if we wanted a secure connection, but for now we’re good. So we just click on Test, and the connection test says it successfully connected. Happy days. I save it and I connect to my database.
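If the connection test in a client fails, a quick way to tell a network problem (security group, public accessibility) from a credentials problem is to check whether the endpoint accepts TCP connections at all. Here is a minimal standard-library sketch; the `can_reach` helper is hypothetical, and it only checks reachability of the port without speaking the MySQL protocol or verifying the username and password.

```python
import socket


def can_reach(host, port=3306, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

You would call it with the endpoint copied from the Connect block, for example `can_reach("<your-endpoint>", 3306)`. A `False` result usually points at the security group or the public accessibility setting rather than at the credentials.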
And here I am, in my database, and we can see mydb on the left-hand side. This is the initial database that was created for us. You don’t need to be a SQL expert here, and this is not a course about SQL anyway. As soon as we’re connected, we know everything worked, and that’s about it for the hands-on. Okay. What you should know about is all the connection settings and options you had when creating this RDS database, and the fact that a security group was created for your instance. You can click on it to go and see the details of your security group, and we can see that it authorizes my IP on port 3306, which is good. And you can see all the network and security configuration, whether you have backups, whether you’re Multi-AZ, et cetera. So this is quite nice.
The last thing you can do is instance actions, and as you can see on the right-hand side, we could stop it, reboot it, or create a read replica. So if we wanted to create a read replica, we could just click here and have a second database that we can connect to just for reads. We can also take a snapshot, to take a backup ourselves, restore to a point in time using the backups, and migrate a snapshot if we wanted to move it into another region. So I hope that was helpful. That was a quick overview of RDS, but we were able to create a MySQL database and connect to it using Sqlectron, and this database is now available for our applications to use if they need one. Okay, so that was it. Hope you enjoyed it, and I will see you in the next lecture.
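The instance actions shown in the console each correspond to an RDS API call. As a hedged sketch, the Python helpers below build the keyword arguments for three of them, using the parameter names of the boto3 calls `create_db_instance_read_replica`, `create_db_snapshot`, and `restore_db_instance_to_point_in_time`. The helper names and instance identifiers are hypothetical, nothing is sent to AWS, and the retention check is a simplification of the real restorable window.

```python
from datetime import datetime, timedelta, timezone


def read_replica_request(source_id, replica_id):
    """Arguments for creating a read replica of an existing instance."""
    return {"SourceDBInstanceIdentifier": source_id,
            "DBInstanceIdentifier": replica_id}


def snapshot_request(instance_id, snapshot_id):
    """Arguments for taking a manual snapshot, kept until you delete it."""
    return {"DBInstanceIdentifier": instance_id,
            "DBSnapshotIdentifier": snapshot_id}


def point_in_time_request(source_id, target_id, restore_time, now,
                          retention_days=7):
    """Arguments for a point-in-time restore, which creates a new instance."""
    earliest = now - timedelta(days=retention_days)
    if not earliest <= restore_time <= now:
        raise ValueError("restore time is outside the backup retention window")
    return {"SourceDBInstanceIdentifier": source_id,
            "TargetDBInstanceIdentifier": target_id,
            "RestoreTime": restore_time}
```

For example, restoring to "2:00 p.m. yesterday" is valid with the default seven-day retention, while asking for a time ten days back raises an error unless the retention was increased.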
- RDS Multi AZ vs Read Replicas
All right, a lecture to outline the differences between Multi-AZ and read replicas. In the exam, you’re going to have so many questions asking you about the difference between Multi-AZ and read replicas, and in which case you should use one or the other, so you should know it by heart. Obviously, I’m going to repeat myself, as this is what I said in the previous lectures, but it is super important for me to repeat it because I really want you to know and master the difference between the two.
And you’ll see, they’re very different. So Multi-AZ: we already know how that works. Our application reads from and writes to the master database, which lives in, say, AZ A. And this master database does synchronous replication to a standby instance in another AZ. We cannot read from or write to the standby at all. It is just there for failover in case there is any problem with our master database. And when our application interfaces with the database, it talks to one DNS name. The thing is, if one day a failover happens, we will still talk to the same DNS name, and our application will reach the other database. The standby will be promoted to master.
So Multi-AZ is not for scaling reads; it is just for increasing fault tolerance. Now, in depth: as I said, it’s not used to support the reads. And you need to know when it fails over. It fails over only if the primary DB instance fails, if there is an AZ outage, if the DB instance type is changed, or if some kind of maintenance is happening. That maintenance can be the OS undergoing software patching. It also fails over when there is a manual failover initiated through a reboot: when you do that reboot, the primary DB instance is not usable, and so a failover happens right away. There won’t be any failover for long-running queries, deadlocks, or database corruption errors, so watch out for those. And what you need to know is that the endpoint, as I said, the DNS endpoint, the URL, does not change after failover.
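The failover conditions above are a classic exam list, so here they are restated as a small lookup in Python. The event labels are hypothetical names chosen for this sketch, not identifiers from any AWS API; the classification itself follows the lecture.

```python
# Events that cause a Multi-AZ failover to the standby, per the lecture.
FAILOVER_EVENTS = {
    "primary_instance_failure",
    "az_outage",
    "instance_type_change",
    "os_patching",
    "reboot_with_failover",
}

# Problems that Multi-AZ does NOT solve: the standby is not promoted.
NO_FAILOVER_EVENTS = {
    "long_running_query",
    "deadlock",
    "database_corruption",
}


def triggers_failover(event):
    """Return True if the event promotes the standby, False if it does not."""
    if event in FAILOVER_EVENTS:
        return True
    if event in NO_FAILOVER_EVENTS:
        return False
    raise ValueError(f"unknown event: {event!r}")
```

The second set is the one to memorize: those are application-level or data-level problems, and a synchronous standby would simply replicate them.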
The endpoint stays exactly the same. And so your applications should not need any change; they should just have some logic to reconnect to the database if they lose connectivity. Now, why would we use Multi-AZ on top of fault tolerance? Well, there are some questions that say: I want to lower the maintenance impact, so I want to make sure that whenever some maintenance happens, it takes less time. And the idea is that you can use Multi-AZ to reduce that maintenance impact, because if Amazon does some maintenance on your database, it will do it on the standby first. And when the standby has been upgraded, it will then be promoted to master, and so you have much lower downtime than if you have a normal database without Multi-AZ. Also, backups: in non-Multi-AZ mode, backups are taken on the main database.
But if you have Multi-AZ enabled, then the backups are taken from the standby, and because nothing talks to the standby, this won’t impact the performance of the master database.
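Since the DNS endpoint does not change on failover, the only thing the application needs is the reconnect logic mentioned earlier. Below is a minimal Python sketch of that idea with exponential backoff. The `with_reconnect` helper and its `connect` callable are hypothetical; in a real application `connect` would wrap your database driver’s connect call against the same endpoint, and the driver’s own exception types would be caught.

```python
import time


def with_reconnect(connect, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call `connect` until it succeeds, waiting longer after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError as err:
            last_error = err
            if attempt < attempts - 1:
                # Back off: 0.5s, 1s, 2s, ... while the standby is promoted.
                sleep(base_delay * 2 ** attempt)
    raise last_error
```

The `sleep` parameter is injectable so the behavior can be checked without actually waiting; in production you would leave it at its default.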
And then finally, very good to know: Multi-AZ is only within a single region, not cross region. So if an entire AWS region goes down, then obviously it will impact your availability. So Multi-AZ is something you need to master, and all of these points should make sense to you, especially the last three: lower maintenance impact, backups from the standby, and single region only. Basically, Multi-AZ is used here to make sure that the maintenance and backup work does not happen on the main database and does not impact performance, but in no way is it used to improve read performance. Now, for improving read performance, we all know there are read replicas, and we’ve seen that diagram before.
Read replicas mean that your application still sends the writes to a single master database instance, but the reads can go to any database, whether that’s the master or a read replica. What happens, though, is that the master-to-replica replication is asynchronous. That means that it will happen eventually, but we don’t know exactly when. It’s usually very quick, but there can still be some consistency issues between the replicas and the master. And so the idea here is that our application, using read replicas, can scale the reads tremendously. So we know this, and you need to know that diagram, obviously. So read replicas help scale read traffic, and you can promote a read replica to be its own standalone database.
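To see what "asynchronous" and "eventually consistent" mean in practice, here is a toy in-memory model in Python. It is purely illustrative and says nothing about how RDS replicates internally: the `Master` and `Replica` classes are hypothetical, with a single replica draining a shared change log.

```python
from collections import deque


class Master:
    """Accepts writes immediately and queues them for later replication."""

    def __init__(self):
        self.data = {}
        self.log = deque()

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))


class Replica:
    """Serves reads from its own copy, which may lag behind the master."""

    def __init__(self):
        self.data = {}

    def apply(self, log):
        # Replay pending changes in order; until this runs, reads are stale.
        while log:
            key, value = log.popleft()
            self.data[key] = value
```

Right after `master.write("x", 1)`, a read of `"x"` on the replica still finds nothing; only once `replica.apply(master.log)` has run does the replica return the new value. That window is the replica lag.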
So if you had a question along the lines of: hey, we want our own database that has a copy of the master database, and we want it to have its own lifecycle, then you could use a read replica for this. You can also create a replica within the same AZ, cross AZ, or even cross region, so it can be used as a DR mechanism across regions. Each read replica will have its own DNS endpoint, and so to scale or to load balance, you need to make sure that your applications are aware of all the read replica DNS endpoints. And you can create read replicas of read replicas. And to make it even more complicated, read replicas can be Multi-AZ. So the idea with read replicas is that they’re just duplicates, clones of your main database.
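Because each replica has its own DNS endpoint, the application has to decide where to send each statement. One simple way to do that is sketched below in Python: writes go to the master and reads rotate over the replicas. The `EndpointRouter` class and the endpoint names are hypothetical, and classifying statements by a leading `SELECT` is a deliberate simplification.

```python
from itertools import cycle


class EndpointRouter:
    """Send writes to the master endpoint and spread reads over the replicas."""

    def __init__(self, master, replicas):
        self.master = master
        # Fall back to the master for reads when no replica exists.
        self._readers = cycle(replicas or [master])

    def endpoint_for(self, sql):
        """Return the endpoint a statement should be sent to."""
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._readers) if is_read else self.master


router = EndpointRouter(
    master="mydb.example.internal",                # hypothetical endpoints
    replicas=["mydb-replica-1.example.internal",
              "mydb-replica-2.example.internal"],
)
```

Keep in mind that a read routed to a replica may be slightly stale, so reads that must see the latest write are usually sent to the master instead.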
The read replicas can help with DR, or disaster recovery, but, as I said, only if you’re using cross-region read replicas, because then you’ll have an asynchronous copy of your data in another region. And good to know: read replicas are not supported for Oracle. And if a question at the exam asks: hey, we want to run BI and analytics reports on our production data, but we don’t want to impact our production application, how should we do it? Well, you can make a read replica and run your BI and analytics reports and applications on top of it, and you’re for sure not going to impact your main database. So for the exam you absolutely need to know the difference between Multi-AZ and read replicas by heart. I hope I made this clear enough, so that’s it for this lecture. I will see you in the next one. Bye.
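As a compact recap of this lecture, the mapping below pairs typical exam-style goals with the feature discussed for each. It is a study aid written as Python, not an AWS API: the goal labels and the `recommend` helper are hypothetical names.

```python
RECOMMENDATION = {
    # Goals served by read replicas (asynchronous replication, own endpoints).
    "scale_reads": "read replicas",
    "run_analytics_without_impacting_production": "read replicas",
    "cross_region_disaster_recovery": "cross-region read replicas",
    # Goals served by Multi-AZ (synchronous standby, single DNS name).
    "survive_az_outage": "Multi-AZ",
    "reduce_maintenance_downtime": "Multi-AZ",
    "offload_backups_from_master": "Multi-AZ",
}


def recommend(goal):
    """Return the RDS feature the lecture associates with a goal."""
    return RECOMMENDATION[goal]
```

Nothing on the Multi-AZ side improves read performance, and only the cross-region replica option helps when a whole region is lost.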