Amazon AWS Certified Data Analytics Specialty – Domain 6: Security Part 2
- Cloud HSM Overview
Let’s talk about another way to perform encryption in the cloud, one that is different from KMS. This one is called CloudHSM. With KMS, AWS manages the software for the encryption. But with CloudHSM, AWS just provisions the encryption hardware, and you have to use your own client to perform the encryption. So the HSM is hardware dedicated to you. HSM stands for Hardware Security Module. The idea is that AWS gives you a physical device within its data centers, and you manage all the rest. You manage your own encryption keys entirely, not AWS, and AWS does not have access to your keys and cannot even recover them. The HSM device is tamper resistant, so no one from the AWS team can touch your device.
And it has a very high level of compliance. The HSM clusters can be spread across multiple Availability Zones, so it’s highly available, but you must set this up yourself. It supports both symmetric and asymmetric encryption, which can be really helpful if you want to generate some SSL or TLS keys, a feature KMS does not offer. And to use CloudHSM, you must use your own CloudHSM client software. As an example of services that integrate with CloudHSM: Redshift has support for CloudHSM for database encryption and key management.
CloudHSM is also a really good idea if you choose to use SSE-C as an encryption mechanism for S3, because in that case the key management will be within your HSM. So, just to get an idea of CloudHSM, this is a diagram. As you can see here, AWS only manages the hardware, and AWS cannot recover your keys if you lose your credentials. So really, if you decide to go with CloudHSM, you need to make sure that you manage your keys and have a very strong process in place to not lose those keys. Then your CloudHSM client will have a connection to your HSM device to read, write and generate keys. So what are the IAM permissions then?
How are they helpful? With IAM, you can create, read, update and delete an HSM cluster, but you cannot manage the keys within it. To manage the keys, you need to use the CloudHSM client software, which is not within the console; it allows you to manage the keys and the users, but that is not regulated by IAM. It is really up to you to set up your own security within your CloudHSM. So I hope that helps, and I will see you in the next lecture.
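To make that IAM split concrete, here is a minimal sketch of an IAM policy that only covers managing the cluster itself, not the keys inside it (the action list is illustrative, not exhaustive):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ManageHsmClusterOnly",
      "Effect": "Allow",
      "Action": [
        "cloudhsm:CreateCluster",
        "cloudhsm:DescribeClusters",
        "cloudhsm:DeleteCluster"
      ],
      "Resource": "*"
    }
  ]
}
```

Note that no `cloudhsm:*` action grants access to key material; key and user management happens only through the CloudHSM client software connected to the device.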
- AWS Services Security Deep Dive (1/3)
So let’s do a review of security in all the AWS services that we’ve seen so far. This is going to be long and boring, but trust me, it’s pretty much needed for the exam. So let’s get started. First, Kinesis. Kinesis is comprised of Kinesis Data Streams, and Kinesis Data Streams has SSL endpoints, so we can use the HTTPS protocol and do encryption in flight. That means sending the data to Kinesis securely. There is also KMS integration to provide server-side encryption, and that gives us encryption at rest. And on top of it, we can do client-side encryption, but we must use our own encryption libraries.
There’s no support for it in the Kinesis service; we need to use the KPL, the producer library, to do so, but we need to provide the encryption on our own. There is a supported VPC interface endpoint, which means that we can access the Kinesis service privately from our private EC2 instances. You can use the KCL to read from Kinesis. But remember, if you do use the KCL to read from Kinesis streams, then you must also grant read and write access to a DynamoDB table. Why? Because the KCL will use that DynamoDB table to do checkpointing and to share the work between the different KCL instances. So remember that. Now, for Kinesis Data Firehose, we attach IAM roles so we can deliver data to S3, Elasticsearch, Redshift and Splunk.
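As an illustration of the Firehose IAM role idea, here is a minimal sketch of a permissions policy a delivery role might carry for the S3 destination (the bucket name is a placeholder, and a real role would also need a trust policy allowing `firehose.amazonaws.com` to assume it):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowFirehoseS3Delivery",
      "Effect": "Allow",
      "Action": [
        "s3:AbortMultipartUpload",
        "s3:GetBucketLocation",
        "s3:ListBucket",
        "s3:PutObject"
      ],
      "Resource": [
        "arn:aws:s3:::my-delivery-bucket",
        "arn:aws:s3:::my-delivery-bucket/*"
      ]
    }
  ]
}
```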
Hopefully you remember these four destinations. And on top of it, the entire delivery stream can be encrypted using KMS, that is, server-side encryption. There is also support for VPC endpoints and PrivateLink to access Firehose privately. Finally, for Kinesis Data Analytics, we can attach an IAM role to it so it can read from the Kinesis data streams we need, reference the sources (maybe for reference data), and write to an output destination. For example, the output destination may be either a Kinesis stream or a Kinesis Firehose. So that is it for all the security on Kinesis. Let’s get to the next technology. The next technology is SQS. SQS will give us encryption in flight using the HTTPS endpoint, so we can transfer data to SQS securely.
We will also have server-side encryption using KMS, and you must set an IAM policy to allow the usage of KMS with SQS. There’s also a second level of security we can set using SQS queue access policies, ensuring that our users do have access to the SQS service. This is very similar to, say, an S3 bucket policy. If you want to do client-side encryption, as usual you must do it manually, yourself; this is not something that’s supported directly by the service. And you get a VPC interface endpoint if you want to access the SQS service privately. All right, next is AWS IoT. IoT has many different layers of security. I don’t know if you remember, but the first one is policies. Basically, we are going to create X.509 certificates and load them onto our devices as identities, and using the IoT policies, we can control everything.
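To make the IoT policy idea concrete, here is a minimal sketch of an IoT policy attached to a device certificate, allowing it only to connect and publish to its own topics (region, account ID and topic names are placeholders):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "iot:Connect",
      "Resource": "arn:aws:iot:us-east-1:123456789012:client/${iot:Connection.Thing.ThingName}"
    },
    {
      "Effect": "Allow",
      "Action": "iot:Publish",
      "Resource": "arn:aws:iot:us-east-1:123456789012:topic/sensors/*"
    }
  ]
}
```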
We’re able to revoke any device at any time. And the IoT policies are JSON documents, just like IAM policies, and they can be attached to groups instead of individual things. So we can basically group the things together and manage group policies instead of per-thing policies. So that was for thing security. But then we have IAM policies to control the access of the users, groups or roles within the IoT service. So these are IAM policies attached to users, groups or roles, as I said. And you can basically use those to control access to the IoT APIs at a high level, for example, creating a rule or a rule action, that kind of stuff. And then you attach roles to the rules engine so that it can perform its actions.
So if your rules engine action is sending data to Kinesis, you need to attach an IAM role to that rule so it can perform its action and actually send the data to Kinesis. Okay, now, S3. We’ve seen the security of S3 so many times, but one more time: we get IAM policies, S3 bucket policies, access control lists, encryption in flight using HTTPS, and encryption at rest of many different kinds. We get server-side encryption: SSE-S3, SSE-KMS and SSE-C. We get client-side encryption, such as with the Amazon S3 Encryption Client. And then we get versioning and MFA Delete, basically to make sure the data doesn’t get deleted by mistake. We can use CORS for protecting websites and making sure only a few websites get access to our S3 bucket. We get a VPC endpoint that is provided through a gateway endpoint, basically to access S3 securely from a private subnet. And we have Glacier.
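As one concrete example of the S3 controls just listed, a bucket policy can enforce encryption in flight by denying any request that does not come over HTTPS (the bucket name is a placeholder):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyInsecureTransport",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::my-bucket",
        "arn:aws:s3:::my-bucket/*"
      ],
      "Condition": {
        "Bool": { "aws:SecureTransport": "false" }
      }
    }
  ]
}
```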
I just included Glacier in there; it has a bunch of features, including these things called Vault Lock policies that are very helpful if you want to prevent deletes, for example, for regulatory reasons. That’s also called a WORM policy: write once, read many. Then we have DynamoDB. DynamoDB data will be encrypted in transit using TLS, so HTTPS, and DynamoDB can be encrypted at rest using KMS, basically for base tables and secondary indexes. But that’s only for new tables. Okay? If you have an unencrypted table, you need to create a new table and then copy the data to it. So this is not something you can enable in place; you have to basically migrate the table from unencrypted to encrypted to make it work. Encryption cannot be disabled once you enable it.
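The encryption-at-rest choice just described is made at table creation time. As a sketch, the DynamoDB CreateTable API accepts an SSE specification fragment like this (the key alias is a placeholder):

```json
{
  "SSESpecification": {
    "Enabled": true,
    "SSEType": "KMS",
    "KMSMasterKeyId": "alias/my-dynamodb-key"
  }
}
```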
So this is a setting that you set when you create a table, basically. Then you can control access to the tables, the API, or a DynamoDB DAX cluster using IAM policies. As for DynamoDB Streams, currently they do not support encryption, but I’m sure they will in the near future. Finally, there’s a VPC endpoint provided for DynamoDB using a gateway, to allow your EC2 instances (or whatever is in your private subnets) to access DynamoDB directly. All right, see you in the next part for more technology overviews on security.
- AWS Services Security Deep Dive (2/3)
Next up is RDS. So what do we get with RDS? Well, RDS databases are deployed within your VPC, and that provides you with network isolation. Then you can use security groups attached to your RDS instances, and those control access to the database instances on specific ports from specific IP ranges (CIDR blocks) or other security groups. Also, you can use KMS to provide encryption at rest of the data in RDS, and you can use SSL for your JDBC connections to provide encryption in flight when you talk to your RDS database. IAM policies do not provide protection from within the database; they only provide protection for the RDS API itself. And IAM authentication is a means of authentication that’s supported by only two engines, PostgreSQL and MySQL. So you must manage user permissions within the database itself, which doesn’t make it as good as DynamoDB in terms of integration with IAM.
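For the IAM authentication option mentioned above, here is a minimal sketch of the IAM policy that lets a principal obtain a database auth token for one database user (region, account ID, resource ID and user name are placeholders):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "rds-db:connect",
      "Resource": "arn:aws:rds-db:us-east-1:123456789012:dbuser:db-ABCDEFGHIJKL/my_iam_user"
    }
  ]
}
```

Even with this policy, what `my_iam_user` can do inside the database (which tables, which statements) is still granted within PostgreSQL or MySQL itself, not through IAM.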
So once your users are created, either using IAM authentication or just as manual users through the PostgreSQL or MySQL APIs, then within your database you’re supposed to say, this user has access to this table. But this is not something you manage through IAM; this is something that’s managed from within the database technology. There’s also this little-known fact that Microsoft SQL Server and Oracle support something called TDE, Transparent Data Encryption, and it can be enabled on top of KMS to add yet another layer of encryption on your data. All right, so that’s it for RDS. Hope that makes sense for you. Now for Aurora: well, it’s actually very similar to RDS. We get a VPC for network isolation, security groups to control who has access to our database, KMS for encryption at rest, and SSL for encryption in flight.
We get IAM authentication for PostgreSQL and MySQL, and we must manage the user permissions within the database itself. Remember, Aurora does not support engines such as Oracle or Microsoft SQL Server; right now, it’s only PostgreSQL and MySQL in terms of compatible APIs. All right, leaving the databases, let’s go into Lambda. So, Lambda, how does that work? There’s a lot of security around it. First of all, your Lambda function must have an IAM role attached to it. So each Lambda function will have its own IAM role, which will define the permissions of what it’s able to do.
Why do you need permissions? Well, because Lambda will pull data from sources and will send data to targets. So basically, the IAM role attached to your Lambda function will define what your Lambda function can do in terms of sources and targets. Lambda has integration with KMS for encrypting secrets. So for example, if you want to pass secrets to your Lambda function, you could use KMS-encrypted environment variables and run the decrypt function within your Lambda function, provided it has the right IAM role. You could also use the SSM Parameter Store to store configurations for your Lambda function, and even encrypt the secrets within SSM with KMS. Lambda has integration with CloudWatch Logs; again, make sure that the IAM role is correct. And you can deploy your Lambda functions within your VPC in case you need to access resources from within your VPC. So we saw before that RDS is deployed within your VPC; if you need a Lambda function to interact with your RDS database, you would deploy that Lambda function directly into your VPC, privately.
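Putting those Lambda points together, here is a minimal sketch of the permissions part of an execution role covering CloudWatch Logs, KMS decryption of environment variables, and reading an encrypted SSM parameter (key ID and parameter path are placeholders):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "WriteLogs",
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:*:*:*"
    },
    {
      "Sid": "DecryptEnvVars",
      "Effect": "Allow",
      "Action": "kms:Decrypt",
      "Resource": "arn:aws:kms:us-east-1:123456789012:key/11111111-2222-3333-4444-555555555555"
    },
    {
      "Sid": "ReadConfig",
      "Effect": "Allow",
      "Action": "ssm:GetParameter",
      "Resource": "arn:aws:ssm:us-east-1:123456789012:parameter/my-app/*"
    }
  ]
}
```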
All right, now we have Glue. To control access to the Glue service, you can use IAM policies, just like with any other AWS service. And you can also configure Glue to access your databases with JDBC only by using SSL. That means the connection from Glue to your database is encrypted while it is being made, so your data in flight is encrypted. The Data Catalog is very important to understand and remember from a security standpoint. First of all, you can encrypt the Data Catalog with KMS, which makes sure your data is encrypted at rest. But you can also create resource policies that protect the Data Catalog resources. This is very similar to an S3 bucket policy: when we have an S3 bucket policy, our S3 bucket is protected against unwanted access, and we can define that in the bucket policy.
We have the same concept with the Data Catalog resources: we can create resource policies to protect them, and that’s something to remember going into the exam. The connection passwords can be encrypted with KMS for enhanced security. And finally, any data written by Glue can also be encrypted. We have a security configuration for the S3 encryption mode, so we can have server-side encryption S3 (SSE-S3) or server-side encryption KMS (SSE-KMS). We can enable CloudWatch encryption for the logs. And finally, the job bookmarks can be encrypted as well. So remember, going into Glue, a lot of things can be encrypted, and the Data Catalog can be protected using a resource policy. And finally, Glue can be configured to access your databases only through SSL to maintain encryption in flight. So that’s it for this part two of the AWS security review. I will see you in the next lecture for part three.
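As a closing illustration, the Glue security configuration described above can be sketched as the request body for the CreateSecurityConfiguration API, covering S3 output, CloudWatch logs and job bookmarks (the configuration name and KMS key ARN are placeholders):

```json
{
  "Name": "my-glue-security-config",
  "EncryptionConfiguration": {
    "S3Encryption": [
      {
        "S3EncryptionMode": "SSE-KMS",
        "KmsKeyArn": "arn:aws:kms:us-east-1:123456789012:key/11111111-2222-3333-4444-555555555555"
      }
    ],
    "CloudWatchEncryption": {
      "CloudWatchEncryptionMode": "SSE-KMS",
      "KmsKeyArn": "arn:aws:kms:us-east-1:123456789012:key/11111111-2222-3333-4444-555555555555"
    },
    "JobBookmarksEncryption": {
      "JobBookmarksEncryptionMode": "CSE-KMS",
      "KmsKeyArn": "arn:aws:kms:us-east-1:123456789012:key/11111111-2222-3333-4444-555555555555"
    }
  }
}
```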