Google Associate Cloud Engineer – Object Storage in Google Cloud Platform – Cloud Storage part 2
- Step 06 – Understanding Cloud Storage – Versioning
Welcome back. In this step, let's look at object versioning. What is object versioning, and why do we need it? Object versioning prevents accidental deletion and also gives you a history of what happened with a specific object. If you have an object and somebody deletes it by mistake, you lose it. However, if you have versioning turned on for the bucket, even if the object is accidentally deleted, you can restore it from a previous version. Object versioning is enabled at the bucket level, and you can turn it on and off at any point in time. The latest version, the one we are actively making use of, is called the live version.
If you delete a live object, it becomes a noncurrent object version. So if I have a live object, the current active version, and I go and delete it, it becomes a noncurrent object version. If you then delete the noncurrent object version, it is completely deleted from Cloud Storage. Each new version we create is identified by the key of the object plus a generation number. The generation number is like a version number: object key plus generation 1, object key plus generation 2, object key plus generation 3, and so on. You can use this identifier to perform operations on the older versions of an object.
If you access an object using just the object key, you are always working with the live version. However, if you want to access an older version, you can use the object key plus the generation number. One important thing: with versioning you are storing multiple versions of the same object, so your storage needs increase and your costs go up as well. To keep costs low, you can delete the older versions: anything which is noncurrent and, let's say, more than 15 days old can be deleted to reduce your costs. In this quick step we looked at how you can use versioning for your objects.
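As a quick reference, here is a hedged sketch of working with versioning from gsutil; the bucket name, object name, and generation number are illustrative.

```sh
# Turn versioning on (or off) for a bucket
gsutil versioning set on gs://my-example-bucket

# List live and noncurrent versions; each line shows object-key#generation
gsutil ls -a gs://my-example-bucket

# Restore an accidentally deleted object by copying a noncurrent
# generation back to the live object key
gsutil cp gs://my-example-bucket/index.html#1700000000000000 \
          gs://my-example-bucket/index.html
```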
- Step 07 – Understanding Cloud Storage – Lifecycle Management
Welcome back. In this step, let's look at object lifecycle management. Files are frequently accessed when they are created, and usage generally reduces with time. How do you save costs by moving files automatically between storage classes? How do you automatically delete files which you don't need anymore? That's where object lifecycle management comes in. You identify objects using conditions. These conditions can be based on age, creation time, whether the object is live, whether it is of a specific storage class, or how many newer versions are present. So you can identify objects based on a variety of conditions, and you can set multiple conditions; the action happens only if all of them are satisfied. What are the actions that can happen? There are two kinds of actions.
Set storage class changes the storage class. For example, I might want to move an object from Standard to Archive if it is more than one year old, or more than 20 or 40 days old. You can do that with a set storage class action: if the age of an object is greater than 30 days, set the storage class to Archive. The other kind is a deletion action. If I know that a specific object is not needed after a year of uploading it, I can say: if the age is greater than one year, delete the object. The great thing about object lifecycle management is that everything is automated for you. You don't need to do anything manually. You create a bucket and set the lifecycle management rules for that specific bucket. That's it. As you keep uploading objects, whenever an object meets the conditions, the specific action happens.
Let's say I upload an object today and the bucket has a lifecycle rule to delete objects after, let's say, two months. As soon as two months pass, the object is automatically deleted; I don't need to do anything about it. One important thing to remember is that there are restrictions on the transitions. From Standard (or the older Multi-Regional and Regional classes) you can move to Nearline, Coldline, or Archive. From Nearline you can move only to Coldline or Archive. From Coldline you can move only to Archive. These are the only allowed transitions. Let's look at a quick example of a rule. This is a lifecycle rule which we are configuring here. As part of the rule we have an array, and we can have a number of action and condition pairs.
If you look at this specific one, the condition is age: 30 and isLive: true. If the age is more than 30 (the unit is days, so this is 30 days) and isLive is true, then delete the object. Similar to that, we have another condition defined here: anything older than 365 days which matches the Standard storage class is moved to Nearline. You can see that the action type in the first rule is Delete, while the action type in the second is SetStorageClass. This is the bucket which we are playing with, my first bucket in 28 minutes, and you can go over to Lifecycle, which is where you can configure the lifecycle rules. You can see here that lifecycle rules let you apply actions to a bucket's objects when certain conditions are met, for example switching objects to colder storage classes when they reach or pass a certain age. Now, over here you can add the rules.
So I can say Add a rule and define it: do I want to delete an object, or do I want to move it to one of the other storage classes? Let's say I want to set the storage class to Nearline, and I click Continue. Over here you can set a number of conditions: based on age, created before a specific date, belonging to a specific storage class, the number of newer versions, the number of days since it became noncurrent, and a lot of other conditions. Let's just say age is 30 days, and then I click Continue. I'm not actually going to create the rule, but this is how you would create an object lifecycle rule, and you can see that it's very, very simple. In this step we looked at how object lifecycle management allows you to automate storage class transitions and deletions and keep your costs low in Cloud Storage. I'll see you soon.
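The same rules can also be configured from the command line. Here is a hedged sketch, assuming a bucket named gs://my-example-bucket; the file name lifecycle.json is just illustrative.

```sh
# lifecycle.json: delete live objects after 30 days, and move Standard
# objects older than 365 days to Nearline
cat > lifecycle.json <<'EOF'
{
  "lifecycle": {
    "rule": [
      { "action": { "type": "Delete" },
        "condition": { "age": 30, "isLive": true } },
      { "action": { "type": "SetStorageClass", "storageClass": "NEARLINE" },
        "condition": { "age": 365, "matchesStorageClass": ["STANDARD"] } }
    ]
  }
}
EOF

# Apply the rules to the bucket and read them back
gsutil lifecycle set lifecycle.json gs://my-example-bucket
gsutil lifecycle get gs://my-example-bucket
```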
- Step 08 – Understanding Cloud Storage – Encryption with KMS
Welcome back. Whenever you want to store data, you need to think about encryption. When we store files in Cloud Storage, what are the different encryption options we have? That's what we will be looking at in this step. Cloud Storage always encrypts data on the server side. Whenever we create a Cloud Storage bucket, it is by default assigned a Google-managed key, which is used to encrypt the data. However, you can configure the server-side encryption. The encryption performed by Cloud Storage can use either the Google-managed encryption key, which is the default and does not need any configuration, or a customer-managed encryption key.
You can create a key in Cloud KMS and use that key to encrypt the data in Cloud Storage. One important thing to remember is that whenever you are using customer-managed encryption keys, your Cloud Storage service account should have access to the keys in KMS. Cloud Storage has to do automatic encryption and decryption of your objects using the keys, and to be able to do that, it needs permissions on the keys. How does Cloud Storage get permission to access the keys? You grant permissions to the Cloud Storage service account, so that the Cloud Storage service account has access to the encryption keys.
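A hedged sketch of doing this from gsutil; the project ID, key ring, and key name are illustrative.

```sh
# Find the Cloud Storage service account for the project
gsutil kms serviceaccount -p my-example-project

# Authorize that service account to encrypt/decrypt with a KMS key
gsutil kms authorize -p my-example-project \
  -k projects/my-example-project/locations/us/keyRings/my-ring/cryptoKeys/my-key

# Set the key as the default customer-managed key for a bucket
gsutil kms encryption \
  -k projects/my-example-project/locations/us/keyRings/my-ring/cryptoKeys/my-key \
  gs://my-example-bucket
```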
In addition to server-side encryption, you can also perform client-side encryption. When you upload an object to Cloud Storage, Cloud Storage itself performs server-side encryption. However, before sending the object, you can encrypt it yourself and then send it to Cloud Storage. This is additional client-side encryption on top of whatever is performed by Cloud Storage. Google Cloud Platform, or the Cloud Storage service, does not know anything about how you did that encryption. The advantage of client-side encryption is that the data is encrypted even while it is being transmitted to GCP.
Now, how can you configure server-side encryption? It is configured in the bucket's configuration. If you open up the bucket and go to Configuration, you can see the encryption type. The default is Google-managed key; you can also choose Customer-managed encryption key to use a key from KMS. You can also configure encryption at the time of creation of the bucket. When you are creating a bucket, go into the advanced settings, and that is where you can configure the key to use: either a customer-managed encryption key or the Google-managed encryption key, which is the default. In this step, we talked about encrypting the data which is uploaded to Cloud Storage. I'll see you in the next step.
- Step 09 – Scenarios – Cloud Storage
Welcome back. Next up, let's look at various scenarios related to Cloud Storage. How do you speed up large uploads? Let's say I'm uploading a 100 GB archive file to Cloud Storage; what is the recommended approach? The recommended approach is to use parallel composite uploads: the file is broken into small chunks, and all of them are uploaded in parallel. Next scenario: you want to permanently store application logs for regulatory reasons, and you don't expect to access them at all. How do you reduce the costs? You can use the Archive storage class; anything you don't expect to access more than once a year can be moved to Archive. Next: log files are stored in Cloud Storage and you expect to access them once a quarter, that is, once every three months. What is the recommended storage class? The Coldline storage class.
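For the first scenario, a hedged sketch of forcing parallel composite uploads from gsutil; the threshold value and the file and bucket names are illustrative.

```sh
# Upload files larger than 150 MB as parallel composite uploads
gsutil -o "GSUtil:parallel_composite_upload_threshold=150M" \
  cp large-archive.tar.gz gs://my-example-bucket/
```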
How do you change the storage class of an existing bucket? Let's say I have a bucket in Cloud Storage containing thousands of objects, and I want to change the storage class of the bucket itself, so that any new object uploaded to the bucket uses the new storage class, and I also want to change the storage class of the existing objects. How can I do that? I can do it in two steps. Step one is to change the default storage class of the bucket. How? If I go over to the bucket, let's pick up my first bucket in 28 minutes again, I can go into the configuration and change the default storage class of the bucket.
The default storage class applies only to newly uploaded objects; the existing objects are not changed. How can you change the existing objects? You need to update them individually: go in and update the storage class of all the objects in the bucket. You can also do this from the command line. So the two steps are: change the default storage class of the bucket, and then change the storage class of all the objects in the bucket. One more interesting thing you can look at is editing the metadata of an object. I'm inside the bucket, I've navigated to index.html, and over here I'm choosing Edit metadata, where you can see that I can edit the object's metadata.
You can specify the content type; if it's a PDF file, you would set the content type accordingly. You can also specify the content encoding, disposition, and cache control, as well as the content language. This metadata adds information on top of the existing object: the object itself is a set of bytes, and you can attach metadata like Content-Type, Content-Encoding, and Cache-Control here. You can add your own custom metadata here as well. I'm sure you're having a wonderful time and I'll see you in the next step.
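As a CLI recap of this step, a hedged sketch; the storage class, bucket, object names, and header values are illustrative.

```sh
# Step 1: change the default storage class of the bucket (new uploads only)
gsutil defstorageclass set NEARLINE gs://my-example-bucket

# Step 2: rewrite existing objects so they also use the new storage class
gsutil -m rewrite -s NEARLINE gs://my-example-bucket/**

# Editing object metadata from the command line
gsutil setmeta -h "Content-Type:application/pdf" \
               -h "Cache-Control:public, max-age=3600" \
  gs://my-example-bucket/report.pdf
```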
- Step 10 – Playing with gsutil – Cloud Storage from Command Line
Welcome back. In this step, let's play with Cloud Storage from the command line. An important thing to remember: Cloud Storage does not use gcloud; it makes use of gsutil. If you type gcloud --version, you would see that one of the components already installed is gsutil. In Cloud Shell, gsutil is already installed as a gcloud component. If you are using this on your local machine, you need to first install the gsutil component for gcloud. Let's look at some of the important commands, starting with gsutil mb, make bucket. It's very easy: gsutil mb followed by the name of the bucket. The important thing to remember is how the bucket is named: gs:// followed by the name of the bucket.
So let's say my_bucket_in28minutes_shell. This would create our bucket. Oops, we have not configured a project yet, so let's first set a project: gcloud config set project. That sets the project; let's see if I'm able to execute the command now. Okay, you can see that the bucket is created. How can I list everything inside a bucket? It's gsutil ls. So instead of mb, let's say ls. ls lists only the live objects in a specific bucket. We don't have anything in this bucket yet, so it's not listing anything. However, if you have versioning turned on on a bucket and you want to list the noncurrent object versions as well, you can say ls -a.
This displays the current and the noncurrent object versions. If you want to copy objects from one bucket to another, you can use cp: you give the path to the source object and the path to the destination. If you want encryption while copying, you can specify the encryption key to use. You can also move objects: you might want to rename an object within the same bucket, or move an object from one bucket to another. In those situations you use mv: gsutil mv with the old object name and the new object name, or with the old bucket and object name and the new bucket and object name.
You might also want to change the storage class of a specific object. In those situations, you can use gsutil rewrite: gsutil rewrite -s with the specific storage class, for example Nearline or Coldline, followed by the bucket and object. gsutil cp, which we used to copy an object from one bucket to another, can also be used to upload and download objects: gsutil cp with a local location and the bucket path you want to upload to, or gsutil cp with a bucket object path and the local location you want to download to. Let's look at a few more Cloud Storage commands. You can turn versioning on and off on a bucket with gsutil versioning set on or off. You can set uniform bucket-level access on a bucket with gsutil uniformbucketlevelaccess set on or off.
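A hedged recap of the commands discussed so far; all bucket, object, and file names are illustrative.

```sh
gsutil mb gs://my-example-bucket                     # make a bucket
gsutil ls gs://my-example-bucket                     # list live objects
gsutil ls -a gs://my-example-bucket                  # include noncurrent versions
gsutil cp local-file.txt gs://my-example-bucket/     # upload
gsutil cp gs://my-example-bucket/file.txt .          # download
gsutil cp gs://bucket-1/file.txt gs://bucket-2/      # copy between buckets
gsutil mv gs://my-example-bucket/old.txt gs://my-example-bucket/new.txt  # rename/move
gsutil rewrite -s NEARLINE gs://my-example-bucket/file.txt               # change storage class
gsutil versioning set on gs://my-example-bucket                          # enable versioning
gsutil uniformbucketlevelaccess set on gs://my-example-bucket            # enable uniform access
```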
If uniform bucket-level access is enabled, there is uniform access for all objects in the bucket. If it is off, you can configure permissions at the level of individual objects. If you want to set permissions for specific objects, you can use gsutil acl ch, that is, access control list change. For example, I specify allUsers as the user and give read permission, which is why I specify R here, along with the path of the object which I want to make public. This would make that specific object public.
If you want to give write access to a specific user, you can say: for this user, john.doe@example.com, I want to give write permission, followed by the path to the object in the bucket. As you can see, there are multiple ways in which you can specify permissions. R here stands for read; for write we wrote out the complete word. So you can use READ or R, WRITE or W, OWNER or O. The other thing you can set is the scope. Here we are saying allUsers, and there a specific user. You can also say allAuthenticatedUsers, in which case only users who are authenticated with their Google accounts will be able to access the objects.
The other scope options are a group or a project: you can say -g and specify a group, or -p and specify a project. You can also use JSON files to set ACLs on the bucket. The other way to manage permissions is to set IAM access, and the command for that is gsutil iam ch. So if you are working with IAM it's iam, and if you are working with ACLs it's acl. However, the structure of the command is a little different: it's member type, colon, member name, colon, IAM role, applied on the bucket, so we are assigning access to a specific member on the bucket. For example, gsutil iam ch where the member type is user, followed by the member name or email, and the role I want to give, say the object creator role.
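A hedged sketch of the permission commands just described; the bucket, object, user, and group names are illustrative.

```sh
# ACLs (apply only when uniform bucket-level access is off)
gsutil acl ch -u AllUsers:R gs://my-example-bucket/public-file.png      # make one object public
gsutil acl ch -u john.doe@example.com:W gs://my-example-bucket/doc.txt  # write access for a user
gsutil acl ch -g admins@example.com:O gs://my-example-bucket/doc.txt    # owner access for a group

# IAM (member-type:member-name:role, applied at the bucket level)
gsutil iam ch user:john.doe@example.com:objectCreator gs://my-example-bucket
```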
Or, if you want to make the entire bucket public, you can say gsutil iam ch allUsers:objectViewer on the bucket, so that all users have object viewer permissions on that bucket. So we talked about how you can give permissions on an entire bucket and on a specific set of objects. These are all permanent permissions. Earlier we talked about the fact that creating signed URLs is a temporary way to give somebody access. If I want to give somebody access for ten minutes, I can specify ten minutes, and I can use a service account key.
For each service account, you can create a key. I'll use a service account which has access to perform the specific operation I want to allow, create a key for it, and make use of that key here. Then I can execute the gsutil signurl command to create a signed URL for the specific object. The signed URL gives the user access to that object for ten minutes. In this step, we looked at how you can play with Cloud Storage from the command line using the gsutil utility, the gsutil component of gcloud. I'm sure you're having a wonderful time and I'll see you in the next step.
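For reference, a hedged sketch of the signurl command described above; the key file name, duration, and object path are illustrative.

```sh
# Create a signed URL valid for 10 minutes, using a service account key file
gsutil signurl -d 10m service-account-key.json gs://my-example-bucket/private-report.pdf
```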