Python Institute PCAP – File IO and Exception Handling in Python Part 3
- Traversing Directories Using the OS Module
Now the next thing I want to look at is traversing directories. Before we traverse the directories, I want to create a specific directory structure for this example. I’m going to open up my Finder window and go to the desktop. I have the My folder folder here, and I think in the previous lesson we removed some of the files and folders that were inside of this folder, right? So what I’m going to do is create another folder in here, in My folder. We’ll call it Stuff. And inside of Stuff, we can have another folder called Data. And in My folder, I’m just going to move one of these files in there. Let’s move sample text two into the My folder folder. Okay, so now sample text two is in My folder, and I’m going to move in another file. For example, this renamed text. I’m going to move that into the subfolder.
So let’s do that. Let me first move it into My folder, and then this renamed file is going to go into this Stuff folder right there. Okay, in the Stuff folder we have the Data folder that I created and then we have this other file. It doesn’t matter, these files could be anything. I just want to show you how you can walk these directories, figure out what a directory contains, and then get to each of the files. I have this image file right here, which I’m going to move into My folder as well. And we’re going to move this further in, into the subdirectory called Stuff. And then I’m going to move this image file into the Data folder. Okay, so as you can see the structure, we have My folder, we’ve got the subfolder called Stuff, and then we have a Data folder that is inside of Stuff. And each of these folders has another folder as well as some file or some data, except for the Data folder, which just has a file. So now that we have this directory structure set up, I’m going to show you how you can walk this directory and get information from it.
So there’s a handy function called walk in the os module. So we can do os.walk. And here we specify the location of the particular folder that we want to walk. So let’s give it that directory location: that’s /Users, then my username (that’s my home directory), then Desktop, then My folder. And we have to put a slash at the beginning as well so that we can properly get to that directory. And this will allow us to walk this directory structure. But the way we do it is using a for loop. So we can do for a, b, c in os.walk. And if you recall, this is known as unpacking. On each iteration, walk gives us three pieces of data: the directory path, the directory names, and the file names. So a more appropriate name for each of these variables would be... well, I’ll let you figure it out. Let’s print each of these. I’m going to print a, I’m going to print b, and c as well. And then you’ll be able to see what the data is at each of these levels.
And I’m also going to put a separator line here so that we can see where each iteration ends. So let’s run it so that you can see what is printed. I’m just going to right click and run it, and let’s explore what’s going on here. This first line, of course, is what gets printed every time you run a program in PyCharm. It just tells you which file is being run, and this is the application.py file. So you can ignore that. But the first thing that gets printed here is this. This is a, this is b, and this is c. Okay, so what are these things?
Well, the first thing right here, this is the directory path. Okay? And what the output tells us is that inside of this directory path we have Stuff and we have this file. So we’ve got a folder called Stuff inside of the My folder path. And we’ve got a file called sample two dot text inside of the same folder, My folder. Okay? And then this extra file here, .DS_Store, is actually a hidden file in the Mac operating system that you might see.
On Windows you will not see it. As you can see, it starts with a dot, and on Linux and Mac that is used to denote a hidden file. So you can ignore that. But notice sample two dot text as well as the Stuff directory are both residing in this particular path. Okay, so let’s go to the next iteration. Notice now we are one level deep. Inside of My folder we’ve got the folder called Stuff. So since we’ve explored everything that’s in My folder, we saw that it has a Stuff directory and has this file. Now we’re going one level deeper, which is into the Stuff folder, and seeing what is in there. And in the Stuff folder we’ve got this file called renamed text and we’ve got the Data folder that’s inside of Stuff. Now again, this .DS_Store is just a default file in the Mac operating system. It’s a hidden file. You can ignore that. Now scrolling down further.
Now we go into the Data directory. All right, and again, the first thing here, a, is the path. b here is the list of directory names, if there are any directories. And then c here is the list of file names, if there are any files. In this case we have two files in My folder. In the Stuff folder we’ve also got two files and we’ve got one directory. And then scrolling further down, the Data directory is the third iteration and this is the final one. We don’t have anything further going into Data except for just this image file, MTL.png. Now notice b for this third iteration, for the third folder, Data: b is empty. We don’t have any directories inside of the Data folder, we just have a file.
So that’s how this is broken down. So a more appropriate name for a, b, c would be: a is going to be the path, or let’s call it dirpath. And then b tells us whether there are any folders in here, so we can say dirnames. If there are any directory names, they would be in this list, where we have Data and Stuff. And c is actually the file names, if we have any files, so we’ll call it filenames. And as you can see, each of these two is a list: the directory names, if there are any directories in the given path, and the file names, if there are any files in the given directory path. And so let’s rename these variables up here as well: dirpath, dirnames, and filenames. Okay, now dirpath is not plural. As we explore, we go deeper into the given path, but on each iteration there’s only one path, right? And inside of that path is where we see whether it has any folders or files.
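The loop described above can be sketched as follows. This is a minimal illustration, not the exact code from the video: instead of the desktop folder, it builds a throwaway copy of a similar layout in a temporary directory, and the file names (sample.txt, renamed.txt, image.png) are stand-ins chosen for the example.

```python
import os
import tempfile

visited = []

with tempfile.TemporaryDirectory() as root:
    # Recreate a layout like the lecture's: My folder / Stuff / Data
    top = os.path.join(root, "My folder")
    os.makedirs(os.path.join(top, "Stuff", "Data"))
    for parts in (("sample.txt",),
                  ("Stuff", "renamed.txt"),
                  ("Stuff", "Data", "image.png")):
        open(os.path.join(top, *parts), "w").close()  # create empty files

    for dirpath, dirnames, filenames in os.walk(top):
        print(dirpath)    # the one directory being visited (a string)
        print(dirnames)   # list of folders directly inside dirpath
        print(filenames)  # list of files directly inside dirpath
        print("-" * 20)   # separator between iterations
        visited.append(os.path.relpath(dirpath, top))

print(visited)
```

Each pass through the loop handles one directory, which is why dirpath is singular while dirnames and filenames are lists.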
Now just to solidify this example, what I’m going to do is create some more folders in these subfolders. So let’s go back to the desktop, let’s go into My folder. Now inside of Stuff, I’m going to create another folder and we’ll call this Other Stuff. All right, so now inside of the Stuff folder we’ve got Data as well as Other Stuff. So let’s see how that looks when we use this os.walk functionality. I’m just going to run this again. And now let’s explore this further.
Notice we’re in My folder. Now we go into Stuff, and notice that inside of Stuff we’ve got a list here with two folders, not just one. We have the Data folder from before and we have this new folder called Other Stuff. And again, notice it’s in a list. Now os.walk is of course going to take each of these folders and go deeper into them one at a time. So the next thing it does is go into the Other Stuff folder, which is this path right here. And since Other Stuff doesn’t really have anything in it, the dirnames variable is empty here, and so is filenames. There are no files either. Other Stuff is just an empty folder. And then it moves on to the Data folder, which doesn’t have any further subfolders, but it does have a file. So hopefully you understand how this functionality works.
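To see the empty-folder case in isolation, here is a small sketch under the same assumptions as before (a temporary directory standing in for the desktop, and image.png as a placeholder file name). It records what os.walk reports for each folder so the empty lists for Other Stuff are easy to spot.

```python
import os
import tempfile

summary = {}

with tempfile.TemporaryDirectory() as root:
    top = os.path.join(root, "My folder")
    os.makedirs(os.path.join(top, "Stuff", "Data"))
    os.makedirs(os.path.join(top, "Stuff", "Other Stuff"))  # left empty
    open(os.path.join(top, "Stuff", "Data", "image.png"), "w").close()

    for dirpath, dirnames, filenames in os.walk(top):
        folder = os.path.basename(dirpath)
        # Sort copies so the summary does not depend on listing order
        summary[folder] = (sorted(dirnames), sorted(filenames))

for folder, (dirs, files) in summary.items():
    print(folder, dirs, files)
```

The order in which sibling folders such as Data and Other Stuff are visited follows the operating system's listing order, so it is best not to rely on it being alphabetical.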
- OS Module Continued
I’m going to comment out this code because I don’t want that printed to the screen as I go over other examples. And since this is a loop, it needs to have something in its body, so we just put pass in there so that nothing really happens. We can just move on to bigger and better code down here. Okay? So the next thing I want to talk about is getting an environment variable, any environment variable on a machine, whether you’re using Windows or Mac or Linux. The way to do that is you can use os.environ, okay? And then you use the get method on it. And in here is where we specify the environment variable that you want.
So we have an environment variable here on the Mac known as HOME. And so I can get the value of that environment variable. Let’s just print the result of running this, all right? It’s not going to print anything else because we commented that out. So we should just expect to see this printed. So let’s right click and run this, and notice that the environment variable HOME has the value /Users/ followed by my username. This is my home directory. Now, sometimes people need to get the values of environment variables and append to them, concatenating files and folders onto certain directory locations. So, for example, if I was to add a file called myfile.txt or something, I’d have to be conscious of the fact that there’s no slash at the end of that value. And we don’t know what variables contain, right? So one way of doing it is first checking the value of the variable and then saying, okay, we need to have this slash in here, and adding it. Let’s print the value of doing it like that, okay? And so let’s run this and see what happens.
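As a rough sketch of the manual approach just described: the helper name append_manually is made up for this example, and it assumes a forward slash as the separator, which holds on Mac and Linux but not on Windows.

```python
import os

# HOME is usually set on macOS and Linux; get() returns None when it is not
home = os.environ.get("HOME")
print(home)


def append_manually(directory, filename):
    """Concatenate by hand, adding a slash only if one is missing."""
    if not directory.endswith("/"):
        directory = directory + "/"
    return directory + filename


if home is not None:
    print(append_manually(home, "myfile.txt"))
```

Having to check for the trailing slash every time is exactly the bookkeeping that makes this approach tedious.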
Notice now it returns our HOME variable, which is here. And then this is the contents of whatever I wrote here: the forward slash and then the file name. But this can get hairy, right? This is annoying to look at. What if you have many files that you want to add? For each of them you’d have to add a slash, and you won’t know whether the slash is already part of the environment variable or not. Is there a better way to manage this? And the fact is, yes, absolutely. It’s very straightforward, actually. It’s a very handy tool. The way to do that is you can use the join function. So let me get rid of this print here, and let’s also get rid of this for a second. I can use os.path.join. And in here is where I pass in the things that I want to join. So I’ve got the environment variable. This may or may not have the slash at the end, okay? But we don’t care about that, because os.path.join is smart enough to know whether adding something to this value requires a slash or not. So I put a comma here, and the second argument that I add is myfile.txt, which obviously needs to be in quotes. So let me add that here. This os.path.join function is going to join this content together in a more graceful way, without us having to worry about whether we need to add a slash or anything else, right? So now let’s print this and you’ll see. Let’s run it.
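The same idea with os.path.join might look like this. The fallback value "/home/example" is a hypothetical placeholder used only when HOME is not set, and /tmp is just an example directory.

```python
import os

# The fallback path is a made-up placeholder for machines without HOME
home = os.environ.get("HOME", "/home/example")
joined = os.path.join(home, "myfile.txt")
print(joined)

# join adds a separator only when the first part does not already end in one
without_slash = os.path.join("/tmp", "myfile.txt")
with_slash = os.path.join("/tmp/", "myfile.txt")
print(without_slash)
print(with_slash)
```

On Mac and Linux both of the last two lines print /tmp/myfile.txt; on Windows the inserted separator would be a backslash.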
Notice it’s smart enough to add that slash in for us, even though we didn’t specifically concatenate that slash. That’s what you would use os.path.join for. Now there are some other handy functions that come as part of os.path. So let’s try those out. Let’s say you want to get a file name at a given directory location, just the file name. You don’t want the directory. The way to do that is you could do os.path.basename, and then whatever you pass in here, let’s say it’s /bin/tools. It could be any directory location.
But then we want myfile.txt at the end. The base name stands for this particular file. Okay? And what this is going to do is return just this file, not the entire directory location, which is great. So let’s print the result of invoking that. Let’s run it. And there we go. We get myfile.txt. So this is used to get the base name, and that’s the file at the directory location given. Okay? So just this file name, that’s what basename stands for. Now similarly, there’s dirname. Let’s say you’re given this string and you only want to extract the directory. You don’t want the file. For that, you can use os.path.dirname, okay? And we pass in the exact same argument. And this will return just the path, not the file name. So let’s print the value of this and let’s run it.
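A short sketch of both calls on the path used in the lecture. Neither function touches the file system, so the path does not need to exist.

```python
import os

path = "/bin/tools/myfile.txt"

base = os.path.basename(path)      # everything after the last slash
directory = os.path.dirname(path)  # everything before the last slash

print(base)       # myfile.txt
print(directory)  # /bin/tools
```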
And as you can see here in the second invocation, we don’t see myfile.txt, we just see the directory name. Okay, so this is going to get the directory name only, not the file. And this directory could be really long. It doesn’t even have to exist on the machine. I could say hello, blah, blah, blah. It could be a very long directory location. It doesn’t matter. If we run it, you’ll see that it only gives us the directory location when we use dirname. Now let’s say you wanted both, split up. You wanted the entire directory path and you wanted the file. The way we could do that is we could do os.path.split and pass in the exact same thing. So let’s just take this entire thing and paste it here. Make sure you have the quotes, since it’s a string.
And this is going to return a tuple containing two things. One of them is going to be the directory location and the other is going to be the entire file name. So let’s print this and let’s run it. And here it is. We get the entire directory location and then we get the entire file name. Pretty cool. So let me add a comment here: will give directory name and base name in a tuple. Okay, let’s keep going. There are a couple more things. If you want to check whether a path exists or not, that’s a Boolean condition. So we can use the os.path.exists function, okay? And we could just take this entire thing, wrap it in quotes, and put it right there. Now we know this blah, blah, blah, this gibberish directory location, doesn’t exist on our machine. So this is actually going to return False. So let’s print what this returns and you’ll see that it prints False. So let’s run it and there we go.
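The two calls above can be sketched like this. The gibberish path is an assumption modeled on the one described in the video; the exact string used there is not shown in the transcript.

```python
import os

# split returns (directory name, base name) as a tuple
parts = os.path.split("/bin/tools/myfile.txt")
print(parts)  # ('/bin/tools', 'myfile.txt')

# exists actually checks the file system, unlike split, basename, and dirname
gibberish = "/hello/blah/blah/blah/myfile.txt"  # assumed not to exist
found = os.path.exists(gibberish)
print(found)  # False, unless this path happens to exist on your machine

if not found:
    print("No such path:", gibberish)
```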
It prints False. So this is used to check whether the path exists on the computer on which this program is running. If this particular path exists, then this is going to return True. Now we know that this path doesn’t exist, so it’s going to return False. So this can of course be used in an if statement to check. There are similar functions. One is os.path.isdir, and this is used to check whether the given path is an existing directory. Another one is isfile, and this is used to check whether the given path is an existing file. So each of these functions, split, dirname, basename, exists, isfile, and isdir, accepts a path. The path could point to a file or to a directory. Basically the entire path of any file. So for isfile, let me just put a comment here: used to check if file exists in the specified path, okay? And that would of course get the entire thing right here.
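Here is a minimal sketch of isdir and isfile next to exists. It creates a real folder and file in a temporary directory so that the checks have something to find; the names myfile.txt and missing.txt are placeholders.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    file_path = os.path.join(folder, "myfile.txt")
    open(file_path, "w").close()  # create an empty file
    missing = os.path.join(folder, "missing.txt")

    checks = {
        "folder isdir": os.path.isdir(folder),
        "folder isfile": os.path.isfile(folder),
        "file isfile": os.path.isfile(file_path),
        "file isdir": os.path.isdir(file_path),
        "missing exists": os.path.exists(missing),
    }

for label, result in checks.items():
    print(label, result)
```

Both functions return False for a path that does not exist, so they combine an existence check with a type check.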
And a similar thing will happen for isdir. Let me add that as well. I’m going to provide this file as part of this lecture because I feel like these are good notes. And this is just stuff that you’d have to kind of memorize. Nothing complicated here, just a bunch of functions that you’re going to be using when you need them. So instead of isfile, the second one is isdir. And this of course is used to check if a directory exists at the specified path. Okay? isfile, isdir. Hopefully you get the point. Now one more thing that I want to introduce to you, and that’s splitext. So os.path.splitext. And this is used to get the file with its path as well as the file extension in a tuple. So let’s just paste this here and let’s run this. I’m going to print out the contents of this and you’ll see what happens. So let’s run it. And here it is, the last part here.
Notice it gives the entire directory location, the entire path, as well as the file name, but excludes the file extension. The file extension is .txt, and that’s put in a separate slot in the tuple. So splitext is used to split the entire file name with its directory from the file extension. All right, so there we go. Let me just add a comment here towards the end to make sure we get this: get file with path and file extension in a tuple, okay? That’s what this splitext function does. Now, we didn’t print the results of isfile and isdir because we know it’s going to be False in both of these situations, but hopefully you get the point. These are Boolean functions: isfile and isdir, as well as exists. The rest of the functions that we went over return some kind of data based on the path we passed in.
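A brief sketch of splitext on the lecture's example path. The second call is an extra illustration: a name that starts with a dot, like the .DS_Store file seen earlier, is treated as having no extension.

```python
import os

root, ext = os.path.splitext("/bin/tools/myfile.txt")
print(root)  # /bin/tools/myfile
print(ext)   # .txt

# A leading dot marks a hidden file, not an extension
hidden = os.path.splitext(".DS_Store")
print(hidden)  # ('.DS_Store', '')
```

Note that the extension keeps its leading dot, so joining the two pieces back together with plain concatenation gives the original path.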