DA-100 Microsoft Power BI – Part 2 Get and Transform Data: Level 1 – Home Part 1
- Welcome to Part 2: Get and Transform Data
Hello and welcome to part two of this course. And in this part we’ll be looking at the Getandtransform section of Power Bi, which uses the M language. So you can see that most of what we’ll be doing is in sections one to three. So we’ll be getting data from different data sources. So really from now, all we’ve done so far is used Excel, and we’ll be continuing to use a lot of Excel simply because I know that’s what most of you have got on your machines. We’ll be looking towards the end of this part about other types, like, for instance, SQL Server and SSAs, but we’ll be largely sticking to data sources that are known. You’ve got such as text documents, for instance.
We’ll be looking at profiling the data, so identifying data anomalies and so forth. But the majority of the time I think we’ll be looking at cleaning, transforming and loading the data, which is part three of the Power Bi requirements. So we’ll be identifying keys, we’ll be evaluating column data types. We’ll be combining queries, we’ll be using the M chord in the advanced editor. We’ll be resolving inconsistencies. We’ll be doing lots of transformations to the table structures. And then we’ve also got one additional item which is applying AI insights.
So I hope you’re going to have a very insightful time with all of this so you can see what you can do before you get to the visualization stage. So we’ll be looking at a big menu at the top of the screen and we’ll be dividing it into sections. So we’ll be looking at the Home tab, for instance, for quite a few videos and then moving on to the other tabs I hope you enjoyed. And now let’s get started with part two.
- Introduction – let’s Get some more Data
Hello and welcome to part two of this course. In part one, we had a look through visualize data, fairly extensive look. So now you’ll be able to get some data, just load it in and visualize it in all sorts of different ways and you’re able to share it as well. Now we’ve only had a brief look at Get data, but there is a huge amount of hidden depth.
As you can see, we can use something called M formulas. So the Mashup language that allows us to transform the data, since sadly, not all the data that you receive is in the perfect format. Now we will still be skipping this ad modeling. We’ll have a look at that at part three of this course. But for now, we’ll be concentrating on the Get data and then the practice activities will be asking you to get data and then visualize it.
So we’ll be building on what we have previously gone through with all of the visualization, but we’ll be concentrating more on the Get data. But at the end of the practice activities, you will have an end result, which is a visualization of the data that we’ve just got. So the Get data, as we’ve previously seen, is here in the Home tab and you can see various common file formats. Now, if you’re using Excel, you’ve also got a Get data in the data section. Now this has been changing over the last few years. It used to be around here, but now it’s moved all the way to here.
And you can see fairly similar sort of types that we’ve got. I think they’re more extensive in the Power Bi, but still the main types are still there. So if your type isn’t here, then you can click on more and you can see the extensive number of file types that we’ve got access to in all of these different categories file Database, Azure, Online and so forth. So what we’re going to do is just load at the moment some data about some CDs that I own. So you can see here the data and this workbook is downloadable, it’s called the Power B data. And you can see there are quite a number of tabs in this book and there’ll
probably be more by the time you actually see this because I’ll probably be expanding this. And this is downloadable from the download section right near the beginning of this course. So you can see we have a number, we’ve got file name, we’ve got some hyperlinks and we’ve got some dates. Now the important thing about this data is it’s got some different data types.
So we have got integers, we have got text, we have got dates and that’s really why we’re using it. Just for this brief look at the Power Bi Get data, we’ll be loading lots of other different types of data into Power Bi. Again, this will be suited to what we are looking at so for now, let’s just get the data. We’ll go home, get Data, go to Excel, find the data that we want. Right, we’ve just added it to this folder. So Power Bi Data, click Open and you’ll see the computer goes, okay, I know there’s lots of different types of tabs. What is it you actually want? Well, I want the CDs, and you can see a preview of that. So that’s good. So I will check it. Tell ticket and I’ll click Lord.
Note that there is another option called Transform Data. So if I load it is literally just getting the data. And as you can see, it’s in a bit of a sorry state. If I want to edit this data, I can click on the dot dot next to it and go Edit Query. And that will get me into the Power Query Editor. So this is the Get data editor. But what I’m going to do instead is I’m going to delete this data and I’m going to add it again. And instead of just saying, Lord, I want instead to transform the data. And that gets me into the Power Query Editor. The Power Query Editor is where we’ll be doing most of our work in the second part of this course. Now, there’s quite a lot for us to explore in this, and we’ll have a look at the key points of this editor in the next video.
- Exploring the Power Query Editor interface
So let’s have a look around the Power Query Editor window. So on the left hand side we’ve got the queries. So these are all of the Get data that you have just loaded and it could also be a query, can be made up of another query. So maybe you have altered a query and maybe you have combined two queries together to form a third query. So all of your queries that you’ve got loaded are over here and you can select anyone by clicking on a different query. On the top we have got several menus, we have got the File menu. So this contains things like clause and apply, which you can also see here in the home menu. What this does is it takes you back to the Power Bi desktop. Now you’ll notice in the Power Bi desktop it says there are pending queries or changes in your queries that you haven’t applied yet. Notice that we haven’t got any queries whatsoever.
Started from a blank slate. So if I was just to go close, whether I go in the file or in the home menu, then it will close the Power Query Editor and take it back to the Power Bi desktop. But I don’t have my new query so I’ve got to click on apply changes and when that happens it then loads the information in. You can see the information not done terribly well. Column ten, column eleven. Don’t worry, we will be correcting that very shortly. So let’s go back into my query. So in addition to close and apply, we’ve also got other things in the whole menu. These are common things that you can do. So new queries, so you can get more information, you can refresh. So maybe your spreadsheet or maybe you’re getting something from the web, maybe that’s changed. You can manage the columns, reduce the number of rows that you see, in other words, filter in some way. You can sort, you can transform. So maybe you have a column that you’re splitting into several and you can merge and append queries. So combining multiple queries into one. Now we’ll be going through the vast majority of this in a little bit of detail with an appropriate data source. So this is just a quick overview of everything. So there may look a lot of stuff there and there is there’s also a lot that’s actually hidden away as well. The Transform and Add column menus do have a fair amount of similarity.
For instance, you can see statistics here and statistics here. What the difference is, is that transform takes a column or columns and changes it, whereas Add column may take an existing column and adds a separate column. Alternatively it may just add a new column. So for instance, maybe I wanted everything to be numbered starting with the number one and view. We have a few other things in here and again we’ll be going through the vast majority of this in this part. Now on the righthand side we have got the query settings.
So first of all we’ve got the name. So if you didn’t like the name of the query then this is why you can change it. But this is the most important point. This is called applied steps. What this is, is the computer’s instructions of what to do with your data. So it got the data, it did something called navigation, it then promoted the headers. In other words, it took what was in Raw one and promoted it up. And then it gave certain columns a different type of field type, whether it’s number, whether it’s text, for instance. So if you’re used to visualbed hip applications, then this is a sort of macro. It says do step one, step two, step three, step four. One of the good things about it is you can go back to an earlier step and see what happens.
So here is my existing source. So here are all of the spreadsheets and then I went to one specific spreadsheet and here you can see row one. We’ve got all of these, well, the catalog for instance, it then got promoted into the headers and the file type, the column type you can see it says ABC, one, two, three. We’ll get into what that means later. But then it changed it into ABC. So the good thing about all of this is because it’s a series of steps that the computer is doing, you can change those steps. So suppose I wanted to change which data source I was going from. Well, you can click on this little star next to the source and you can say, well, it wasn’t this spreadsheet I wanted, it was another one. However, there is a danger.
Let’s say we’ve got four steps, ABCD. Let’s say we took out step C. Is there a guarantee that Abd would work? Possibly not. So while you can delete steps, it is quite dangerous to do so unless you know what you’re doing. If you’re deleting the last step, well, a computer can do A-B-C no problem. It’s just when you’re missing a step equally if you’re changing what was in a previous step, then there’s the navigation. So which spreadsheet to go to? You might find that there are some unpredictable results coming up. So if I change this for instance to a different one, construction click, OK, eventually we get an error. The computer doesn’t know exactly what’s happening, so be careful when editing previous steps. But we’ll go into lots of detail as to why you should when you should.
In this part we’ll also be going into why it’s good that the computer is recording this as a series of steps. If you found that you have got a fairly complicated data source and you’re making all these changes to get it exactly how you want and then the data changes, well, that’s no problem. You just refresh your data and the computer will still go through all of the various data sources, all of the various steps that you’re doing. So the steps are independent from the actual data. You can change the data and still have all these functions happen to your data. Now, the last thing I want to talk about, or just a preview, is this bit, the formula bar. Now, you’re probably quite used to a formula bar in Excel, but you’ll see that these formulas are something that you have never used before and some of them can look really scary. In the next video, what we’re going to do is have an introduction to all of these functions. Find out when you do need to know about it, and find out when you don’t. And you might be pleased to know that for the vast majority of beginning to intermediate tasks, you can completely ignore this formula bar. It’s when you’re getting into the advanced things that you should get to ignore this.
So, in the next video, we’ll have a brief introduction to these formulas used in the Power Query Editor. Just before we do, I just want to say if for some reason you can’t see this query settings, maybe it’s no longer there. You’ve closed it on a previous occasion, then just to get it back, just go to View and Query Settings. It’s very important. And I’ll have this on screen all the time. Basically when I’m using Power Query Editor with these queries on the left hand side. If you don’t want to see them, just click on this left arrow, this less than sign and it’s collapses. So it doesn’t actually get removed, but just collapses. And if for some reason you can’t see the formula bar, I don’t know why you would ever want to get rid of it, but if you did, it’s also there in view as a checkbox. So in the next video we’ll have a look at these formulas.
- Introducing the M language
In the last video we had a look at the fact that there were formulas in the formula bars. But these are very different types of formulas than are available in Excel. You’ll notice that each of these formulas are in two parts. There is a word and then a dot, a full stop or period and then another word. And you’ll also notice that there are some words which are capitals. So what is all of this? Well, this is called the M language, m for Mashup. And what this is for is to translate each of your steps into different formulas. So you can see here we have a formula which has Excel workbook file content. And in there is the name of the workbook. So it might look a bit alien at first, but if you can see it’s going to do something to an Excel workbook and it’s going to do something to the file contents and there’s the file. Then it starts looking a bit less strange. So we got Excel file source table and table. In fact, the vast majority of the Mlanguage functions are this two word naming.
Now, if I just Google Power Bi M language, you can see for me the top item is from Microsoft. com. And if you go into there, you will see that there is an awful lot of detail get information over the 700 M functions. Well, I click that and here we have all of the different functions. So you can see all of these words are the first part of the function. So we’ve had Table dot before. These are all of the Table functions and you can see that there are an awful lot of them and it would take you ages to read all of this. Now you can read all of this by clicking on download PDF. So if I do that, you’ll see that it downloads an 800 page, very detailed document all about the M language.
And if you are going to be some sort of M guru, then you might want to have read this just once. However, there’s an awful lot of functions and for this course we don’t need to concentrate on even half of these functions. So these are the functions that we’ll be using as part of this course date time, date zone and time day. That just gives a list of the days from Friday to Thursday, duration, list number, table and text. Now within this and this is available as a PDF in the resources part of this course. Just go right back to the beginning of the course. You’ll see a lecture called Resources. It’s attached to this. I have listed all of the main functions. Now you’ll notice there’s a lot of repetition. For instance, date is in the current day is in current month is in current quarter it’s the next day is in previous day. So what I’ve done instead is I’ve given you the important ones. But then I’ve highlighted the really important ones. Now, if you are going forward to the certification, then my advice would be to memorize the ones with a yellow highlight. So here are all the day functions or date functions. Date time functions is only a couple that I really suggest you memorize.
From and to are very important functions, but they’re there for all of these different functions. So I’m not highlighting them in each page. And at the least for the certification, you should be acquainted. You should know that these functions exist date, time zone. So that’s just giving you the time now and so forth. These are the day values, durations, JSON and XML functions I’m putting in there because we will be towards the end of this part, opening a JSON and XML document. But you don’t really need to know about these, too detailed. Here are the list functions and we’ve got things like the Min, the max, the Sum, the count, non null count. Here are your number functions. A lot of these you can identify from Excel.
So mod, power, round, sign, ABS. For instance, the table functions, as you might expect with a table of data, are very numerous. But when you look at it, there’s only around 20 functions which I’m saying are really important. And a lot of those, like first den and last den can be paired together text functions. So if you think of the text functions in Excel, then these are they in Emma. Now, in case you’re wondering, M is based on another language. Now, you may have heard languages like Visual Basic, VB or C sharp. Well, this is based on the language called F Sharp, which is a fortran language which I think next to nobody uses. I may be a bit prejudiced about this, but C Sharp and Visual Basic are the main languages. So why they decided to invent a new language based on the language that very few people use, I don’t know. And then finally there are time functions.
Now, there is something to be said about M functions in that they are case sensitive. So you must have each word starting with a capital letter. So you can see text position of any you can see all the capitalizations. Thankfully the capitalizations are regular. So if you know the command is reverse rows, then you know that both of those I’ve got a capital R at the beginning. Now, another good thing about all of this with the M language is that everything that we’ve got in the home transform and add column will be translated into M. So that’s why I’m saying if you’re going to use this for basic or intermediate, you don’t really need to learn anything about M.
The computer will interpret it for you. However, there are some really good things later on if you do learn the M. Now, as we will be going through this course, I will be mentioning each of the m formulas, because as I say, I think it’s important. If you have even so much as a nodding acquaintance, that’s a good way to start learning. Now, I have created a Word document which contains all of those 860 odd pages in a lot more readable fashion. It doesn’t contain all of the examples, it doesn’t contain all of the detail, but what it does contain is a lot fewer pages, only 86 pages, according to this. But I still think that’s actually a fair amount to read quite a bit. So what I’ve done is I have created another manual which is called Power Bi. Manual reduced. And what this does is it looks at those important functions that I was looking at earlier.
So rather than wedding through them all, you’re now wedding through a select example. So we have got examples here, we’ve got the syntax, we’ve got in Words what it actually does, and it does this fairly swiftly. Now, if you want the full thing of anything, say you want a date dot week of month, then please go back to your original reference. So I just remember, I’ve just Googled Power BIM language, we’ll look at date, we’ll look at week of month, and that gives you more information about it. So what I’ve done there is just giving you a precise but if you want more information, then go onto the internet. That’s what it’s there for. But what you’ve got in these documents is enough to get to groups with the majority of the commands. And sometimes I’ve even put down the Excel equivalent.
So date week of year is the equivalent of week NUM. So in the next video, we’re going to actually start looking at the menus. We’re going to go roughly from left to right, starting from home, and then going to transform and add columns. And we’re going to tackle at each stage a problem. So we’re going to demonstrate what these functions are for and why you should use them and when you should use them. So, for instance, we’re going to start off with this problem. The column headings are nothing like what this should be.
- Let’s start look at the Home tab
In this video we’ll be looking at the first half of the Home tab, the first bit very quickly to be honest. So in the home tab, we have close and apply. So that closes the Power Query Editor and updates the Power Bi desktop with all of the new data we have apply which doesn’t close the Power Query Editor but does apply to the desktop. And we have clause that just goes straight to the Power Bi desktop new source. This allows you to get more data to add additional queries. Recent sources. These are the data sets that you have previously used and enter data.
If you just want to enter some data straightforward, just start typing. So here is my data and I want an extra column want my data to. So if you actually do want to create a new query source, we’ve just entered data very infrequent, you have to do this that’s how and then click OK and it appears as a new query over here on the left hand side. So I don’t want that. So what I’m going to do is I’m going over here to the query section and I’m going to delete. So that removes the query from my Power Query Editor. Now, there are two more options in the managed query that I’d like to talk about duplicate and Reference. Now, you remember I said we had all of these steps. Duplicate gives me an exact equivalent with all of these steps. So that means that now if I add a new step here, let’s just change this data type. No particular reason I want to add that as a new step.
It’s going to give me lots of errors. It doesn’t matter. Now, this first query has got five steps, this second query has only got four, so they are independent queries. However, Reference just says well, give me that previous queries results. So if I give you a reference of this, then you can see the applied steps is give me a source of CDs. So it just gives me a CD. So if I now add this additional step to my CDs query, then we have this additional step having already taken place in our new query. So all it is is a reference to the previous query as opposed to let’s duplicate it, let’s give you all of the steps and then you can change eva as you want.
So whichever you want to do, it’s perfectly fine by me. If you do want to have a second version of the same query. The question is if I make any changes to the first query, do I want it to apply to this second query? We’ve also got Refresh so that will refresh one query or refresh everything properties. That just gives me an opportunity to type in a description. There’s a couple of other tick boxes here including Report refresh just means it’s refreshed when you click on the Refresh button and enable load to report. Enable load to report means that some of the data will come into here.
That’s all. That means nothing major. Advanced editor that gives all of this M code. We’re not going to touch that, if at all. I think we will later on when we talk about custom formulas, but let’s stay clear of that. To be honest. We’ll have a look later at parameters and the data source settings. This is more useful when you’re looking at SQL Server where you have to put in a password. Not so much of a problem with Excel. So that is the first half of the home toolbar going from clause to query. Next video we’ll have a look at Manage columns, reduce, Ros and Salt.
- Home menu – Manage Columns
In a sense the Manage Columns and Reduce Rows does one and the same thing. It allows you to reduce the number of columns that you’ve got or reduce the number of rows that you’ve got. So choosing columns, there are two options within this. Choose columns and go to column. Got. A column just highlights a particular column. It doesn’t apply an additional step. It’s just like me clicking on a column at the top. Now, if you want to highlight more than one column, you can do that by clicking on a column and then going to another column and press holding down the shift button on your keyboard and it will highlight all of that as a range. So I’ve highlighted column two, column three and column four by clicking on column two, holding down Shift and clicking on column four.
Alternatively, you can highlight disparate columns. In other words, not just a range but this column and this column and this column by clicking on one column and then holding down Control and then clicking on any other columns. So none of these column selections that we’ve just had a look at actually are a step. It’s only when you do something to those columns that they become steps. Now Choose Columns is saying what columns do you want to keep? So let’s say I wanted to keep only the first two columns. Well I can select all or unselect all of the columns and then select the first two columns. Click OK. And that’s all I get, these first two columns. Now you’ll notice that the formula at the top is Table Select Columns. Now this is in two parts.
The first part references the previous step. So the previous step was called Change type One. So it says from Change type one give me and then it gives me a list and this list is in curly brackets. So from that you can see we are in the table sort of functions. We are selecting columns, we’re selecting this table and we’re saying we just want these columns. Now incidentally, if you don’t want any step to be called what you are calling it, that’s fine, no problem. Just right and click on it and go to Rename. So I’m going to call this Create errors because that’s all that step did. And now you’ll see that this function formula is calling Table select Columns, Create Errors. This little symbol which is called the hash or the pound sign is used to say what follows is an identifier. So something which is called create errors. Give me this range of columns. Now, just as there is something called Table Select Columns, there is also something called Table Remove Columns. So if I’m going to remove columns, we’ve got no dialog box here.
I can either remove the column or columns that I’ve highlighted. So that’s removed the first column and you can see the formula Table Remove Columns. And I want that particular column removed, it’s still a list, even though it’s a list of one, so it’s still in curly brackets. Or you can say, I want every single column to be removed apart from this particular one. So remove other columns. And so we get exactly what we had a few minutes ago, table select columns. Even though we’ve clicked on remove other columns. So you can see in this interface there are many ways to do exactly the same thing.
Now, while we’re talking about columns, we can also rearrange the columns. So let’s say I want a column four to become first. Well, all I do is highlight column four and just drag it all the way to the left. And you notice that becomes another step. And it’s called table. The formula is called table reorder columns. So this is quite useful because let’s say I wanted to reorder a lot of columns. Well, I don’t have to go around dragging each all over the place. I can just change the formula. So I want column 13 to be after column four. No problem, I’ll just change the formula. It’s done equally. If I was now dragging column two at the very beginning, you’ll notice that it has not created an additional step. What it’s done is changed the formula for the previous step.
So don’t think, well, I’m going to move one column to the left and then I’m going to use a different column to the middle and another column to the right. Then that’s going to create three steps. It’s not the computer is intelligent enough to go, ah, let’s add this to the previous step. So in this video, we had a look at how to highlight columns, either by just clicking on them or by choosing Go to the columns. Incidentally, Go to Column is also available in the view section if you want to use it there. We also removed any columns that we didn’t want. We could either choose the columns we wanted to keep, or we could highlight the columns and say, I want to remove these columns or remove other columns. And we also saw how to reorder the columns as well. And the vast majority of what we’ve done didn’t require one single typing of M chord. No formulas whatsoever. It created them. And again, how much you want to use them is down to you. I’ll keep mentioning them, but when we’re just using the interface, you can completely forget about them if you so wish.
- Home menu – Reduce Rows and Use First Row as Headers
So in this video we’ll have a look at how to actually correct the problem that we’ve got here in that the wrong columns have been promoted to the headers. So you’ll notice that we have the Promoted headers all the way up here and it’s promoted the very top row and quite often that makes sense and that’s what the computer did automatically. But in this particular case the computer got it wrong because we just had this funny sort of header at the beginning and then it was the table. So what we’re going to do is go back a few steps. So we’re going to delete this creating error step that I put on and this change type and this Promoted headers.
So now we’re back to the original data. Now notice of in the transform section we’ve got use the first row as headers and there’s also the vice versa, the opposite use headers as the first row in case the computer makes a mistake. But how do we get row four to be the first row? Well what we can do is delete the first three rows and we do that by reducing rows. Now we have got keeping rows and we’ve got removing rows. So keeping rows is saying I want to keep only the so really it’s missing the only bit. I want to keep only the top six rows. So you can see the formula there table first N, name of step, comma Six. And if you’re wondering why it’s called CDs Sheet or the navigation has got CDs and sheet data. Now if I want to adjust this and say no I want eight, then you can click on the settings and say no going to that, I want eight rows. Or maybe you want to keep the bottom most rows fairly unlikely to be honest.
Or you want to keep a range of roles say from rows three to five. So we’re getting closer, we want to keep from row four downwards but that didn’t quite work and in fact when I said three to five I meant row three four five rows. If you notice the formula table range it actually says comma two instead of comma three. That’s because M uses a zero counting system. So if I was to count numbers in zero it would be 01234 so the third row is zero one two. You can also keep duplicates adjust all of those rows which have duplicates but duplicates in the highlighted columns. So for instance if I wanted to have all of those which had duplicates in ten Romantic works and nothing else well let’s say I wanted to keep the first 20 rows and then I just want all of those which have a duplication in this column, then I can keep the duplicates. You can see it’s kept the knolls at the top and it’s kept the ten romantic works as well. But quite often you won’t be keeping rods, you’ll be removing rods and the remove rolls happens exactly the same way. We’ve also got an additional option which is remove blank rows. And again, this is based on the entirety of the raw. It just if it’s completely blank, it gets removed. So what do we need in this particular option? Well, we need to get rid of the first three rows and we do that using remove rows, remove top three rows. But we can also remove the bottom rows.
We can move every alternate rows or maybe something else. So we can say, actually I want to remove you to go to the fourth row, I want you to remove two and I want you to retain three, for instance, and you can see 18, 1920, remove 221 22. So you can see that that’s worked very rare that you’ll be using those remove errors you might be using more often. So if I change this so that they’re all numbers, that creates a lot of errors because there was a lot of text there which can’t be numbers, I can then remove all of the errors so everything that contained an error in that particular column. So in all of this we’re using things like M formulas like table alternate rows, table distinct, table remove rows with errors. So let’s go back to our specific example. I want to remove the top three rows. So I’m going to click on remove rows, remove top rows and remove top three rows.
So that uses table skip, not table remove rows as you might expect, but table skip. So we are skipping the top three rows. Now you’ll notice that in the formula there is a more advanced version where you can skip rows that say remove rows where there’s a certain condition, but that’s a bit outside of this particular course. Have a read of that if you’re interested in the documentation. So now I have got the first three rows skipped. I am now going to promote this now top row to be my header. So use first rows headers. And now finally we have changed this so that the information makes sense. The computer has also done something else called change type. We’ll have a look at that in the transform section, but we now have the column headings we want. We no longer have this column two, column three, column four. So let’s close and apply. We go to the fields and you can see the fields change from column ten, column eleven, column twelve and so forth.
And what that change type has also done is it said some of these are date fields. So we’ve now got date hierarchies automatically added into all of this. So when you get your data, the computer will quite probably have promoted headers. It will have put the top row of your data into headers. If you’re taking this, say from an Excel table, it may have got it wrong. You may have to remove certain rows first before you can then promote the headers. So this is a real reason why it’s very useful to have this as a macro, so that if you had new data coming in in the same format, you can just replace the new data, the old data, with the new data, and it would run through this routine. It would take off the top three rows, it would promote it as headers, and you will have the latest data immediately at your fingertips. So this is reducing rows and using the first row as headers.
- Practice Activity Number 8 – The Solution
Hello, how is this practice activity for you? So we just started part two and we’re getting data and transforming it. So we’re not just going to say this is the data and it’s got to be perfect, we can do some alterations. So first of all I asked you to get some data. So you can either get data in Excel or you could probably go to recent sources and go to the Power Bi data. So then you need to select the spreadsheet, Pvt, HP, Admins, so that’s this one here. So now I’m going to transform the data.
So this opens up the Power query window. Now the next step says oops, it looks like the wrong headers are in the header roll and you can see we’ve got column three, column four, column five, and if you have a look at the applied steps, it’s gone into the original source. So the source being the Excel workbook. And you can see all of these tables and we’ve chosen which particular table we wanted. So having done that, it has then promoted headers, but it’s promoted the wrong headers because it didn’t know that we have got this sort of empty line at the top. So we need to get rid of the change type and the Promoted headers. It’s important that we don’t just get rid of say Promoted headers because as you can see, it could cause the query to break if one step relies on another step. So let’s cancel that and go from change type delete, promoted headings headers Delete.
So what we need to do is get rid of the very top row. So I will go to Home reduce rows, remove top rows, so I’m removing the top one row. So now we have the correct heading row as row one, but we don’t want it as row one, we want it in the headers which currently say column one, column two, column three and so forth. So now we go to Home Transform, use first row as headers. So now we have date. Greater Manchester, Merseyside, South Yorkshire and so forth. Now you can see that the computer has added an extra step that I didn’t want, I just wanted it to promote headers. But this is the computer trying to be kind to you, but I don’t want it to be kind. And you can see that there’s a problem in the change type. There’s this little red underline and that indicates that there is an error somewhere. So I’m going to get rid of this change type and we’ll be looking at changing types in the next section. So next we don’t need any blank columns or the grand total column.
So let’s go across and you can see we’ve got a column nine, column ten and grand total column. So what I’m going to do is remove columns. So either I could choose columns, in which case I say I want to keep these columns and you can see in the formula it has specified exactly which columns it wants to keep. Or I could say, I want to remove specific columns. Now, you can see I’ve got to select the columns first. It just removed the Date column because that’s where I was highlighting. So I could just change this up here in the formula bar to say I want to remove the grand total column, for instance. And then I could say I want to remove column nine and column ten.
Alternatively, what you could do is highlight all of those columns, click on the first one, hold down Shift, click on the last one, or click on the first one, hold down CTRL and select all of the columns you want and go remove columns. Alternatively, you notice that it’s combined them all in one formula. You could just remove one column and then remove another column and another column, and it will still combine them all into the one formula. So whichever works for you. Next, I’ve asked you, could you move the West Yorkshire column so that it is the second column? So I want it right in between date and Greater Manchester. That’s no problem. I’ll just click on it and I’ll just drag it and go to wherever I want. So I want it just there and then finally, I asked you to create a visualization based on this data. Well, I’m going to click Close and apply this will exit Power Query Editor and go into the Power Bi desktop. So you can see there are pending changes in your model.
So now we have got all of this, I’m going to create a suggested a stacked column chart. So there it is. Let’s expand it. So we’ve had a look at Creating Global Visualizations in part one of this course, and I’m not going to go back over all of that again. But by asking you to do visualizations as part of this practice activity, it keeps it up in your mind as to what you need to do. So you will keep creating visualizations, which at the end of the day is sort of what Power Bi is all about. So I’ve put the date in the axis and I’m going to put Greater Manchester, Merseyside, South Yorkshire, Tang and Weir, West Midlands and West Yorkshire in the values you notice each of them has a value of one and when I click on it, I can’t change it to a sum yet. Also, you might want to note that West Yorkshire, even though we promoted it in between date and Greater Manchester, hasn’t stayed there.
So what could we do with West Yorkshire and West Midland so that they’re actually proper numbers that we can interpret as numbers? Well, I could have click on each one of them and go into column tools and say that these are numbers. However, I actually prefer to do this at the Get and transform stage because we’ll probably be doing a lot more transforming in the get and transform stage and then it’s all complete. Also notice that the dates look a bit odd 1 January 1995, 1 January 1996 and so on. So we’ll be looking into how we can remedy this, how we can get this to actually work properly in the next practice activity. So please save this query and reopen it when we get to the next practice activity.
But for now, what we have done, we have gotten transformed a spreadsheet, we’ve got the right draw in headers, we’ve deleted steps when we needed to, we deleted columns and we’ve moved a column from one part of the Power query window to another. So we’ll be developing our skills so that any problems that we have with the data, such as what you’re seeing on screen, can be remedied before it gets to the visualization. Well, I hope you enjoyed this practice activity and I’ll see you in the next video.