DA-100 Microsoft Power BI – Part 3 Level 6 – Filter and Value Functions
- RELATED
Now, this section is all about the photo functions and these are the ones which have absolutely no comparison with Excel functions. Now, these first two sort of do because we previously had a look at vertical lookup tables or something that looked like a vertical lockup table, which was the lockup value function. But you may remember it’s a big, it complicated. The related function which we are looking in this particular video is a lot simpler. And the reason for that is it’s using the relationships that we’ve already got built in. And in this relationship function we are going against the arrows. So we are going from the many to the one. So we’re flowing the other way to the way that the arrow suggests. So remember in an earlier video, right near the start of this part, I said that there was a problem when you floor against the arrow. Well, this is how to get round this problem.
So for instance, this Fact Internet sales table is part of the many where the dim product is the one. So what we can do using the related function is to look up something going upstream. So in the fact Internet sales we can look something up in the dim product, in the dim product subcategory and the dim product category. So let’s see how we can do that. So, I’m going to add a new column which is called Product Name. Now, what would be great is if I could just put in well, it’s in Dim product and it’s the English description. So it would be great if I could just say give me the English description and that’s it. But I can’t quite do that. Instead all I’ve got to do is surround it with the word related is if I could just type dim product, English product name. But I can’t because if I try to do that, there’s no autocompletion. That is sensible.
So what I need to do is surround it with the name, relate with the function relate. So now I can put in English Product Name in the brackets and there I get my English product name. So we got mountain tires, tubes, and then further down we’ve got bikes and so forth. So let’s have another new column. This is going to be for the subcategory and that is equal to related. And I’ll just put in subcategory and there we have the English product subcategory name and then another new column and this is going to be the category. And again, just putting word related and I’ll just start typing in the word category and there we go. Isn’t that easy to do? And yet how powerful this could be if you wanted everything to be in one particular table.
Now of course, when we’re doing visualizations, it’s not necessarily that important because we could get something from one table and drag something out from another table and there’s no problem with that. However, suppose you wanted to hide tables. Suppose you wanted everything just to be in one table. Well, that is possible. If you bring them in now, you might lose a bit of flexibility when you do that, because you can no longer use the other things in the tables, like for instance, the French category name or the Spanish category name. But it could be useful, and it could be really useful when debugging a model or if you want to concatenate two items from separate tables.
So suppose I’m going to have a Nova column, which is category subcategory and that is equal to related category, then an ampersand and a space and the related subcategory. So, again, very easy to do. But now you might be able to see how powerful this could be. So related it allows you in the many table to go upstream against the arrows and pull something in from other tables.
- RELATEDTABLE and COUNTROWS
So that’s all very well going from the many to the one, but what if I wanted to go from the one to the many? For example, I have these categories, what if I want to know for each category what the total number of sales or the total standard cost is inside the model? Now, yes, I can do that very easily, easily by going to my visualization and going okay, give me the category name and give me the total standard cost and it’s this one and that will give me the answer. But suppose I wanted it in one particular table, for instance, I might want to hide this fact Internet sales and just give you standard fields based on what I think the model should be. So we can do that by using a variant of related.
So if I was to go into the category and add a new column and this column is let’s call it standard cost if I type in the word related, you can see that there aren’t any fields that the computer gives me and the reason for that is because we are right at the end of the chain. There is no further many to one that we can go up. Instead of using Related, however, we can use Related table. So what Related table does is it returns a filtered table. So I want to filter from fact Internet sales. So close that and there we go, that is our expression. But unfortunately this gives us an error. This expression refers to multiple columns. So what’s happening here?
Well, Related gives us a single value going up the chain from infinity to one, from many to one. So it goes from fact Internet sales and gives us a category, for instance but we’re going down instead of it giving us a single value, it is, as the name of the function suggests, give us a table. So it’s not just one column, it’s an entire table full of rows, full of columns. Now, can you remember, let’s say I wanted to aggregate all of the product sales costs.
Can you remember what the function is for aggregating a table? So it’s not sum, it is sum x. Sum x returns a sum of an expression evaluated for each row in the table. Well, we’ve got the table, it’s this related table and what do we want to bring from it? Well, let’s say we wanted the product standard cost. So there we have product standard cost. So now let’s press enter and see what we get. And here we have got our standard cost. So you can see we have 16 million for bikes, 203,000 for clothing, 262,000 for accessories notice that components is blank and notice that components is not there on this visualization.
The reason for that is because there aren’t any items which have successfully gone down the chain. What do I mean by this? I mean that there are no components in fact Internet sales. We may have the product but we actually haven’t sold any. And we can show this quite easily if we create another visualization. And we put in the product key. So you can see there are 60,398 of them. We’ll put in product key again, but this time do not summarize.
So we still have the same total 60,398. And we put in the category that we’d worked out previously using the related function in the previous video. And if I now sort by the category and sort ascending, you can see that there are no components, we haven’t sold any components. This count of the product key remains the same. So this is the totality of the data we’ve got. So we just haven’t sold any. Now we’ve got a SUMX. Maybe we wanted to do count X.
So Countx would count all of the rows in which the product standard key evaluates to a number or something that can equate to a number. So if I just check that you can see we got 15,000, 9030, 6000 rows. If I use a calculated atom together we get to our 60,398 items. Now suppose you wanted just to count the number of rows regardless of what the contents were in a particular key. Well then we would use the function Count Rows. So Count Rows just requires a table but it doesn’t require field because we’re just seeing if there is a particular role there regardless of what kind of information it is. And you can see that gives us the same answer in this particular example. So Related Table, wherever you see in this spreadsheet table name, don’t forget related Table can give you a table so you can use whatever you get from Related Table as a table table. Now notice what the description says. It evaluates a table expression in a context modified by the given photos. Very important last point which we’ll have a look.
- Context
In the last video we saw that the related table function evaluates a table expression in a context modified by the given photos. So what is context and why does it matter? Well, let’s go back to our previous example where we were using SUMX. So we got sum X of the related table Fact Internet sales, and we were looking at the product standard cost and we’re just going to change this into US dollars just to simplify the numbers. So here we have a number of 16 million. What is that number and what is the difference between this number and that formula and this formula where we just have some x of Fact Internet sales and the product standard cost.
What’s the difference between the other formula which we used related table and this formula? So here are the two formulas on the screen. Why don’t you pause the video and think what will the difference be? Well, the difference is that this second formula takes the entirety of the fact Internet sales table and sums the product standard cost. So that gives us for every single raw 17,277,000, whereas this formula takes the product category key. In this case, category one goes down this chain all the way to Fact Internet sales and then says, okay, for just those categories which were category number one, what is the total cost? But just for those items where the product category key is equal to one. Now that’s the sort of filtering. And this filtering is what we call in power bi, the context.
So here we’ve got the standard cost where the standard cost is for bikes, whereas here we’ve got it for the entirety. Now, this could be quite useful to have these two different formulas because what if we have a third formula, this is the percentage where the percentage is equal to the standard cost divided by this new column that we’ve just calculated. So format it as percentage. So this formula, which is the equivalent of this function, these two functions, one divided by each other, allows me to say, well, bikes are 97. 31% cost of the totality of the table, whereas clothing is 1. 8% and accessories 1. 52%.
Now, we have already been looking at context in our visualizations. If we go back to this visualization, you can see that we have a product standard cost of 16 million, but only where the product category name is bikes. Same for accessories, same for clothing. Now, if I change this visualization to a matrix and then add the order date, then we can see that the context changes again here. This five and a half million is for bikes with an order date in the year 2007. So hopefully you can see from this what A, context is all about and B, why it’s important. It’s the difference between having a figure of 16,000,001 being 17 million. It’s the difference between going through all the rows of a table and going through just relevant rows. So in Excel terms, a context is a bit like a filter.
- ALL
Two videos ago, we created this percent column. So this is in the category table and it’s added to every single row. So if we go to the visualization that we were looking at last time in the context and we add in the percent here, we can see the percent need to summit and there we are, 97% for bikes, one and a half percent for accessories, exactly as we’re expecting. So now, as this is a matrix, let’s add on to the columns the order date. So we break down this 97% into the various years, except it doesn’t, you can see that it’s 97% allegedly every single year, including years when there were no other things being ordered. So what’s going on? Well, the problem is that this is being added to each and every row.
This isn’t being added as a natural measure, it’s being added as a finished calculation. If we want it to be dynamic, if you want it not just to be in the context of this product category key, but any other context, like for instance, here, 2006, we’ve got to go from it being a column to it being a measure. So what do we want to do as a measure? Well, we want a measure that does this calculation. So we’ve got this sum of the related table and we want to divide it by the entirety. So let’s copy this. So I’m not recreating this new column from scratch.
We will delete all of these existing columns and we will add a new measure. So this measure will like the columns that were before it calculate the percentage. So here we have the percent of total and that is equal to the sum of x or sum x of the related table, divided by the sum of the entirety of the related table. Now, because it’s a measure, it doesn’t appear in each and every row, it’s something that is calculated on demand. So now when we go back to our visualization and put in percentage of total and I’ll just change it before we do, so that it’s percentage, the computer gives us 100%. Why is that? Well, the reason for that is because the context each time is in this case bikes. So we’re looking at the total of all of the product standard costs and dividing it by the sum x of fact internet sales, product standard cost for this particular item. So these two are in fact identical in this particular context. So is there a way in which we can say, hang on, I want you to forget context, I want you to take the entirety of the table regardless of the current context. And what there is is the word all.
So what all does is it removes context from the current equation. So you can see ignoring any filters that might have been applied. So we’ve got related table and we’ve got all. So let’s see what happens now. And now you can see that it works. We have 97% 1. 5% 1. 52. So it’s taking the related table, fact Internet sales and dividing by the entirety of fact Internet sales. So if we now go back and add the order date into the columns again, it all works. It’s taking the Internet sales in the context and dividing it by the totality. So does that mean that because we’re using it as a measure which has a context, do we actually need related table here? And in this particular case, no, we don’t, because the computer will take some acts based on the current context.
So let’s see what happens when we do this. And you can see exactly the same result. So we needed related table when we were going up and down the chain, and specifically down the chain when we were doing a column, because the column wasn’t taking advantage of the fact that it has a context, whereas here, in a measure, it always does have a context. And so if you want the context to be ignored, then you should use all. And here we can see all in the function spreadsheet returns all the rows in a table, ignoring any photos that might have been applied. This function is useful for clearing photos and creating calculations on all of the rows in the table, which is exactly what we’ve done. So all removes any existing context which is applied using other formulas.