Since the incremental refresh is already available with a Pro license, it was time for me to deal with this topic.
In the Power BI community, I found out that the known documentation describes how to set up an incremental refresh, but the examples are always based on a single query. In practice, however, combined queries from several subqueries are quite common.
At this point it should be noted that the incremental refresh is a function that you set up in Power Query and Power BI Desktop, but which is only used and controlled by data loading in the Power BI Service.
But first let’s look at how we can take advantage of the incremental refresh.
By using the two filter parameters RangeStart and RangeEnd, we can control which period of data should be loaded from the source. This reduces the loading time and ensures a low data volume that has to be transported to the Power BI Service.
On the other hand, we can determine which period of data should be kept in the analysis memory.
As we can see, there are two features that we need to consider when setting up incremental refresh.
- First, we have to use the RangeStart and RangeEnd filter parameters in the subqueries as early as possible in order to limit the queried data in the source and thus save loading time.
- Second, we have to use the filter parameters RangeStart and RangeEnd in the combined query to identify the data stock in the analysis memory that is to be retained.
I would like to illustrate it with a small example. In Business Central, the document information is stored in header and line tables, which we combine in a query for analysis.
In this example, we assume that we want to keep (save) the records with a posting date in the last 2 years and refresh records with a posting date in the last 7 days.
The following diagram shows how the Power Query Engine loads the invoice header and line data of the last 7 days from Business Central and merges them into one invoice. Then the query result, not the underlying basic data of the subqueries, is saved in the analysis memory by the Power BI Service. If data from a previous data load already exists for this period, it will be deleted. If data exceeds the retention period of 2 years, it will also be deleted. To identify the corresponding data records, the filter parameters RangeStart and RangeEnd must also be set in the query query result.
In Power Query, the sales invoice header and line subqueries have been disabled for loading to the report, and only the combined sales invoice query is loaded to the report.
Here, we first filter the subqueries sales invoice header and row, in order to retrieve only the required period from Business Central. To do this, we set a Date Filters > “Between…” in the PostingDate column.
If the column is of the “Date/Time” data type, we can use the RangeStart and RangeEnd parameters as filter values in the dialog. In this case, since the PostingDate column is of the “Date” data type, we first have to store fixed values.
We then replace the fixed values in the formula bar.
For this we convert the parameters RangeStart and RangeEnd with the function Date.From from Date/Time to Date.
You should not change the data type of the column to ensure that the data is filtered by the source and not by the Power Query Engine.
Then we filter the combined sales invoice query so that the analysis memory can identify which data records in the memory should be retained or replaced.
Now we can set up the incremental refresh in Power BI Desktop and publish the report to Power BI Service.
The “Incremental refresh” dialog is only displayed for the query (table) if the final query uses the RangeStart and RangeEnd parameters.
From now on, the Power BI Service takes over control via the RangeStart and RangeEnd parameters and automatically sets them for the first full load and the subsequent incremental refreshes.
IMPORTANT: Reports with an incremental refresh cannot currently be downloaded from the Power BI Service.