Incremental refresh with subqueries

Since the incremental refresh is already available with a Pro license, it was time for me to deal with this topic.

In the Power BI community, I found out that the known documentation describes how to set up an incremental refresh, but the examples are always based on a single query. In practice, however, combined queries from several subqueries are quite common.

At this point it should be noted that the incremental refresh is a function that you set up in Power Query and Power BI Desktop, but which is only used and controlled by data loading in the Power BI Service.

But first let’s look at how we can take advantage of the incremental refresh.

By using the two filter parameters RangeStart and RangeEnd, we can control which period of data should be loaded from the source. This reduces the loading time and ensures a low data volume that has to be transported to the Power BI Service.
On the other hand, we can determine which period of data should be kept in the analysis memory.

Parameters RangeStart and RangeEnd

As we can see, there are two features that we need to consider when setting up incremental refresh.

  • First, we have to use the RangeStart and RangeEnd filter parameters in the subqueries as early as possible in order to limit the queried data in the source and thus save loading time.
  • Second, we have to use the filter parameters RangeStart and RangeEnd in the combined query to identify the data stock in the analysis memory that is to be retained.

I would like to illustrate it with a small example. In Business Central, the document information is stored in header and line tables, which we combine in a query for analysis.

In this example, we assume that we want to keep (save) the records with a posting date in the last 2 years and refresh records with a posting date in the last 7 days.

Incremental refresh Dialog

The following diagram shows how the Power Query Engine loads the invoice header and line data of the last 7 days from Business Central and merges them into one invoice. Then the query result, not the underlying basic data of the subqueries, is saved in the analysis memory by the Power BI Service. If data from a previous data load already exists for this period, it will be deleted. If data exceeds the retention period of 2 years, it will also be deleted. To identify the corresponding data records, the filter parameters RangeStart and RangeEnd must also be set in the query query result.

Diagram Incremental refresh

In Power Query, the sales invoice header and line subqueries have been disabled for loading to the report, and only the combined sales invoice query is loaded to the report.

Query Dependencies

Here, we first filter the subqueries sales invoice header and row, in order to retrieve only the required period from Business Central. To do this, we set a Date Filters > “Between…” in the PostingDate column.

Date Filters Between for Subquery

If the column is of the “Date/Time” data type, we can use the RangeStart and RangeEnd parameters as filter values in the dialog. In this case, since the PostingDate column is of the “Date” data type, we first have to store fixed values.

Date Filter dialog filter parameters

We then replace the fixed values in the formula bar.
For this we convert the parameters RangeStart and RangeEnd with the function Date.From from Date/Time to Date.
You should not change the data type of the column to ensure that the data is filtered by the source and not by the Power Query Engine.

RangeStart and RangeEnd in the formula bar

Then we filter the combined sales invoice query so that the analysis memory can identify which data records in the memory should be retained or replaced.

combined query filter for incremental refresh

Now we can set up the incremental refresh in Power BI Desktop and publish the report to Power BI Service.
The “Incremental refresh” dialog is only displayed for the query (table) if the final query uses the RangeStart and RangeEnd parameters.

set up incremental refresh

From now on, the Power BI Service takes over control via the RangeStart and RangeEnd parameters and automatically sets them for the first full load and the subsequent incremental refreshes.

IMPORTANT: Reports with an incremental refresh cannot currently be downloaded from the Power BI Service.

Sources:

Veröffentlicht von

Marcus Wegener

Marcus Wegener

Marcus Wegener ist Anwendungsentwickler für Business Intelligence und erstellt Lösungen, mit denen sich große Datenmengen schnell analysieren lassen. Kunden nutzen seine Lösungen, um die Vergangenheit zu analysieren, die Gegenwart zu steuern und die Zukunft zu planen, um damit mehr Erfolg zu generieren. Dabei ist seine einzigartige Kombination aus Wissen und Auffassungsgabe ein Garant für ihren Erfolg.

4 Gedanken zu „Incremental refresh with subqueries“

    1. Hi Frank,
      the goal is to filter the data in the source as efficiently as possible.
      Without this filter, all data records are loaded from the source and later filtered in Power Query.

      without Filter

      The filter only loads relevant data records from the source.

      with Filter

      This saves loading and processing time.
      Best regards
      Marcus

Schreibe einen Kommentar

Deine E-Mail-Adresse wird nicht veröffentlicht. Erforderliche Felder sind mit * markiert

I accept that my given data and my IP address is sent to a server in the USA only for the purpose of spam prevention through the Akismet program.More information on Akismet and GDPR.