Incremental data loads in Microsoft Power Query for Excel

Technical Value

Power Query is a great and flexible tool for getting and transforming data from different sources into Excel or Power Pivot. The standard procedure for Power Query is to read a full table and to replace the destination with the result of the current query. This is exactly what we need for most use cases of Power Query. However, if you want to add new data to existing data, you can still use Power Query, but you have to follow a slightly different approach.

For example, let’s say we want to create a list of blog posts while the RSS feed only delivers the most recent posts. In this case we would need to add the query results to the existing data.

Let’s start with my OLAP blog. The RSS feed delivers only the last 25 entries of my blog. In order to load this with Power Query, I’m using the “From Web” function:


The dialog asks for the URL (I’m using my blog’s mirror here at http://ms-olap.blogspot.com/feeds/posts/default)


If Power Query does not automatically detect this as an XML table, you can insert the Xml.Tables(…) function as shown below:
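Entered manually, that first step simply wraps the web request in Xml.Tables (using the feed URL from above):

```
let
    Source = Xml.Tables(Web.Contents("http://ms-olap.blogspot.com/feeds/posts/default"))
in
    Source
```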


The items are stored in the last column (the “entry” table), which can be expanded by clicking the expand button right beside “entry”. For my example I’m only interested in the publication date and the title.


By expanding this table we get one row per blog post. In order to get the title, I’m expanding the last column (in my case title.1, since there is already a column title for the title of the blog) to its text value:


Finally, I changed the type of the first column to DateTime and I renamed the columns as shown below:


The full script generated by Power Query so far looks like this:

let
    Source = Xml.Tables(Web.Contents("http://ms-olap.blogspot.com/feeds/posts/default")),
    #"Expand entry" = Table.ExpandTableColumn(Source, "entry", {"published", "title"}, {"published", "title.1"}),
    #"Expand title.1" = Table.ExpandTableColumn(#"Expand entry", "title.1", {"Element:Text"}, {"Element:Text"}),
    #"Removed Columns" = Table.RemoveColumns(#"Expand title.1",{"id", "updated", "category", "title", "subtitle", "link", "author", "generator", "http://a9.com/-/spec/opensearchrss/1.0/"}),
    #"Split Column by Position" = Table.SplitColumn(#"Removed Columns","published",Splitter.SplitTextByPositions({0, 19}, false),{"published.1", "published.2"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Split Column by Position",{{"published.1", type datetime}, {"published.2", type text}}),
    #"Removed Columns1" = Table.RemoveColumns(#"Changed Type",{"published.2"}),
    #"Renamed Columns" = Table.RenameColumns(#"Removed Columns1",{{"published.1", "Date"}, {"Element:Text", "Title"}})
in
    #"Renamed Columns"

We can now load this result to the workbook by clicking the “Close & Load” button. Now, here comes the interesting part. Remember that the RSS feed only delivers the last 25 posts, so whenever there is a new post, the oldest post would drop out of the table. Let’s say we’d like to keep adding new posts to the table instead.

To do so, I’m creating a new query named “Existing data” from the data we just loaded into the workbook:
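The script for this “Existing data” query needs only a single step that reads the loaded Excel table back in (here the table created by the first query is named “BlogPosts”):

```
let
    Source = Excel.CurrentWorkbook(){[Name="BlogPosts"]}[Content]
in
    Source
```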


Using the “Close & Load To …” function in the ribbon, we can choose to only create a connection (no data is actually moved here):


The workbook queries pane now looks like this:


We can now edit the first query (“BlogPosts”) again. In the home tab of the Power Query ribbon you can find the options for combining queries.


In my case, I simply decide to append the new data to the existing data. If you have a key to match rows, you could also do a merge based on the key column (for example, if there are updated rows).
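As a rough sketch of that merge variant (hypothetical here, assuming the “Date” column serves as the key), the append step could be replaced by a full outer join:

```
// Hypothetical alternative to the append: match new and existing rows on the assumed key "Date"
Merged = Table.NestedJoin(#"Renamed Columns", {"Date"}, #"Existing data", {"Date"}, "Existing", JoinKind.FullOuter)
```

You would then expand the joined “Existing” column and decide per row whether to keep the existing or the updated values.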


The result is that we now have 25 rows from the existing table plus 25 rows recently read from the blog. At this point in time the entries are identical, so we have 25 duplicates. I remove the duplicated rows here using the “Remove Duplicates” function.
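In the script, the append and the duplicate removal boil down to two steps (with “Date” as the column used to detect duplicates):

```
Append = Table.Combine({#"Renamed Columns", #"Existing data"}),
#"Removed Duplicates" = Table.Distinct(Append, {"Date"})
```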


We can now save & load the query again. In order to demonstrate the effect, I’m pasting the remaining blog posts (which cannot be pulled by the RSS feed) into the Excel table:



While full loads are usually the most useful approach for loading data with Power Query, you can also build incremental load procedures by merging or appending the new data to the existing data.


Chang Liu
Tue, 01.05.2018 - 23:30

The data pointer here is wrong and needs to be addressed as such because the pointer points to the data that is being updated and will update itself. When doing the comparison with the new query, the two items will evaluate to be equal after it reaches the maximum number of entries of the first dataset.

You cannot append to an existing query in an existing table that has a query on it in the same exact space.

To get around all this and do the incremental load, you need to point the Connection query to the Output query so it compares the Output query to the Input query and adds the additional values.

Hope this helps those who couldn't get the exact results as above.

Hilmar Buchta
Wed, 02.05.2018 - 18:50

Hi Chang, first let me say that the example above worked exactly as I described in the post (I took all screenshots from the working example).
Maybe there is a confusion about the query for the existing data. This query has to be created based on the Excel-sheet (not based on the other query). This is what I meant by „To do so, I’m creating a new query named “Existing data” from the data we just loaded into the workbook“.

The code for the “Existing data” query is

Source = Excel.CurrentWorkbook(){[Name="BlogPosts"]}[Content]

The complete code for the BlogPosts query (including the append) was

let
Source = Xml.Tables(Web.Contents("http://ms-olap.blogspot.com/feeds/posts/default")),
#"Expand entry" = Table.ExpandTableColumn(Source, "entry", {"published", "title"}, {"published", "title.1"}),
#"Expand title.1" = Table.ExpandTableColumn(#"Expand entry", "title.1", {"Element:Text"}, {"Element:Text"}),
#"Removed Columns" = Table.RemoveColumns(#"Expand title.1",{"id", "updated", "category", "title", "subtitle", "link", "author", "generator", "http://a9.com/-/spec/opensearchrss/1.0/"}),
#"Split Column by Position" = Table.SplitColumn(#"Removed Columns","published",Splitter.SplitTextByPositions({0, 19}, false),{"published.1", "published.2"}),
#"Changed Type" = Table.TransformColumnTypes(#"Split Column by Position",{{"published.1", type datetime}, {"published.2", type text}}),
#"Removed Columns1" = Table.RemoveColumns(#"Changed Type",{"published.2"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns1",{{"published.1", "Date"}, {"Element:Text", "Title"}}),
Append = Table.Combine({#"Renamed Columns",#"Existing data"}),
#"Removed Duplicates" = Table.Distinct(Append, {"Date"})
in
#"Removed Duplicates"

This means that the source combines the new rows with the data in the Excel table and then loads the results into the same Excel table again.
As mentioned in my previous comments, this is also the reason why this only works for Power Query in Excel (not for Power BI, because Power BI has no separate storage like an Excel sheet).

Hilmar Buchta
Sat, 07.11.2015 - 19:48

In order to build a query based on the output of an existing query in Power BI Desktop, you just create a new (blank) query and type =<Name of the existing query> in the formula bar for the first step. You should then get the output of the existing query.
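For example, with the query from this post, the blank query would contain just a single step referencing the existing query by name:

```
let
    Source = BlogPosts
in
    Source
```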

Giovanni Miraglia
Tue, 15.03.2016 - 17:02

I also can't reproduce this in Power BI Desktop. It seems that there is a difference between Excel and Power BI. Can you upload such a description or a demo .pbix file showing that you can do incremental loads in Power BI?
Thanks in advance!

Hilmar Buchta
Tue, 15.03.2016 - 17:20

Hi Giovanni,
The approach from above only works with Power Query in Excel (Power BI 1.0), not in Power BI Desktop (Power BI 2.0), although there are a lot more features in Power BI Desktop compared to the Excel version.
However, the main trick I'm using here is to load the data into a separate "storage" (the Excel sheet) which is not based on a query. In Power BI Desktop we don't have this separate storage, and therefore I don't see any way to do this in Power BI Desktop. If you try to load into the same query, you're getting the cyclic reference error that was mentioned by Andrey. One option would be to implement the incremental load as an Integration Services package and store the data in a database table.
Hope this helps,

Thu, 11.04.2019 - 12:38

In Power BI Desktop you can add an R script step to save the data, using the command write.csv(dataset, "folder/filename.csv"). Then you only have to append the query to this CSV every time and save it again.

Andrey Metelski
Thu, 19.11.2015 - 17:32

"In order to build a query based on the output of an existing query in Power BI Desktop you just create a new (blank) query and type = in the command line for the first step. You should then get the output of the existing query."
We can do it, but it does not work when we try to append Existing Data in the BlogPosts query. In Excel we get "Expression.Error: A cyclic reference was encountered during evaluation." Power BI simply does not allow choosing Existing Data for the append. Any thoughts?

Sat, 07.11.2015 - 17:35

How would you do this in Power BI Desktop, where you can't load to Excel? I have tried creating a calculated table off a loaded query, but then I can't figure out how to build a new query based on that calculated table.
