Handling Paginated REST API Results with Power Query

Querying a RESTful API, ingesting its response, and transforming the data through Power Query is a powerful capability. In many instances, the integration between Power Query and Power BI or Power Query and Excel further simplifies the process by eliminating the need for other tools and scripts to perform intermediate processing of a JSON response. However, working with REST APIs is not always straightforward as there can be complexities with larger data sets. To limit the processing and bandwidth burden on an API, a response may be paginated to limit the number of results returned for any given query. In other words, if a particular query is expected to return a million entries, then the API may paginate those million entries into blocks of 100 entries per page. To retrieve the full million entry result set would require ten thousand separate API requests to retrieve each 100-entry page (100 * 10,000 = 1,000,000).

The purpose of this post is to demonstrate Power Query code that can handle paginated JSON results from a REST API. Three solutions are presented to handle API request scenarios where the number of pages is known, calculated, or unknown.

REST API

The code presented in this solution uses a publicly available REST API from the U.S. Department of the Treasury named the Treasury Reporting Rates of Exchange. The intent of this API is to provide the U.S. government’s authoritative quarterly exchange rates for converting foreign currency units and U.S. dollar equivalents. I chose this particular API because the basic concept of an exchange rate is commonly understood albeit entirely unimportant for the purposes of this exercise. More importantly, this API exposes a significant volume data needed to demonstrate pagination. I recommend exploring the full U.S. Treasury Fiscal Data site as there are quite a few data sources which are well-documented, free-to-use without requiring a sign-up or API token, and they provide a great opportunity to learn how to interact with APIs as well as develop analytics using real-world data.

The API query used in this post is straightforward. The three data fields requested from the Rates of Exchange API are Country – Currency Description, Exchange Rate, and Record Date. As illustrated in the screenshot below, the JSON response contains some meta data including count, total-count, and total-pages. For this demonstration, these particular fields and their corresponding values are important to note for final verification that the Power Query code is working as expected. The count field corresponds to the number of entries returned, the total-count field corresponds to the total number of entries in the result set, and the total-pages field corresponds to the total number of paginated results, e.g., 100 maximum results across 162 total pages (page index 162 is a partial page with only 10 entries).

API JSON Results
API JSON Results

Custom Power Query Paging Function

According to the documentation for this API, there are two defined parameters with respect to response pagination. The page[size]= parameter sets the number of entries returned in the response. The default response for page[size]= is 100 entries per page. This example uses the default setting. The more important parameter for this demonstration is the page[number]= parameter as it will be used to iterate through each page number. The page[number]= parameter sets the pagination index which is used to return a particular page from the resulting data set. As an example, if the resulting data set has 42 total pages and the page[number]= parameter is set to 7, then the response will return the block of 100 entries associated with page 7.

Function Source Code

We will need to create a custom Power Query function to process a dynamic number of pages returned by an API. This function defines one parameter named page which represents the page number to retrieve and it is used in the API query request. This function will be used in each of the three solutions.

  1. From Power BI or Excel (or any other client application that supports Power Query), select Get data and then Blank query.
  2. The Power Query window opens to a new blank query.
  3. From the Power Query Home menu, select Advanced Editor.
  4. Paste the following code into the Advanced Editor replacing any pre-existing default code.
  5. Click Done to close the Advanced Editor window.
  6. In the Query Settings pane under Properties, rename the function to GetPage.
(page as text) =>
let
    Source = try Json.Document(Web.Contents("https://api.fiscaldata.treasury.gov/services/api/fiscal_service/v1/accounting/od/rates_of_exchange?page[number]=" & page & "&fields=country_currency_desc,exchange_rate,record_date")) otherwise [data=null, meta=null, links=null],
    #"Converted to Table" = Table.FromRecords({Source}),
    #"Removed Columns" = Table.RemoveColumns(#"Converted to Table",{"meta", "links"}),
    #"Expanded data List" = Table.ExpandListColumn(#"Removed Columns", "data"),
    #"Expanded data Records" = Table.ExpandRecordColumn(#"Expanded data List", "data", {"country_currency_desc", "exchange_rate", "record_date"}, {"data.country_currency_desc", "data.exchange_rate", "data.record_date"})
in
    #"Expanded data Records"
Power Query – Advanced Editor GetPage Function
Power Query – Advanced Editor GetPage Function

Function Test

  1. To test the function, click GetPage on the Queries pane.
  2. The Enter Parameter screen is displayed with a text field corresponding to the page parameter.
Power Query – GetPage Function
Power Query – GetPage Function
  1. Enter a valid page number index. Invalid values will return a null table.
  2. Click the Invoke button.
Power Query – GetPage Function Test Value
Power Query – GetPage Function Test Value
  1. The Invoked Function returns a table containing the Country – Currency Description, Exchange Rate, and Record Date entries from the supplied page number index as expected.
Power Query – Invoked GetPage Function Test
Power Query – Invoked GetPage Function Test

At this point we have a function that is able to retrieve a single indexed page within the result set. We can now use this function to iterate through each indexed page to dynamically retrieve and aggregate all of the paginated results.

Retrieving All Pages

This particular API conveniently returns the total number of pages as part of its JSON response which makes page retrieval fairly easy. Many other APIs are not as helpful. Some APIs may only return the total number of entries in the response. Other APIs may not return any counts. For those less accommodating APIs, I have provided alternate solutions to either calculate pages or retrieve pages dynamically without meta counts.

  1. From the Power Query Home ribbon, select New Source and then Blank Query.
  2. The Power Query window opens a new blank query.
  3. From the Power Query Home menu, select Advanced Editor.
  4. Paste one of the following code sequences into the Advanced Editor replacing any pre-existing default code.
  5. Click Done to close the Advanced Editor window.
  6. In the Query Settings pane under Properties, rename the function to Treasury Reporting Rates of Exchange.

Method 1: Number of Pages is Known Upfront

In the easiest scenario, the API response includes a count of the total number of pages. No additional calculation is needed.

let
    Source = try Json.Document(Web.Contents("https://api.fiscaldata.treasury.gov/services/api/fiscal_service/v1/accounting/od/rates_of_exchange?fields=country_currency_desc,exchange_rate,record_date")) otherwise [data=null, meta=null, links=null],
    #"Page List" = if Source = null or Source[meta] = null then {null} else {1..Source[meta][#"total-pages"]},
    #"Converted to Table" = Table.FromList(#"Page List", Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Renamed Columns" = Table.RenameColumns(#"Converted to Table",{{"Column1", "PAGES"}}),
    #"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns",{{"PAGES", type text}}),
    #"Invoked Custom Function" = Table.AddColumn(#"Changed Type", "GetPage", each if [PAGES] <> null then GetPage([PAGES]) else null),
    #"Expanded GetPage" = Table.ExpandTableColumn(#"Invoked Custom Function", "GetPage", {"data.country_currency_desc", "data.exchange_rate", "data.record_date"}, {"GetPage.data.country_currency_desc", "GetPage.data.exchange_rate", "GetPage.data.record_date"})
in
    #"Expanded GetPage"
Power Query – Retrieve Full Result Set When Number of Pages Is Known
Power Query – Retrieve Full Result Set When Number of Pages Is Known

Method 2: Number of Pages is Unknown; Number of Entries is Known

If the number of pages is unknown because the API doesn’t expose it in the meta data, then the page indexes can be calculated using other available values. If the API returns the total number of entries in the result set and the number of entries per response is known, then the number of pages in the paginated results can be calculated using simple division and rounding up. The bulk of this code is the same as above. The relevant change is found in step #”Page List”. In this step the dynamic list generation is changed to {1..Number.RoundUp(Source[meta][#"total-count"] / Source[meta][#"count"])} which calculates the number of pages by dividing the total-count (number of entries in the result set) by count (number of entries per page). The Number.RoundUp function is then used to account for partial pages by rounding the remainder up to the next integer. In other words, if the total number of entries in the result set is not evenly divided by the total entries per page, then the remainder will be on a partial page. As an example, if there are 205 total entries with 100 entries per page, then the number of pages should be 3 to account for the 5 entries above 200. This calculation would need to be adjusted if the API pagination started at zero or some other number.

let
    Source = try Json.Document(Web.Contents("https://api.fiscaldata.treasury.gov/services/api/fiscal_service/v1/accounting/od/rates_of_exchange?fields=country_currency_desc,exchange_rate,record_date")) otherwise [data=null, meta=null, links=null],
    #"Page List" = if Source = null or Source[meta] = null then {null} else {1..Number.RoundUp(Source[meta][#"total-count"] / Source[meta][#"count"])},
    #"Converted to Table" = Table.FromList(#"Page List", Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Renamed Columns" = Table.RenameColumns(#"Converted to Table",{{"Column1", "PAGES"}}),
    #"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns",{{"PAGES", type text}}),
    #"Invoked Custom Function" = Table.AddColumn(#"Changed Type", "GetPage", each if [PAGES] <> null then GetPage([PAGES]) else null),
    #"Expanded GetPage" = Table.ExpandTableColumn(#"Invoked Custom Function", "GetPage", {"data.country_currency_desc", "data.exchange_rate", "data.record_date"}, {"GetPage.data.country_currency_desc", "GetPage.data.exchange_rate", "GetPage.data.record_date"})
in
    #"Expanded GetPage"
Power Query – Retrieve Full Result Set When Number of Pages Is Unknown
Power Query – Retrieve Full Result Set When Number of Pages Is Unknown

Method 3: Both Number of Pages and Number of Entries are Unknown

In this scenario, which is the most complex of the three, both the total number of pages nor the total number of entries are unknown. We need to use entirely different logic since the API does not expose any meta data with respect to result counts. The function List.Generate is used to iterate through pages until a null page is retrieved, i.e., the page contains an entry with null values. This condition depends on the custom GetPage function returning null when a indexed page number does not return any results. The conditional should be adjusted to the particular API.

let
    Source = List.Generate(() => [PAGE = 1, RESULTS = GetPage("1")], each Table.RowCount(Table.SelectRows(_[RESULTS], each not List.Contains(Record.ToList(_), null))) > 0, each [PAGE = _[PAGE] + 1, RESULTS = GetPage(Number.ToText([PAGE] + 1))], each _[[PAGE], [RESULTS]]),
    #"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Expanded Column" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"PAGE", "RESULTS"}, {"PAGES", "GetPage"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Expanded Column",{{"PAGES", type text}}),
    #"Expanded GetPage" = Table.ExpandTableColumn(#"Changed Type", "GetPage", {"data.country_currency_desc", "data.exchange_rate", "data.record_date"}, {"GetPage.data.country_currency_desc", "GetPage.data.exchange_rate", "GetPage.data.record_date"})
in
    #"Expanded GetPage"
Power Query – Retrieve Full Result Set When Both Number of Pages and Number of Entries are Unknown
Power Query – Retrieve Full Result Set When Both Number of Pages and Number of Entries are Unknown

Code Explanation

Method 1 and Method 2

Step 1 – Source

The first step requests the API as described above. The purpose of this step is to retrieve the count and total-count from the meta record. Inspecting those values confirms the expected result found in the earlier screenshot of the direct API request. It’s worth noting that the API request is wrapped in a try...otherwise statement. If the API request fails, then null values are returned by default for the data, meta, and links records. This helps prevent the overall Power Query query and any downstream Power Query queries from failing if the API request is unsuccessful for any number of reasons, e.g., network issues, server issues, etc.

Power Query – Methods 1 & 2 – Query Step 1 – Source
Power Query – Methods 1 & 2 – Query Step 1 – Source

Step 2 – Page List

Using the value of total-count, the next step generates a list of page indexes starting from 1 and ending with the total number of pages. This particular API always begins with a page index numbered 1, but other APIs may start with a different value. This step also includes a basic error check for null values returned from a failed API request.

Power Query – Methods 1 & 2 – Query Step 2 – Page List
Power Query – Methods 1 & 2 – Query Step 2 – Page List

Step 3 – Converted to Table

The dynamic list generated in the previous step is converted to a table containing a single column of page indexes.

Power Query – Methods 1 & 2 – Query Step 3 – Converted to Table
Power Query – Methods 1 & 2 – Query Step 3 – Converted to Table

Step 4 – Renamed Columns

The single column in the table of page indexes is renamed to PAGES.

Power Query – Methods 1 & 2 – Query Step 4 – Renamed Columns
Power Query – Methods 1 & 2 – Query Step 4 – Renamed Columns

Step 5 – Changed Type

The PAGES column is changed to a text data type.

Power Query – Methods 1 & 2 – Query Step 5 – Changed Type
Power Query – Methods 1 & 2 – Query Step 5 – Changed Type

Step 6 – Invoked Custom Function

With a list of valid page indexes, we can now call the custom GetPage function to retrieve the entries associated with a particular page from the API. From the Add Column ribbon, select Invoke Custom Function to call GetPage using the values found in the PAGES column as the argument. For each value in the PAGES column, the GetPage function is invoked and the API is requested. The new GetPage column contains a table of entries from each indexed page API request.

Power Query – Methods 1 & 2 – Query Step 6 – Invoked Custom Function
Power Query – Methods 1 & 2 – Query Step 6 – Invoked Custom Function

Step 7 – Expanded GetPage

Expanding the tables found in the GetPage column created in the prior step displays all of the expected entries.

Power Query – Methods 1 & 2 – Query Step 7 – Expanded GetPage
Power Query – Methods 1 & 2 – Query Step 7 – Expanded GetPage

Results

After returning the results from Power Query to Power BI, a simple table visualization is used to confirm that the number of pages and the number of entries match the total counts from the original API request and the code is working as expected.

Power BI – Methods 1 & 2 – Result Counts
Power BI – Methods 1 & 2 – Result Counts

Method 3

Step 1 – Source

Unlike the other methods, the total number of pages and the total number of entries are unknown. This code statement assumes the API page indexes begin at number 1. The List.Generate function uses a conditional to check for any null values in returned records. Given the way the GetPage function is coded, there should not be any null values unless the page returns an empty result set.

Power Query – Method 3 – Query Step 1 – Source
Power Query – Method 3 – Query Step 1 – Source

Step 2 – Converted to Table

The interim list of records is converted to a table of records in one column named Column1.

Power Query – Method 3 – Query Step 2 – Converted to Table
Power Query – Method 3 – Query Step 2 – Converted to Table

Step 3 – Expanded Column

The records are expanded into a column for page indexes named PAGES and a column containing tables of results per page named GetPage.

Power Query – Method 3 – Query Step 3 – Expanded Column
Power Query – Method 3 – Query Step 3 – Expanded Column

Step 4 – Changed Type

The PAGES column is changed to the text data type.

Power Query – Method 3 – Query Step 4 – Changed Type
Power Query – Method 3 – Query Step 4 – Changed Type

Step 5 – Expanded GetPage

Expanding the tables found in the GetPage column created in the prior step displays all of the expected entries.

Power Query – Method 3 – Query Step 5 – Expanded GetPage
Power Query – Method 3 – Query Step 5 – Expanded GetPage

Results

After returning the results from Power Query to Power BI, a simple table visualization is used to confirm that the number of pages and the number of entries match the total counts from the original API request and the code is working as expected.

Power BI – Method 3 – Result Counts
Power BI – Method 3 – Result Counts