Preface
Every system has its limits. At 7pace, we place limits on the number of API requests that you and your team are allowed to make, in order to ensure quality of service for all users. Please see API Requests Limits and Lifetime Policies for more information. This article contains techniques that can help you interact with 7pace's API less frequently and more effectively.
Incremental refresh
Incremental refresh is a great way to decrease the load on the API. Instead of repeatedly polling the entire history of data, only the data created, changed, or deleted during a specified period is requested (refreshed) from the API, while the data created, changed, or deleted before this period stays stored in the dataset and is no longer requested from the API. Data older than a specified archive period is removed from the dataset altogether. The archive period is usually set to years, while the refresh period is set to days or a few months (more on this later).
Image source: Microsoft documentation
How do you set this up with the 7pace Reporting API? Use the 'EditedTimestamp' attribute of worklogs and only query data where 'EditedTimestamp' was changed within a specified period. The 'EditedTimestamp' attribute is updated whenever a worklog is created, updated, or deleted (make sure you use API version 3.3 or higher and utilize the endpoints ending with 'WithDeleted' so that you get deleted worklogs in the responses).
The example below shows how you can recognize various changes made on a specified date (August 8 in the example).
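For instance, a request covering all changes made on that date could look like the following (the URL and the UTC timestamps are illustrative only):
GET https://{sampleURL}.timehub.7pace.com/api/odata/v3.3-beta/workLogsWorkItemsWithDeleted?$filter=EditedTimestamp ge 2022-08-08T00:00:00Z and EditedTimestamp lt 2022-08-09T00:00:00Z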
Incremental refresh may be implemented manually, e.g. by using two reports (one for historical data, one for fresh data). The easiest way, though, is to use the native Power BI functionality. We recommend reading the official Microsoft documentation first.
Incremental refresh via Power BI
In this section, we'll show you how we connected Power BI to our data source and set up incremental refresh.
Prerequisites and recommendations
- You'll need Power BI Desktop together with a Power BI Pro license or higher. The data model and incremental refresh are set up in Power BI Desktop, but incremental refresh only works once the model is published to the Power BI service. See the Power BI documentation for up-to-date information.
- The 'OData feed' data source is more suitable for incremental refresh because it supports automatic pagination (via the '@odata.nextLink' attribute in the response metadata; see the sample response after this list). Therefore, we recommend using the 7pace Reporting API.
- For an optimal process, use version 3.3 or higher, because it can also provide information about deleted worklogs. Why does this help? If you process information about deleted worklogs, you can select shorter refresh periods because you are not dependent on locking past weeks for editing; you will simply get all changes, no matter how old the edited or removed worklogs are.
- If you decide to connect the data source as regular JSON content in Power BI (New Source -> 'Web'), you will have to implement the pagination logic yourself. This will most likely prevent query folding, which is required for incremental refresh to work (more on this later).
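For illustration, each OData response page points to the next page in its metadata, which Power BI follows automatically when using the 'OData feed' source. The shape is roughly the following (an illustrative fragment with hypothetical field names, not a verbatim 7pace response):

{
  "@odata.context": "https://{sampleURL}.timehub.7pace.com/api/odata/v3.3-beta/$metadata#workLogsWorkItemsWithDeleted",
  "value": [
    { "WorklogId": "…", "EditedTimestamp": "2022-06-01T08:00:00+02:00" }
  ],
  "@odata.nextLink": "https://{sampleURL}.timehub.7pace.com/api/odata/v3.3-beta/workLogsWorkItemsWithDeleted?$skiptoken=…"
}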
Process
For incremental refresh to work, we need to:
- Set up a data source that will obtain data from the 7pace API for a certain date/time interval.
- Set up incremental refresh parameters for this data source.
- Publish the data source to a Power BI service and schedule its regular refresh interval.
Set up data source
First, we connect Power BI to the 7pace OData data source (see our 7pace documentation for a step-by-step guide). For optimal performance, we strongly recommend using API v3.3 or higher and utilizing the endpoints that also provide deleted worklogs (they end with the 'WithDeleted' suffix).
Once the data source is connected, we have to tell Power BI how to filter records from a specified refresh period. This can be done by setting the RangeStart and RangeEnd parameters and updating our Power Query accordingly.
- We edit the query, e.g. via the context menu:
- We add the RangeStart and RangeEnd parameters using the official Microsoft configuration guide. The parameters have to be in the Date/Time format, otherwise the incremental refresh won't work. For the initial values, we set RangeStart to a date in the past (i.e. years back) and RangeEnd to the current date.
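For reference, a parameter created this way appears in the Advanced Editor as a small standalone query. A sketch of what the generated code looks like, with placeholder initial values only:

// Query "RangeStart" - set the initial value far enough in the past to cover your archive period
#datetime(2020, 1, 1, 0, 0, 0) meta [IsParameterQuery = true, Type = "DateTime", IsParameterQueryRequired = true]
// Query "RangeEnd" - set the initial value to the current date at the time of setup
#datetime(2022, 8, 8, 0, 0, 0) meta [IsParameterQuery = true, Type = "DateTime", IsParameterQueryRequired = true]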
- Next, we choose a field that will be limited by RangeStart and RangeEnd when obtaining data from the API. The field must contain a date and time. We recommend using the EditedTimestamp of a worklog; even if a user creates a worklog in the future, or changes an old record, EditedTimestamp will contain the system date of the moment when the change took place.
The OData source, however, stores timestamps in the DateTimeZone format, while the RangeStart and RangeEnd parameters are plain DateTime values, so we have to create a function that transforms DateTime to DateTimeZone. Therefore, we create a blank query, call it e.g. DateKey, and paste the following code into the body:
= (x as datetime) => DateTimeZone.From(x)
The result should look like this:
- As the final step, we need to apply the filtering by RangeStart and RangeEnd to the field of our choice from the previous step, with the DateKey conversion function applied to it. An easy way to do this is to first filter the field by some arbitrary value:
Next, we update the transformation step so it works with the RangeStart and RangeEnd parameters with the DateKey function applied (see the sketch after the following note).
Note: Make sure the interval is closed on one side (that is, >=) and open on the other (that is, <); otherwise the data in the report might get duplicated.
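For illustration, the filter step before and after the update might look roughly like this (the arbitrary date is a placeholder; the table and column names match the final query below):

// Step 1: filter by an arbitrary value, just to generate the transformation step
#"Filtered Rows" = Table.SelectRows(workLogsWorkItems_table, each [EditedTimestamp] >= #datetimezone(2022, 6, 1, 0, 0, 0, 2, 0))
// Step 2: replace the arbitrary value with the parameters (closed on >=, open on <)
#"Filtered Rows" = Table.SelectRows(workLogsWorkItems_table, each [EditedTimestamp] >= DateKey(RangeStart) and [EditedTimestamp] < DateKey(RangeEnd))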
The final Power Query should look like this:
let
    Source = OData.Feed("https://{replaceByYourURL}.timehub.7pace.com/api/odata/v3.3-beta", null, [Implementation="2.0"]),
    workLogsWorkItems_table = Source{[Name="workLogsWorkItemsWithDeleted",Signature="table"]}[Data],
    #"Filtered Rows" = Table.SelectRows(workLogsWorkItems_table, each [EditedTimestamp] >= DateKey(RangeStart) and [EditedTimestamp] < DateKey(RangeEnd))
in
    #"Filtered Rows"
- Further transformations, if needed, can be done after the previously mentioned steps, but only transformations that allow the query to be folded should be applied.
Now that the filtering is set, we should check if Power BI translates the Power Query to the API request properly. In other words, the filters need to be applied in the API calls directly, instead of Power BI downloading all data first and then applying the filters ex post.
- Incorrect: GET https://{sampleURL}.timehub.7pace.com/api/odata/v3.3-beta/workLogsWorkItemsWithDeleted
- Correct: GET https://{sampleURL}.timehub.7pace.com/api/odata/v3.3-beta/workLogsWorkItemsWithDeleted?$filter=EditedTimestamp ge 2022-06-01T00:00:00%2B02:00 and EditedTimestamp lt 2022-06-09T00:00:00%2B02:00
One option to verify this is to use a tool for capturing and debugging network traffic (Wireshark, Fiddler); or we can use the Power BI built-in diagnostic tools, as demonstrated below.
- In the Power Query editor, we choose Tools -> Start Diagnostics.
- We refresh our query:
- In Tools, we Stop Diagnostics and check the Detailed results of the Query Section:
- We filter only the rows with the Data Source Query column filled in and check the requests. In the example below, the query folding works correctly; the EditedTimestamp filter is used in the API call directly.
Troubleshooting - start with the base URL
What if you still cannot see the filters applied in the URL? The issue may lie in how you set up the data source in the first place - for Power BI query folding to work, you must start with the base URL (e.g. https://{sampleURL}.timehub.7pace.com/api/odata/v3.3-beta) and then choose the endpoint you wish to use as a subsequent step in Power BI.
- Incorrect URL when setting up data source: https://{sampleURL}.timehub.7pace.com/api/odata/v3.3-beta/workLogsWorkItemsWithDeleted
- Correct URL when setting up data source: https://{sampleURL}.timehub.7pace.com/api/odata/v3.3-beta
The selection of a specific endpoint is represented by the second row of the Power Query example below (the one starting with 'workLogsWorkItems_table').
let
    Source = OData.Feed("https://{replaceByYourURL}.timehub.7pace.com/api/odata/v3.3-beta", null, [Implementation="2.0"]),
    workLogsWorkItems_table = Source{[Name="workLogsWorkItemsWithDeleted",Signature="table"]}[Data],
    #"Filtered Rows" = Table.SelectRows(workLogsWorkItems_table, each [EditedTimestamp] >= DateKey(RangeStart) and [EditedTimestamp] < DateKey(RangeEnd))
in
    #"Filtered Rows"
Set up incremental refresh
After we're sure that Power BI is calling the correct API queries, we can go ahead and set up incremental refresh for our data source.
The setting of the incremental refresh attributes can vary per organization, depending on how much past data is needed for your reports (months or years), whether you process deleted worklogs, and the frequency of your reporting.
An example of the setting can be:
- Archive period set to two (2) years, which means that data up to two years old will be part of the dataset. Older data will no longer be kept in the dataset in Power BI (which means you won't see it in your reports, but it will still remain in the 7pace database).
- Refresh period:
- Recommended: set to one (1) week if you fetch all changes including deleted worklogs (i.e. you are using an endpoint ending with 'WithDeleted'). This way, Power BI will only fetch the worklog changes (creation, modification, deletion) made during the past 7 days, while the rest of the data stays untouched.
- Set to one (1) month if you don't process deleted worklogs, so that you have more space to catch changes. A one-month refresh period means that worklogs created, updated, or deleted within the past month will be reflected in your report. This also includes updates to records much older than one month, because EditedTimestamp is replaced with the current date in that case. Only the deletion of records older than the refresh period will not be reflected.
ℹ️ The shorter the refresh period, the less data you obtain via the API and the faster the responses will be.
Note: The warning about query folding doesn't necessarily mean that query folding won't actually work; it just means that Power BI was not able to verify the query. If Power BI calls the URLs with the correct filters, as described in the previous section, and if a refresh can be scheduled after the dataset is published to the Power BI service (see below), the warning sign can be ignored.
❓Won't duplicate records be created this way?
Yes - if somebody updates a worklog older than your refresh period, you will get a new record via a refresh, while the original record created or updated before the refresh period will also still be present in your data. We therefore recommend setting up a follow-up Power Query that keeps only the record with the latest EditedTimestamp for each id (see the sketch below). Deduplication is not in the scope of this tutorial, but we can recommend this video as a guideline.
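A minimal sketch of such a follow-up deduplication query, assuming the incremental query from the previous sections is named Worklogs and its id column is named WorklogId (both names are hypothetical; adjust them to your feed):

let
    // Output of the incremental-refresh query set up earlier (hypothetical name)
    Source = Worklogs,
    // For each worklog id, keep only the row with the latest EditedTimestamp
    Grouped = Table.Group(Source, {"WorklogId"}, {{"Latest", each Table.Max(_, "EditedTimestamp"), type record}}),
    // Turn the winning rows back into a flat table
    Deduplicated = Table.FromRecords(Grouped[Latest])
in
    Deduplicated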
Publish to a service and schedule regular refresh
With the incremental refresh set, we need to publish the dataset to the Power BI service. If any changes occur in Power BI Desktop afterwards, the data model needs to be published again for the changes to take effect in the service. We click on Publish:
In Power BI cloud, we then need to navigate to our dataset and refresh the data manually for the first time. You may need to enter credentials (this is described at the bottom of the article).
After the first refresh is done, we can schedule a regular refresh of the dataset.
Troubleshooting
When we experimented with incremental refresh ourselves, we ran into several potential issues, which we share with you below so you can hopefully troubleshoot or avoid them altogether.
- The refresh fails or asks for credentials if Power BI doesn't know the given API yet. Try updating the credentials in the settings of the data source and try again.
- Datasets with dynamic data sources cannot be refreshed from the service (although they seem to work fine in Power BI Desktop). You have to rewrite your Power Query:
- If using OData.Feed in your Power Query, as is recommended for the 7pace API, remember that you have to obtain the feed via the base URL (e.g. https://{sampleURL}.timehub.7pace.com/api/odata/v3.3) without parameters and, subsequently, filter the results via Table.SelectRows as described in this post. You shouldn't encounter this error when following this guide.
- If using Web.Contents in your Power Query, see this blog post for more details. However, you might encounter the 'dynamic data source' issue when implementing pagination anyway.
Final thoughts
We recommend verifying the refresh status in the Power BI service the following morning.
Template
We prepared a Power BI template file - 7pace Incremental Load Example.pbit - that contains a simple example of incremental refresh. You just need to input the initial start and end dates, change the URL in the data source, and enter your credentials.
Incremental refresh in Excel
Currently, Excel does not support incremental refresh.
We recommend setting up the dataset in the Power BI service (see above) and then using it as a data source in Excel to save API calls.
Incremental refresh in other tools
Incremental refresh can be achieved in other tools as well, but you might need to keep two separate reports (one with stable historical data and one with fresher 'delta' data).
If you are used to loading 7pace Timetracker data into your database and evaluating the changes there, you might miss deleted records (because they are currently deleted for good, without being indicated via a 'deleted' flag in the API responses). In this case, we welcome you to upvote our feature request for webhooks.