Top Tips for future-proofing data within Domo

Marquin Smith - Wednesday, April 24, 2019

Domo is a powerful enterprise-level visualization tool. Domo differs from many other players in the field in that it incorporates data collection, storage, and transformation into a single platform.

For a plethora of data sources, you can simply arm Domo with your credentials, and it does the work of authenticating and pulling the data in, ready for you to visualize.

From this point you can schedule Domo to collect fresh data every day, week, month or year – however often the data is required.

In an ideal universe, that would be that and we would move on with our lives. However, it is rarely that simple. Things break, connections fall over, and credentials change, any of which might mean your reporting does not get updated with the latest data as expected. After getting your stakeholders hooked on a steady stream of relevant data, they will be stalking your desk for the next update if things start to go wrong.

The following are some tips that can be employed within Domo to help minimize disruption in the data flow to your stakeholders:

Auto Retries

Each connection to an external API service will have some scheduling options associated with it.

Within these settings you can specify how often, and how many times, the connector should retry to collect the data before giving up.

Panalysis recommends setting the retry settings to the following:

Retry every 15 minutes up to 10 times.

This setting helps to mitigate instances where the connection or the service is temporarily down.
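Domo performs these retries for you once the schedule is saved, but conceptually the behaviour looks something like the Python sketch below, where fetch_data is a hypothetical stand-in for the connector call:

    import time

    MAX_RETRIES = 10
    RETRY_INTERVAL_SECONDS = 15 * 60  # retry every 15 minutes

    def pull_with_retries(fetch_data):
        """Attempt a data pull, retrying on failure up to MAX_RETRIES times."""
        for attempt in range(1, MAX_RETRIES + 1):
            try:
                return fetch_data()  # hypothetical connector call
            except Exception as error:
                if attempt == MAX_RETRIES:
                    raise  # give up after the final attempt
                print(f"Attempt {attempt} failed ({error}); retrying in 15 minutes")
                time.sleep(RETRY_INTERVAL_SECONDS)

With these settings, any outage shorter than about two and a half hours (10 retries at 15-minute intervals) recovers on its own.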

Historical Intermediate Dataset

This tip is best explained with a hypothetical scenario.

You have just received a brief that requires data from 1st January 2017 to the current day.

One option is to set the API connection to pull the entire date range on every run.

However, this may not be very efficient as it collects the same data repeatedly. Depending on how much data exists in your time period, it may take a long time to collect all the data.

An alternative approach would be to collect only the latest day’s worth of data and append it to the bottom of the existing dataset.
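As a rough illustration of what this incremental setting does, here is a Python sketch where fetch_range is a hypothetical stand-in for the underlying API call:

    from datetime import date, timedelta

    def pull_latest_day(fetch_range):
        """Request only yesterday's data rather than the full history."""
        yesterday = date.today() - timedelta(days=1)
        # fetch_range is hypothetical: a connector call taking a date range.
        return fetch_range(start=yesterday, end=yesterday)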

With this append configuration you may end up with duplicate entries. Also, if the dataset is accidentally run with the “Replace” update setting, all of the historical data collected so far will be lost.

A more robust implementation of this strategy is to update the connection data one day at a time, but then append this data to an ETL dataset. To do this you need to set up what is known as a recursive dataflow, meaning the output of the ETL is used as an input the next time the ETL runs.

To set up a recursive dataflow:

1. Initialize the output dataset.

2. Run the ETL so the output dataset actually exists.

3. Re-open the ETL and add the output dataset as an input. This will raise a warning that a dataset with that name already exists in the ETL.

4. Connect the input datasets together, usually with an Append. The Append function stacks the datasets one on top of the other, and the Remove Duplicates step then removes duplicated entries based on the values in one or more columns. In this case we don’t want duplicate days flowing through to views, so Remove Duplicates should be keyed on the date column.

5. Set the ETL to run every time the Google Analytics connection collects new data.

Now, with the connection dataset pulling only the latest day, this intermediate dataset will append only the data it doesn’t already have.
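Conceptually, each run of the recursive dataflow performs the equivalent of the following pandas sketch. The date column is a placeholder for whatever key identifies a duplicate row in your data:

    import pandas as pd

    def recursive_dataflow_step(historical, latest_day):
        """One ETL run: append the newest pull to the previous output, then dedupe.

        historical is the ETL's own prior output (the recursive input);
        latest_day is the one-day pull from the connector.
        """
        combined = pd.concat([historical, latest_day], ignore_index=True)
        # Keep one copy of any repeated day, mirroring the Remove Duplicates step.
        return combined.drop_duplicates(subset=["date"], keep="last")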

This strategy also leaves you flexibility with the initial connection dataset should the need arise: you can re-pull certain date ranges without worrying about duplicates flowing through to your visualizations.

Set Column Types

Even with all the API-powered technology that Domo provides, it may still be necessary to upload manually updated data into Domo’s systems. This can include .csv files, Excel files or even Google Sheets.

These tools are great because of their flexibility but can cause some hiccups within Domo. The most common issues arise because Domo expects columns to consist of a single data type, whereas Excel and Google Sheets do not share this constraint.

For example, suppose an Excel file of sales data is uploaded containing a revenue column, and I want to work out the tax (GST) on that revenue. Within Domo I can create a calculated column that is 10% of the revenue value. If someone enters a date, some text, or a number stored as a text string into the revenue column of the spreadsheet, the tax calculation within Domo will fail.

Using the Set Column Types transformation to force the revenue column to a decimal is the best practice for getting this column into the correct format. It also has the added bonus of creating a break point within the ETL, ensuring that downstream visualizations do not experience any unwanted effects.
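Outside of Domo’s ETL tiles, the same guard looks something like this pandas sketch, using the revenue column and 10% GST rate from the example above:

    import pandas as pd

    def add_gst(sales):
        """Coerce revenue to a numeric type before calculating 10% GST."""
        sales = sales.copy()
        # Unparseable values (dates, stray text) become NaN here, surfacing
        # bad rows early rather than letting them break downstream cards.
        sales["revenue"] = pd.to_numeric(sales["revenue"], errors="coerce")
        sales["gst"] = sales["revenue"] * 0.10
        return sales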

With these three tips in place you will have a much more robust business intelligence setup, one that is resilient to a wide range of common issues and delivers a more reliable stream of relevant data to your stakeholders.

Get in Touch

Panalysis is an official Domo partner.

Reach out to sales@panalysis.com if you would like to know more about how Panalysis can help you with an existing or potential instance of Domo.
