A dataset is a collection of data from across your HubSpot account that can then be used in custom reports. A dataset can include properties for CRM objects and HubSpot assets, along with formulas to calculate your data as needed. For example, you can create a field to calculate annual recurring revenue based on the deal amount property.
Creating multiple datasets for your teams means that report creators won’t have to select their data sources every time they need to create a report. A dataset can also be updated after creation, making it possible to update all reports using that dataset at the same time.
Below, learn how to create a dataset, how to use the dataset in the report, and the definitions for available functions within the datasets tool.
To create a dataset:
You'll then be directed to the Join tab of the dataset builder.
First, select the data sources to include in your dataset. Data sources are the objects, assets, and events that you want to report on. This includes all CRM objects such as contacts or custom objects, and assets such as website pages and emails, conversations, sales activities, and more. You can select up to 5 data sources per dataset.
The primary data source will be the focus of the dataset, with all other data sources relating to that primary source. To connect these data sources, HubSpot joins the data in the background using the shortest possible path. Contacts and deals, for example, are directly related and can be selected with no extra joins needed.
However, other data sources cannot be associated directly and require additional sources to join the data together. For example, if your primary source is Deals and you want to include blog post data in the report, HubSpot can only link those sources through the Contacts and Web activities sources. These other sources will be selected automatically to join the data.
Select the fields to include in the dataset. You can add existing HubSpot properties to the dataset, as well as custom formula fields.
To add properties to the dataset, drag and drop the properties from the left sidebar into the Build your dataset section.
Formula fields are specific to the dataset and can be used to calculate values based on properties in the dataset. Learn more about building formulas using flexible expressions.
Conditional fields enable users to group or bucket data based set conditions. These fields can only be used in datasets or the custom report builder. You can use conditional fields to calculate different commissions based on the size of a deal or translate feedback responses in to categories, for example, a label between 1-6 is labeled as a detractor.
The conditional field allows you to build a formula using the IF() function. Any fields built using a conditional field can be recreated using an IF() function in a standard formula creation field.
Refine your data further by adding filters to your fields.
To add a filter:
Review your dataset before saving it.
You’ll then be brought to the report builder where you can create a report based on your dataset.
On the datasets dashboard, you can view and edit existing datasets.
From the datasets details page, you can also create a new report using the dataset by clicking Build report.
Once a dataset has been created, you can create a report based on the dataset either from the report builder or from the datasets tool.
Within a function, you can use data from properties and fields or literal data. Property and field data will be dynamic based on the individual data sources, while literal data is constant. For example:
2021-03-05
is a literal date, which is constant.[CONTACT.createdate]
is a property-based date, which is dynamic for each contact record.Functions can include both literal and property/field data, as long as the data types are compatible with the function’s required arguments. For example:
DATEDIFF("MONTH", "2021-01-01", “[CONTACT.createdate]”)
Below, learn about the syntax for literals and property/field data, and how to incorporate them into formulas.
Use literals to add specific, static strings of text, numbers, true or false values, and dates in your calculations.
“My cool string”
.42
.true
or false
."YYY-MM-DD"
, or a date timestamp number (e.g, 1635715904
). Property references allow you to directly include values from the properties of your selected data sources. You don't need to add the property as a dataset field to reference it.
Use the following syntax when referencing properties:
[
and ]
).[CONTACT.lifecyclestage]
[COMPANY.name]
[e_hs_scheduled_email_v2.__hs_event_native_timestamp]
You can reference fields in a formula by surrounding the field name with square brackets. For example:
[Field 1]
[My awesome custom field]
You can reference fields in formulas as long as the formula’s operators and functions accept the field’s data type. For example, if you create a new field that contains a string, you can reference the field in a function that accepts strings:
[DEAL.name]
, it contains a string value (the deal's name).CONCAT([Field 1], "Q4")
would be valid because it contains two string values.CONCAT([DEAL.name], 2012)
would not be valid because it contains both a string and a number value.You can use operators with literal and property/field values, and operators are evaluated in the standard PEMDAS order of operations. This allows you to nest operators using parentheses. For example:
1 + [DEAL.amount]
(1 + 2) * (3 + 4)
Operator | Description | Example usage |
|
Add numbers. Returns a number. |
|
|
Subtract numbers. Returns a number. |
|
|
Multiply numbers. Returns a number. |
|
|
Divide numbers. Returns a number. |
|
|
Negates a number. |
|
|
Checks if both of two boolean values are true. Returns a boolean. |
|
|
Check if either of two boolean values are true. Returns a boolean. |
|
|
Negatives a boolean value. Returns another boolean value. |
|
|
Equality operator. Returns a boolean value. |
=true = true |
IF logic is a set of rules performed if a certain condition is met. You can use IF logic to differentiate data. For example, you can use IF logic to:
The LABEL function converts enumeration property interval values into user-friendly values. Some HubSpot defined properties, such as Deal and Contact owner are represented as internal values. This make analysis difficult. When used with HubSpot's defined properties that support translation, the LABEL function will provide the translation based on portal settings, not user settings. For example, you can use the LABEL function to:
Access contact or deal stage names directly in fields
LABEL([DEAL.dealstage]) = "Closed Won"(10)
LABEL([DEAL.hubspot_owner_id]) = "John Smith"
Function | Definition | Arguments | Example usage |
|
Compute the absolute value of a number. Returns a number. |
number: the number to take the absolute value of. |
|
|
Round a decimal value up to the nearest integer. Returns a number. |
number: the number to take the ceiling of. |
|
|
Divide a number, but return zero when the divisor is zero. Returns a number. |
dividend: the number to use as a dividend in the division operation. divisor: the number to use as a divisor in the division operation, with zero resulting in a zero value overall |
|
|
Computer Euler's number raised to a value. Returns a number. |
exponent: the exponent to raise Euler’s number to. |
|
|
Round a decimal value down to the nearest integer. Returns a number. |
number: the exponent to raise Euler’s number to. |
|
|
Compute the natural logarithm of a value. Returns a number. |
number: the value to take the natural logarithm of. |
|
|
Compute the logarithm of a value within a specified base. Returns a number. |
base: the base to be used in the logarithm calculation of the value. value: the number to take the logarithm of. |
|
|
Raise a base value to a specified power. Returns a number. |
base: the number to compute the power of. exponent: the number to raise the base by. |
|
|
Take the square root of a non-negative number. Returns a number. |
number: the number to take the square root of. |
|
|
Bin numerical values into buckets of equal width. Returns the number of the bucket that the value falls into. If returned value is below the minimum, returns zero. If returned value is above the maximum, returns the number of buckets +1. |
value: the number to compute in the bin number. minValue: the minimum value to start binning from. maxValue: the maximum value to bin to. bucketCount: the desired number of equal-width buckets to bin values into between minValue and maxValue. |
|
Function | Definition | Arguments | Example usage |
|
Determine whether a string contains a case sensitive substring. Returns a boolean value. |
string: the string value to test. substring: the value to check for inside the string. |
|
|
Concatenate two strings. Returns a string. |
string1: the string value to which string2 will be appended. string2: the string value to append to string1. |
|
|
Compute the length of a string. Returns a number. |
string: the string value to calculate the length of. |
|
|
Remove leader and trailing whitespace from a string. Returns a string. |
string: the string value to trim whitespace from. |
|
Function | Definition | Arguments | Example usage |
|
Create a date value from year, month, and day parts. Returns a date. |
year: the year part of the desired date. month: the month part of the desired date. day: the day part of the desired date. |
|
|
Return the number of time units between the first date value and the second for a specified time unit. Returns a number. |
datePart: the year, quarter, month, week, or day unit to use in the difference calculation. date1: the starting date value to subtract from date2. date2: the ending date value that date1 will be subtracted from. |
|
|
Extract the year, quarter, month, week, or day from a date value. Returns a number. |
datePart: the year, quarter, month, week, or day unit to extract. date: the date value to extract a date part from. |
|
|
Truncate a date value to its year, quarter, month, week, or day. |
datePart: the year, quarter, month, week, or day unit to truncate to. date: the date value to truncate. |
|
|
Create a timestamp value from year, month, day, hour, minute, and second parts. Returns a datetime value. |
year: the year part of the desired date. month: the month part of the desired date. day: the day part of the desired day. |
|
|
Calculate the week number within a year for a date. Returns a number. |
date: the datetime value with which to compute the week number within a year. |
|
|
Returns the current time based on your account's timezone. Returns a datetime value. |
|
|
|
Returns the number of week days (Monday - Friday) between two dates. |
value1: the starting datetime value. value 2: the ending datetime value. |
|