Internal variables in Assette are expressions defined within a Data Block to compute values based on existing data. They serve as intermediate or final outputs that enhance the logic of a data transformation, enabling more flexible, dynamic, and reusable data workflows.
Internal variables can be defined within any Data Block but are most commonly used with Transform Data Blocks. They typically calculate summaries, flags, conditions, or derived metrics without modifying the source data.
Scope of Internal Variables #
Assette supports two types of internal variable scope, each suited to different use cases:
- Row Scope:
Row-level variables are calculated individually for each row in the dataset. These are useful for expressions that depend on the values of a single record (e.g., Revenue > 100000).
- Global Scope:
Global-level variables compute a single value across the entire dataset (e.g., sum, average, count). These are commonly used for aggregations, benchmarks, or global flags.
The scope is defined when setting up the internal variable and determines how the result is computed and used in further logic.
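The difference between the two scopes can be sketched in plain Python. This is only an illustration of the concept, not Assette's implementation: the sample rows, the column names, and the helper functions `row_scope` and `global_scope` are all hypothetical.

```python
# Hypothetical sample data; the column names are illustrative only.
rows = [
    {"Client": "A", "Revenue": 250000},
    {"Client": "B", "Revenue": 80000},
    {"Client": "C", "Revenue": 120000},
]


def row_scope(rows, expression):
    """Evaluate an expression once per row, producing one value per record."""
    return [expression(row) for row in rows]


def global_scope(rows, column, operation):
    """Reduce one column across the whole dataset to a single value."""
    return operation([row[column] for row in rows])


# Row scope: one boolean per record, mirroring the Revenue > 100000 example.
is_large = row_scope(rows, lambda row: row["Revenue"] > 100000)

# Global scope: a single total for the entire dataset.
total_revenue = global_scope(rows, "Revenue", sum)

print(is_large)       # [True, False, True]
print(total_revenue)  # 450000
```

A row-scope variable always yields as many values as there are rows, while a global-scope variable yields exactly one value regardless of dataset size.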
Operations and Type Assignment #
When defining an internal variable, Assette requires you to select an operation (e.g., Sum, Count, Max) and a target column. Based on these selections, the platform automatically assigns the correct variable type. This keeps variable behavior aligned with the data and logic applied, gives predictable results, and removes the need for manual type configuration. See Internal Variable Type Assignment for more details.
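As a rough mental model, automatic type assignment behaves like a lookup keyed on the operation and the target column's type. The sketch below is an assumption made for illustration: the function `infer_variable_type`, the type names, and the specific rules are hypothetical, and the actual mapping is the one documented under Internal Variable Type Assignment.

```python
def infer_variable_type(operation, column_type):
    """Return an assumed result type for an operation applied to a column.

    These rules are illustrative guesses, not Assette's actual mapping.
    """
    if operation == "Count":
        # Counting rows gives a whole number whatever the column holds.
        return "integer"
    if operation == "Average":
        # An average can be fractional even for integer columns.
        return "decimal"
    if operation in ("Sum", "Max", "Min"):
        # These operations are assumed to preserve the column's own type.
        return column_type
    raise ValueError(f"Unsupported operation: {operation}")


print(infer_variable_type("Count", "text"))
print(infer_variable_type("Sum", "decimal"))
```

The point of the sketch is only that the result type follows from the two selections, so nothing has to be configured by hand.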
Use Cases for Internal Variables #
Internal variables are useful in a wide range of data scenarios, including:
- Calculating derived fields such as averages, totals, or ratios
- Setting conditional flags for filtering or formatting
- Creating benchmark comparisons or thresholds
- Preparing values for presentation or downstream use in templates
Variables can be referenced in subsequent expressions, output columns, or even in other internal variables, allowing for modular and maintainable data logic.
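Chaining variables in this way can be pictured with a small Python sketch. The data, the column name `Return`, and the variable names are hypothetical, and the code only mimics the idea of one variable feeding another. It does not use Assette's expression syntax.

```python
# Hypothetical dataset of portfolio returns (in percent).
rows = [
    {"Portfolio": "Growth", "Return": 4.0},
    {"Portfolio": "Income", "Return": 6.0},
    {"Portfolio": "Balanced", "Return": 8.0},
]

# Global-scope variable: a single benchmark for the whole dataset.
average_return = sum(row["Return"] for row in rows) / len(rows)

# Row-scope variable that references the global variable above.
excess_return = [row["Return"] - average_return for row in rows]

# A second row-scope variable built on the first: a conditional flag.
beats_average = [excess > 0 for excess in excess_return]

print(average_return)  # 6.0
print(excess_return)   # [-2.0, 0.0, 2.0]
print(beats_average)   # [False, False, True]
```

Because each step has its own name, changing the benchmark calculation only requires editing `average_return`, and the dependent values follow.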
Best Practices #
- Use descriptive names for internal variables to clarify their purpose.
- Select the appropriate scope based on whether your logic operates on individual rows or the entire dataset.
- Leverage automatic type assignment to reduce configuration errors.
- Avoid unsupported operations such as List, which are no longer available in the interface.
- Test variable output using Preview with Data to confirm accuracy before publishing the Data Block.
Additional Resources #
For more guidance on using internal variables effectively, refer to: