Assette’s powerful data capabilities allow users to connect to a wide range of source systems to retrieve and transform data for reporting, client communications, and more. Whether your organization uses a relational database such as Snowflake, a NoSQL solution like MongoDB, an API-based data provider, or plain old Excel files, Assette provides a structured approach to connecting to external systems.
This guide outlines the fundamental steps involved in connecting to source data from any system and making it available within Assette. Assette also stores as little data as possible, which keeps your organization’s data up to date and keeps you in control of it.
Understanding the Data Extraction Process #
At its core, data extraction in Assette follows a structured pipeline:
- Establish a secure connection to the source system using appropriate authentication mechanisms.
- Configure connection settings to define the source location, credentials, and any required parameters.
- Define the query or extraction logic that retrieves the required data from the source system.
- Map fields from the source system to Assette’s expected structure; Assette aims to keep the mapping required to a minimum.
- Set up parameters and dependencies to ensure dynamic and efficient data retrieval.
- Validate and preview the data to ensure accuracy before making it available for reporting.
Regardless of the technology used for data storage, these fundamental principles apply to all source system integrations.
1. Establishing a Secure Connection #
Before extracting data, users must establish a secure connection between Assette and the source system. The type of connection depends on the system in use: for databases like Snowflake or SQL Server, it typically relies on username/password or certificate-based authentication, while a NoSQL database like MongoDB may require API tokens or private keys.
Assette supports various authentication mechanisms, allowing organizations to securely retrieve data while complying with internal security policies. To streamline authentication, credentials can be stored in a secure Certificate Data Block, ensuring that they are consistently referenced across multiple data extractions without exposing sensitive information.
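To make the pattern concrete, the sketch below shows the idea of storing credentials once under a referenceable name. It is a hypothetical illustration in the same dictionary style as the extraction example later in this guide; the field names are assumptions, not Assette’s exact Certificate Data Block schema.

```python
# Hypothetical sketch only: credentials stored once under a name that
# other Data Blocks can reference. Field names are illustrative
# assumptions, not Assette's actual Certificate Data Block schema.
{
    "type": "certificate",
    "name": "SnowflakeServiceAccount",
    "authentication": "username_password",  # could also be key- or certificate-based
    # Secret values are held securely by the platform and resolved at run
    # time; they are never written into extraction definitions.
}
```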
2. Configuring Source System Settings #
Once authentication is in place, users must define connection settings for the source system. These settings typically include:
- Database Name or API Endpoint – Specifies where the data resides.
- Schema or Collection Details – Defines the specific database schema, table, or collection that will be accessed.
- Connection Credentials – Securely stored authentication details required to access the data.
- Query Execution Environment – Specifies parameters such as the virtual warehouse to use (for Snowflake) or query timeout settings.
By centralizing these settings in a Settings Data Block, users can ensure consistency across multiple data extraction processes.
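As a hypothetical illustration of how those settings might be grouped (the key names are assumptions, not Assette’s exact schema), a Settings Data Block for a Snowflake source could look like:

```python
# Hypothetical sketch of a Settings Data Block for a Snowflake source;
# key names are illustrative assumptions.
{
    "type": "settings",
    "database": "ANALYTICS",                   # database name or API endpoint
    "schema": "PERFORMANCE",                   # schema or collection details
    "certificate": "SnowflakeServiceAccount",  # stored connection credentials
    "warehouse": "REPORTING_WH",               # query execution environment
    "query_timeout_seconds": 120,
}
```

Other extraction definitions can then point at this block by name, so a change to the source environment only has to be made in one place.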
3. Defining the Data Extraction Logic #
Once the connection is established, the next step is to define how data will be retrieved. The extraction logic depends on the source system. To simplify the data extraction process, all Data Blocks use Python definitions, regardless of the language or structure of the target data source. For example, to extract performance-related data from a SQL-based database such as Snowflake, we could write the definition in Python as:
```python
{
    "type": "table",
    "sql": """
        SELECT
            PORTFOLIOCODE as "accountcode",
            HISTORYDATE as "historydate",
            CURRENCYCODE as "currencycode",
            CURRENCY as "currency",
            PERFORMANCECATEGORY as "performancecategory",
            PERFORMANCECATEGORYNAME as "performancecategoryname",
            PERFORMANCETYPE as "performancetype",
            PERFORMANCEINCEPTIONDATE as "performanceinceptiondate",
            PORTFOLIOINCEPTIONDATE as "accountinceptiondate",
            PERFORMANCEFREQUENCY as "performancefrequency",
            PERFORMANCEFACTOR as "performancefactor"
        FROM PORTFOLIOPERFORMANCE
        WHERE HISTORYDATE = ?
    """,
    "parameters": ["AsofDate"]
}
```
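For readers less familiar with parameter binding, the definition above expresses roughly what the sketch below does outside Assette with the snowflake-connector-python package. The connection values are placeholders (within Assette they come from the Settings and Certificate Data Blocks instead), and the SQL is abbreviated for brevity.

```python
import snowflake.connector

# Match the "?" placeholder style used in the Data Block's SQL.
snowflake.connector.paramstyle = "qmark"

conn = snowflake.connector.connect(
    account="your_account",    # placeholder
    user="your_user",          # placeholder
    password="your_password",  # placeholder
    warehouse="REPORTING_WH",  # placeholder
)
try:
    cur = conn.cursor()
    # The "AsofDate" parameter binds positionally to the single "?".
    cur.execute(
        'SELECT PORTFOLIOCODE as "accountcode", PERFORMANCEFACTOR as "performancefactor" '
        "FROM PORTFOLIOPERFORMANCE WHERE HISTORYDATE = ?",
        ("2024-12-31",),
    )
    rows = cur.fetchall()
finally:
    conn.close()
```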
4. Setting Up Parameters and Dependencies #
To make the data extraction dynamic, organizations often need to introduce parameters that allow users to filter data based on different conditions. A common example is the “As of Date” parameter, which retrieves data for a specific historical date.
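As an illustration, the earlier definition could be extended with a second filter. The second parameter name, Frequency, is hypothetical; each listed parameter binds in order to a ? placeholder in the SQL.

```python
# Hypothetical extension of the earlier Data Block: two parameters bind
# positionally to the two "?" placeholders ("Frequency" is illustrative).
{
    "type": "table",
    "sql": """
        SELECT
            PORTFOLIOCODE as "accountcode",
            PERFORMANCEFACTOR as "performancefactor"
        FROM PORTFOLIOPERFORMANCE
        WHERE HISTORYDATE = ? AND PERFORMANCEFREQUENCY = ?
    """,
    "parameters": ["AsofDate", "Frequency"],
}
```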
5. Saving and Publishing the Data Block #
Once the extraction logic and dependencies are in place, users should preview the data to ensure correctness. This involves running a test query and reviewing sample data to confirm that:
- The extracted records match expectations.
- Field mappings correctly align with Assette’s reporting structure.
- Filters and parameters function as intended.
After successful validation, the Data Block can be saved and published, making it available for integration with Data Objects, reports, and client presentations.
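As a rough local analogue of those checks (outside Assette’s own preview tooling), assuming the sample records have been loaded into a pandas DataFrame, the validation might be sketched as:

```python
import pandas as pd

# Illustrative placeholder rows standing in for the preview output.
df = pd.DataFrame(
    [("ACCT001", "2024-12-31", 0.0123)],
    columns=["accountcode", "historydate", "performancefactor"],
)

# 1. The extracted records match expectations: at least one row returned.
assert not df.empty, "preview returned no rows"

# 2. Field mappings align with the reporting structure: expected columns exist.
expected = {"accountcode", "historydate", "performancefactor"}
assert expected.issubset(df.columns), f"missing columns: {expected - set(df.columns)}"

# 3. Filters and parameters work as intended: every row matches the AsofDate.
assert (df["historydate"] == "2024-12-31").all(), "rows outside the requested date"
```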