Azure Power BI Scanner OCF Connector: Dataflows

Alation Cloud Service: Applies to Alation Cloud Service instances of Alation

Customer Managed: Applies to customer-managed instances of Alation

Applies from version 2023.3.3

Overview

Power BI dataflows are part of Power BI’s metadata and contain reusable transformation logic that multiple datasets and reports within Power BI can share. Alation extracts Power BI dataflows automatically during metadata extraction, alongside other metadata, and represents them as objects under the Power BI source.

Note

The Alation catalog object type dataflow, while similarly named, is distinct and separate from Power BI dataflows:

  • Power BI dataflows are part of the Power BI source’s metadata. To understand the mapping between Power BI metadata and Alation object types, refer to Power BI Object Hierarchy.

  • The Alation object type dataflow documents source and target objects for a lineage path. For more information about Alation’s dataflow object type, refer to Dataflow Objects.

Catalog Representation and Navigation

Power BI dataflows are represented in the catalog as BI datasource objects of type dataflow. Power BI dataflows use the BI datasource object catalog template.

All Power BI dataflows under a workspace can be located under the DataSources tab of the workspace catalog page.

../../../_images/OCF_PBI_Dataflow_All_Dataflows.png

Each Power BI dataflow has a dedicated catalog page, identifiable by its type in the Properties section. You can differentiate between a dataflow and a dataset by viewing the Type value.

If dataflow field metadata is available, it is represented as BI datasource column objects and can be located under the Fields tab.

../../../_images/OCF_PBI_Dataflow_Fields.png

Alation extracts both dataflows with direct database connections and dataflows derived from other dataflows. How a dataflow gets its data determines how it is presented in the catalog:

  • Directly connected dataflows display database connection details. Such a dataflow has fields derived from a database table or view and will have the connection information on the Connections tab.

    ../../../_images/OCF_PBI_Dataflow_Connections.png

  • Dataflows sourced from another dataflow lack direct database connections. Such dataflows list their source dataflows on the Overview tab in the DataSources table. The maximum supported depth is 32.

  • “Hybrid” dataflows combine fields that come directly from a database with fields that come from another dataflow. On the catalog page, such dataflows have both the connection information under the Connections tab and the source dataflow information under the Overview tab. For example, the screenshot below shows the Overview tab for a “hybrid” dataflow. The source dataflow is listed in the DataSources table, and the Connections tab label shows that the dataflow also has one connection to the database.

    ../../../_images/OCF_PBI_HybridDataflow_Datasource.png

    The next screenshot illustrates the Connections tab of the same dataflow displaying the database connection information.

    ../../../_images/OCF_PBI_HybridDataflow_Connections.png
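The three cases above can be summarized with a short Python sketch. It is illustrative only: the `Dataflow` class, its field names, and the `classify` and `tabs_with_source_info` helpers are hypothetical and are not part of Alation or the connector. The sketch only assumes that a dataflow can have zero or more direct database connections and zero or more source dataflows.

```python
from dataclasses import dataclass, field


@dataclass
class Dataflow:
    """Hypothetical stand-in for an extracted Power BI dataflow (not an Alation type)."""
    name: str
    connections: list[str] = field(default_factory=list)       # direct database connections
    source_dataflows: list[str] = field(default_factory=list)  # dataflows it is derived from


def classify(dataflow: Dataflow) -> str:
    """Return 'direct', 'derived', 'hybrid', or 'unknown' for a dataflow."""
    has_db = bool(dataflow.connections)
    has_upstream = bool(dataflow.source_dataflows)
    if has_db and has_upstream:
        return "hybrid"
    if has_db:
        return "direct"
    if has_upstream:
        return "derived"
    return "unknown"


def tabs_with_source_info(dataflow: Dataflow) -> list[str]:
    """List the catalog page tabs where source information would be expected."""
    tabs = []
    if dataflow.connections:
        tabs.append("Connections")  # database connection details
    if dataflow.source_dataflows:
        tabs.append("Overview")     # DataSources table with source dataflows
    return tabs


# Made-up example objects
direct = Dataflow("orders_raw", connections=["mysql://sales"])
derived = Dataflow("orders_clean", source_dataflows=["orders_raw"])
hybrid = Dataflow("orders_enriched",
                  connections=["mysql://crm"],
                  source_dataflows=["orders_clean"])

for df in (direct, derived, hybrid):
    print(df.name, classify(df), tabs_with_source_info(df))
```

In this model, a hybrid dataflow is simply one where both lists are non-empty, which matches the catalog behavior described above: both the Connections tab and the DataSources table on the Overview tab carry information.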

Relationships to Datasets

Alation extracts dataflows with varying levels of relationships to datasets:

  • A Power BI dataset can be connected to a Power BI dataflow, with no direct connection to the database. In Alation, such dataflows are represented as data sources for a dataset. The dataflow will be accessible from the Overview tab of the corresponding dataset object under DataSources.

    ../../../_images/OCF_PBI_Dataflow_Child_Dataset.png

  • A Power BI dataset may include fields sourced from Power BI dataflows as well as direct connections to database tables or views, integrating data from both the datasource and dataflows. For these datasets, information on direct connections can be found under the Connections tab and also through the dataflow catalog page.

    For example, the screenshot below shows the Overview tab of the catalog page of a Power BI dataset object. You can identify it by looking at the value of the Type field under Properties on the right. The dataset lists the source dataflow information in the DataSources table. At the same time, the Connections tab shows that one direct database connection is also available.

    ../../../_images/OCF_PBI_Dataflow_Hybrid_Dataset.png

View Power BI Dataflows on Lineage Diagrams

Downstream Lineage for Power BI Dataflows

Lineage diagrams on the Lineage tab show downstream lineage pathways from dataflows to datasets, reports, and dashboards, provided these elements are extracted.

For example, the screenshot below shows a Power BI dataflow object’s catalog page. The Lineage tab reveals the downstream paths (2) leading to datasets and analytical objects from the dataflow object (1,3).

../../../_images/OCF_PBI_Dataflow_Downstream.png
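Conceptually, a downstream lineage path is a walk over a directed graph that starts at the dataflow and follows edges to the objects that consume it. The Python sketch below illustrates that idea with a breadth-first traversal. It does not call any Alation API: the `LINEAGE` dictionary, the object names in it, and the `downstream` function are made up for illustration.

```python
from collections import deque

# Hypothetical lineage edges: each key feeds the objects in its list.
LINEAGE = {
    "dataflow:sales": ["dataset:sales_model"],
    "dataset:sales_model": ["report:monthly", "report:regional"],
    "report:monthly": ["dashboard:exec"],
}


def downstream(edges: dict[str, list[str]], start: str) -> list[str]:
    """Return every object reachable downstream of `start`, in breadth-first order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in edges.get(node, []):
            if child not in seen:  # guard against revisiting shared objects
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


print(downstream(LINEAGE, "dataflow:sales"))
# ['dataset:sales_model', 'report:monthly', 'report:regional', 'dashboard:exec']
```

An object appears in the result only if it has an edge in the graph, which mirrors the condition above that datasets, reports, and dashboards show up on the diagram only when they have been extracted.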

Upstream Lineage for Power BI Dataflows

To trace the upstream lineage from a Power BI dataflow to tables and views in relevant data sources, you’ll need to configure cross-data-source lineage for the data sources. Find more information about this configuration in BI Connection Info.

For example, the screenshot below illustrates upstream lineage from a Power BI dataflow (1) to tables within a MySQL data source (2), helping you discover the source of the upstream data.

../../../_images/OCF_PBI_Dataflow_Upstream.png

Power BI Dataflows in Lineage Analysis Reports

As with other objects on lineage diagrams, you can view lineage analysis reports for Power BI dataflows. To view a report, select a Power BI dataflow on a lineage diagram and click View Impact Analysis or View Upstream Audit at the top right of the diagram. Use the filtering capabilities of the lineage analysis reports to view the downstream impact of the dataflow or its upstream origin.

../../../_images/OCF_PBI_Dataflow_LineageImpact.png