Amazon EMR Presto Connector: Overview

Alation Cloud Service: applies to Alation Cloud Service instances of Alation

Customer Managed: applies to customer-managed instances of Alation

Available from Alation version 2022.2

Overview

The latest OCF connector for Amazon EMR Presto is available for download from the Connector Hub on the Alation Customer Portal. Ask an Alation admin who has access to the Customer Portal to download it from the Connectors section (Customer Portal > Connectors).

Upload the connector file and install it in the Alation application. The connector is packaged together with the required database driver, so you do not need to obtain or install the driver separately.

Use this connector to catalog Amazon EMR Presto as a data source on on-premise and Alation Cloud Service instances of Alation. It extracts and catalogs database objects such as schemas, tables, views, and columns. After extraction, the metadata is represented in the data catalog as a hierarchy of catalog pages under the parent data source. Alation users can then use the full catalog functionality to search for the extracted metadata, curate the corresponding catalog pages, document the data source, and exchange information about it.

Team

You may need the assistance of the following administrators to configure this connector:

  • Amazon EMR Presto administrator:

    • Provides the connection information and the JDBC URI.

    • Provides the authentication information and assists in configuring the authentication.

    • Provides the SSL certificate.

    • Assists in configuring Kerberos authentication.

    • Provides access to the schema for metadata extraction.

  • Alation Server administrator:

    • Ensures that Alation Connector Manager is installed and running, or installs it if it is not.

    • Installs the connector.

    • Creates and configures the Amazon EMR Presto data source in the catalog.

    • Performs initial extraction and prepares the data source for Alation users.
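As an illustration of the connection information mentioned in the list above, the following is a minimal sketch of how a Presto JDBC URI could be assembled and sanity-checked before it is handed over. It assumes the standard Presto JDBC form `jdbc:presto://host:port/catalog/schema`; the exact form that the Alation data source settings expect may differ, so confirm it with the connector configuration instructions. The function names, the host name, and the port value are hypothetical and used for illustration only.

```python
from urllib.parse import urlparse


def build_presto_jdbc_uri(host: str, port: int, catalog: str, schema: str = "") -> str:
    """Assemble a URI in the standard Presto JDBC form.

    Assumption: jdbc:presto://host:port/catalog[/schema]. The form required
    by the Alation data source settings may differ from this one.
    """
    if not host or not catalog:
        raise ValueError("host and catalog are required")
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    path = f"/{catalog}" + (f"/{schema}" if schema else "")
    return f"jdbc:presto://{host}:{port}{path}"


def parse_presto_jdbc_uri(uri: str) -> dict:
    """Split a Presto JDBC URI back into its parts for a quick sanity check."""
    prefix = "jdbc:"
    if not uri.startswith(prefix + "presto://"):
        raise ValueError("expected a URI starting with jdbc:presto://")
    # Drop the "jdbc:" prefix so urlparse sees an ordinary scheme://host:port/path URI.
    parsed = urlparse(uri[len(prefix):])
    parts = [p for p in parsed.path.split("/") if p]
    return {
        "host": parsed.hostname,
        "port": parsed.port,
        "catalog": parts[0] if parts else None,
        "schema": parts[1] if len(parts) > 1 else None,
    }


if __name__ == "__main__":
    # Hypothetical values: replace with the details provided by your
    # Amazon EMR Presto administrator.
    uri = build_presto_jdbc_uri("emr-master.example.internal", 8889, "hive", "default")
    print(uri)
    print(parse_presto_jdbc_uri(uri))
```

Parsing the URI back into its parts is a cheap way to catch a missing catalog or a mistyped port before the value is entered in the data source settings.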

Scope

The table below describes which metadata objects are extracted by the connector and which catalog functionality is supported.

| Feature | Scope | Availability |
|---|---|---|
| Authentication | | |
| Basic | Authentication with username, password, and security token | Yes |
| AWS IAM user | Authentication with an IAM user access key and secret | No |
| AWS IAM role | Authentication with an STS token for an AWS IAM role | No |
| SSL | Connection over the TLS protocol | Yes |
| Kerberos | Authentication with the Kerberos protocol | Yes |
| Keytab | Support for Kerberos with keytabs | Yes |
| LDAP | Authentication with a database service account that is an LDAP account in an organization’s network | Yes |
| Metadata Extraction (MDE) | | |
| Default MDE | Extraction of metadata based on the JDBC driver methods in the connector code | Yes |
| Custom query-based MDE | Extraction of metadata based on extraction queries provided by a user | No |
| Extracted metadata objects | | |
| Data Source | Data source object in Alation that is parent to extracted metadata | Yes |
| Schemas | List of databases | Yes |
| Tables | List of tables | Yes |
| Columns | List of columns | Yes |
| Column data types | Column data types | Yes |
| Views | List of views | Yes |
| Source comments | Source comments | No |
| Primary keys | Primary key information for extracted tables | No |
| Foreign keys | Foreign key information for extracted tables | No |
| Functions | Extraction of function metadata | No |
| Function definitions | Extraction of function definition metadata | No |
| Sampling and Profiling | | |
| Table sampling | Retrieval of data samples from extracted tables | Yes |
| Column sampling | Retrieval of data samples from extracted columns | Yes |
| Deep column profiling | On-demand profiling of specific columns with the calculation of value distribution stats | Yes |
| Dynamic profiling | On-demand table and column profiling by individual users who use their own database accounts to retrieve the profiles | Yes |
| Custom query-based table sampling | Ability to use custom queries for sampling specific tables | Yes |
| Custom query-based column profiling | Ability to use custom queries for profiling specific columns | Yes |
| Query Log Ingestion (QLI) | | |
| Table-based QLI | Ingestion of query history based on a table that contains query history data | No |
| Query-based QLI | Ingestion of query history based on a custom query history extraction query | Yes |
| JOINs and filters | Calculation of JOIN and filter information based on ingested query history | Yes |
| Predicates | Ability to parse predicates in ingested queries | Yes |
| Lineage | | |
| Automatic lineage generation | Auto-calculation of lineage based on query history ingested from QLI, MDE, and Compose queries | Yes |
| Direct lineage | Extraction of lineage from system tables during MDE | No |
| Column-level lineage | Extraction of lineages on the column level | No |
| Compose | | |
| On-premise instances | Availability of Compose on on-premise instances of Alation | Yes |
| Alation Cloud Service instances | Depending on your network configuration, you may be using Alation Agent to connect to your data source | Yes |
| Basic authentication in Compose | Authentication in Compose with username and password | Yes |
| SSO authentication in Compose | Authentication in Compose with SSO credentials | No |