Developer Guide

This guide supports developers wishing to customize Egeria to run in additional environments, exchange metadata with additional third party technologies and/or augment existing tools and utilities.

It is organized as follows:

Working with the platform APIs

Using the clients

The Egeria clients wrap calls to Egeria's REST APIs and topics. The aim is to provide a language-specific interface that manages the marshalling and de-marshalling of the call parameters and responses to these services.

Using the REST APIs

Egeria supports REST APIs for making synchronous (request-response) calls between OMAG Servers and between clients and OMAG Servers.

REST APIs are intended for internal use

The REST APIs are usable directly for calling from non-Java platforms; however, they are designed for the internal use of Egeria and are not guaranteed to be backwards compatible.

The structure of the URL for an Egeria REST API varies slightly depending on whether it is a call to an OMAG Server Platform service or an OMAG Server service.
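
The difference between the two URL shapes can be sketched as follows. This is a minimal illustration only: the helper class, its method names and the path segments shown (platform-services, servers/{serverName}, users/{userId}) are assumptions made for this example and may not match the release you are running, so check the REST API documentation for your version before relying on them.

```java
/** Hypothetical helper: builds the two URL shapes described above. */
final class EgeriaUrlSketch {
    private EgeriaUrlSketch() {}

    /** Platform service call: no server name appears in the path (assumed layout). */
    static String platformServiceUrl(String platformUrlRoot, String userId, String operation) {
        return stripTrailingSlash(platformUrlRoot)
                + "/open-metadata/platform-services/users/" + userId + "/" + operation;
    }

    /** Server service call: the path names the OMAG Server that handles the request (assumed layout). */
    static String serverServiceUrl(String platformUrlRoot, String serverName,
                                   String servicePath, String userId, String operation) {
        return stripTrailingSlash(platformUrlRoot)
                + "/servers/" + serverName
                + "/open-metadata/" + servicePath
                + "/users/" + userId + "/" + operation;
    }

    private static String stripTrailingSlash(String root) {
        return root.endsWith("/") ? root.substring(0, root.length() - 1) : root;
    }
}
```

Keeping the URL assembly in one place means that a change to the URL layout between releases only needs to be absorbed in a single class.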

Using connectors

Connectors can be created through the following clients:

Example: connecting to CSV files using Asset Consumer OMAS

The code sample below uses the Asset Consumer OMAS client to retrieve a list of assets from a metadata access server and then create a connector to each one using the getConnectorForAsset() method.

This method assumes that there is a connection object with a connector type and endpoint linked to the requested asset in the metadata repository.

An asset with a connection

An exception is thrown if an asset does not have a connection.

In the sample, the connector returned by the Asset Consumer OMAS client is then cast to the CSVFileStoreConnector. Assets that are not CSV files will have a different connector implementation and so the cast to CSVFileStoreConnector also results in an exception.

Assets that do not have a CSVFileStoreConnector are ignored. The result is that the sample method returns a connector for the first CSV file asset retrieved from the metadata repository.

/**
 * This method uses Asset Consumer OMAS to locate and create an Open Connector Framework (OCF) connector
 * instance.
 *
 * @return connector to first CSVFile located in the catalog
 */
private CSVFileStoreConnector getConnectorUsingMetadata()
{
    try
    {
        /*
         * The Asset Consumer OMAS supports a REST API to extract metadata from the open metadata repositories
         * linked to the same open metadata cohort as the Asset Consumer OMAS.  It also has a Java client that
         * provides an equivalent interface to the REST API plus connector factory methods supported by an
         * embedded Connector Broker.  The Connector Broker is an Open Connector Framework (OCF) component
         * that is able to create and configure instances of compliant connectors.  It is passed a Connection
         * object which has all of the properties needed to create the connector.  The Asset Consumer OMAS
         * extracts the Connection object from the open metadata repositories and then calls the Connector Broker.
         */
        AssetConsumer client = new AssetConsumer(serverName, serverURLRoot);

        /*
         * This call extracts the list of assets stored in the open metadata repositories that have a name
         * that matches the requested filename.
         */
        List<String>   knownAssets = client.findAssets(clientUserId, ".*", 0, 4);

        if (knownAssets != null)
        {
            System.out.println("The open metadata repositories have returned " + knownAssets.size() + " asset definitions for the requested file name " + fileName);

            for (String assetGUID : knownAssets)
            {
                if (assetGUID != null)
                {
                    try
                    {
                        /*
                         * The aim is to return a connector for the first matching asset.  If an asset of a different
                         * type is returned, or one for which it is not possible to create a connector, then an
                         * exception is thrown and the code moves on to process the next asset.
                         */
                        return (CSVFileStoreConnector) client.getConnectorForAsset(clientUserId, assetGUID);
                    }
                    catch (Exception error)
                    {
                        System.out.println("Unable to create connector for asset: " + assetGUID);
                    }
                }
            }
        }
        else
        {
            System.out.println("The open metadata repositories do not have an asset definition for the requested file name " + fileName);
        }
    }
    catch (Exception error)
    {
        System.out.println("The connector can not be created from metadata.  Error message is: " + error.getMessage());
    }

    return null;
}

Connecting to assets with different levels of security

It is possible that an asset can have multiple connections, each with different levels of security access encoded. Egeria is able to determine which one to use by calling the validateUserForAssetConnectionList() method of the Server Security Metadata Connector.

Multiple connections for an asset

Open metadata is a connected network (graph) of information. The connector type and endpoint that a connection object links to are typically shared with many connections. Navigating these shared links yields some useful insights.

For example, there is typically one connector type for each connector implementation. By retrieving the relationships from the connector type to the connections, it is possible to see the extent to which the connector is used.

Connector Types

Uses of a connector implementation

The connector types for Egeria's data store connectors are available in an open metadata archive called DataStoreConnectorTypes.json that can be loaded into the server. This approach can be used for all of your connector implementations to create the connector type objects in your metadata repository. See the open-connector-archives for more detail.

Connector categories

By default, connector implementations are assumed to support the OCF. However, many vendor platforms have their own connector frameworks. The ConnectorCategory allows equivalent connector types from different connector frameworks to be gathered together so that the connector type from a connection can be swapped for an equivalent connector type for the locally supported connector framework.

Connector Categories

Endpoints

The endpoints are typically linked to the software server that is called by the connector. By navigating from the Endpoint to the linked connections it is possible to trace the callers to the software server.

Connections to a software server

Software servers and endpoints are set up through the IT Infrastructure OMAS.

Further information

The connector catalog lists the connectors provided by the Egeria community.

Building connectors

Connectors are plug-in Java clients that either perform an additional service, or, more typically, enable Egeria to integrate with a third party technology.

The concept of a connector comes from the Open Connector Framework (OCF). The OCF provides a common framework for components that enable one technology to call another, arbitrary technology through a common interface. The implementation of the connector is dynamically loaded based on the connector's configuration.

Configuration

The configuration for a connector is managed in a connection object.

A connection contains properties about the specific use of the connector, such as user Id and password, or parameters that control the scope or resources that should be made available to the connector. It links to an optional endpoint and a mandatory connector type object.

  • ConnectorType describes the type of the connector, its supported configuration properties and its factory object (called the connector's provider). This information is used to create an instance of the connector at runtime.
  • Endpoint describes the server endpoint where the third party data source or service is accessed from.

Connector types and endpoints can be reused in multiple connections.
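
The shape of a connection can be sketched with a few plain Java classes. These are stand-ins written for this illustration (SketchConnection, SketchConnectorType and SketchEndpoint are hypothetical names, not the OCF beans); they only show that the connector type is mandatory, the endpoint is optional, and both can be shared between connections.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Stand-in for a connector type: names the provider (factory) class for the connector. */
final class SketchConnectorType {
    final String connectorProviderClassName;

    SketchConnectorType(String connectorProviderClassName) {
        this.connectorProviderClassName = Objects.requireNonNull(connectorProviderClassName);
    }
}

/** Stand-in for an endpoint: where the third party data source or service is reached. */
final class SketchEndpoint {
    final String address;

    SketchEndpoint(String address) {
        this.address = address;
    }
}

/** Stand-in for a connection: properties for one specific use of a connector. */
final class SketchConnection {
    final String userId;
    final Map<String, Object> configurationProperties;
    final SketchConnectorType connectorType;   // mandatory
    final SketchEndpoint endpoint;             // optional, may be null

    SketchConnection(String userId, Map<String, Object> configurationProperties,
                     SketchConnectorType connectorType, SketchEndpoint endpoint) {
        this.userId = userId;
        if (configurationProperties == null) {
            this.configurationProperties = Collections.emptyMap();
        } else {
            // Defensive copy so later changes by the caller do not alter the connection.
            this.configurationProperties =
                    Collections.unmodifiableMap(new HashMap<String, Object>(configurationProperties));
        }
        this.connectorType = Objects.requireNonNull(connectorType, "connectorType is mandatory");
        this.endpoint = endpoint;
    }
}
```

Two connections for different users can point at the same SketchConnectorType and SketchEndpoint instances, which mirrors the reuse described above.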

Structure of a connection object

Factories

Each connector implementation has a factory object called a connector provider. The connector provider has two types of methods:

  • Return a new instance of the connector based on the properties in a supplied Connection object. The Connection object has all the properties needed to create and configure the instance of the connector.
  • Return additional information about the connector's behavior and usage to make it easier to consume. For example, the standard base class for a connector provider has a method to return the ConnectorType object for this connector implementation that can be added to a Connection object used to hold the properties needed to create an instance of the connector.
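
The two kinds of provider method can be illustrated with a small, self-contained sketch. All of the types below (DemoConnector, DemoFileConnector, DemoConnectorProvider) are hypothetical stand-ins and the connection is simplified to a Map; the real OCF classes have richer interfaces. The sketch assumes the connector class has a no-argument constructor so that it can be created by class name.

```java
import java.util.HashMap;
import java.util.Map;

/** Stand-in for the connector interface; not the real OCF Connector. */
interface DemoConnector {
    void initialize(Map<String, Object> connection);
    String target();
}

/** A trivial connector implementation that the provider below creates by class name. */
class DemoFileConnector implements DemoConnector {
    private String fileName;

    @Override
    public void initialize(Map<String, Object> connection) {
        this.fileName = String.valueOf(connection.get("fileName"));
    }

    @Override
    public String target() {
        return fileName;
    }
}

/** Stand-in connector provider showing the two kinds of method described above. */
class DemoConnectorProvider {
    private final String connectorClassName;

    DemoConnectorProvider(String connectorClassName) {
        this.connectorClassName = connectorClassName;
    }

    /** Kind 1: create and configure a new connector instance from the supplied connection. */
    DemoConnector getConnector(Map<String, Object> connection) {
        try {
            Object instance = Class.forName(connectorClassName).getDeclaredConstructor().newInstance();
            DemoConnector connector = (DemoConnector) instance;
            connector.initialize(connection);
            return connector;
        } catch (ReflectiveOperationException error) {
            throw new IllegalStateException("Unable to create connector " + connectorClassName, error);
        }
    }

    /** Kind 2: describe the connector so that callers can build a connection for it. */
    Map<String, Object> getConnectorType() {
        Map<String, Object> connectorType = new HashMap<>();
        connectorType.put("connectorProviderClassName", DemoConnectorProvider.class.getName());
        connectorType.put("connectorClassName", connectorClassName);
        return connectorType;
    }
}
```

Loading the connector by class name is what lets the caller stay independent of any particular connector implementation at compile time.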

Lifecycle of the connector

Each connector has its own unique implementation that is structured around a simple lifecycle that is defined by the OCF. The OCF provides the interface for a connector called Connector that has three methods: initialize, start and disconnect.

This connector interface supports the basic lifecycle of a connector. There are three phases:

  1. Initialization - During this phase, the connector is passed the context in which it is to operate. It should store this information.

    This phase is initiated by a call to the connector's initialize() method, which is called after the connector's constructor and provides the connector with a unique instance identifier (for logging) and its configuration stored in a connection. After initialize() returns, there may be other calls to pass context to the connector. For example, if the connector implements the AuditLoggingComponent interface, an audit log is passed to the connector.

  2. Running - The connector is completely initialized with its context, and it can start processing.

    This phase is initiated by a call to the connector's start() method. At this point it should create its client to any third party technology and begin processing. It may also start up threads if it needs to perform any background processing (such as listening for notifications). If the connector throws an exception during start, Egeria knows the connector has a configuration or operational issue and will report the error and move it to disconnected state.

  3. Disconnected - The connector must stop processing and release all of its resources.

    This phase is initiated by a call to the connector's disconnect() method.

Depending on the type of connector you are writing, there may be additional initialization calls occurring between the initialize() and the start() method. The connector may also support additional methods for its normal operation that can be called between the start() and disconnect() calls.

The OCF also provides the base class for a connector called ConnectorBase. The ConnectorBase base class manages the lifecycle state of the connector. For example, the default implementation of initialize() in the ConnectorBase class stores the supplied unique instance identifier and connection values in protected variables called connectorInstanceId and connectionProperties respectively.

Call the base class's methods in any overrides

If you override any of the initialize(), start() or disconnect() methods, be sure to call super.xxx() at the start of your implementation to call the appropriate super class method so that the state is properly maintained.
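
The lifecycle and the rule about calling the super class can be sketched as follows. LifecycleConnectorBase is a simplified stand-in written for this example, not the real ConnectorBase: the connection is reduced to a String and the state to a single flag. The point is only the ordering of the super calls in the overrides.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-in for a connector base class that tracks lifecycle state. */
abstract class LifecycleConnectorBase {
    protected String connectorInstanceId;
    protected String connectionProperties;   // reduced to a String in this sketch
    private boolean active = false;

    public void initialize(String connectorInstanceId, String connectionProperties) {
        this.connectorInstanceId = connectorInstanceId;
        this.connectionProperties = connectionProperties;
    }

    public void start() {
        if (connectorInstanceId == null) {
            throw new IllegalStateException("start() called before initialize()");
        }
        active = true;
    }

    public void disconnect() {
        active = false;
    }

    public boolean isActive() {
        return active;
    }
}

/** Hypothetical connector that overrides start() and disconnect(). */
class LoggingConnector extends LifecycleConnectorBase {
    final List<String> log = new ArrayList<>();

    @Override
    public void start() {
        super.start();   // let the base class validate and record the state change first
        log.add("opened client for " + connectionProperties);
    }

    @Override
    public void disconnect() {
        super.disconnect();   // base class state is updated before resources are released
        log.add("released client");
    }
}
```

Because super.start() runs first, a connector that was never initialized fails before it tries to open any client, which keeps the failure close to its cause.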

To write an open connector you need to complete four steps:

  1. Identify the properties for the connection.
  2. Write the connector provider.
  3. Understand the interface that the connector needs to implement and the support provided by its base class.
  4. Write the connector itself.

All the code you write to implement these should exist in its own module, and as illustrated by the examples could even be in its own independent code repository. Their implementation will have dependencies on Egeria's:

No dependency on Egeria's OMAG Server Platform

Note that there is no dependency on Egeria's OMAG Server Platform for these specific connector implementations: they could run in another runtime that supported the connector APIs. In fact, even the Egeria interface modules should not be embedded in your jar file. Leaving them out allows your connector to run on any version of the OMAG Server Platform that supports it.

Identify connection properties

Begin by identifying and designing the properties needed to connect to your tool or system. These will commonly include a network address, protocol, and user credentials, but could also include other information that can be stored in the configuration properties of the connection.

Code the connector provider

Example: connector provider for IBM DataStage

For example, the DataStageConnectorProvider is used to instantiate connectors to IBM DataStage data processing engines. Therefore, its name and description refer to DataStage, and the connectors it instantiates are DataStageConnectors.

Example: connector provider for IBM Information Governance Catalog

Similarly, the IGCOMRSRepositoryConnectorProvider is used to instantiate connectors to IBM Information Governance Catalog (IGC) metadata repositories. In contrast to the DataStageConnectorProvider, the IGCOMRSRepositoryConnectorProvider's name and description refer to IGC, and the connectors it instantiates are IGCOMRSRepositoryConnectors.

Connectors implement Egeria interfaces, not vice versa

Note that the code of all of these connector implementations exists outside Egeria itself (in separate code repositories), and there are no direct dependencies within Egeria on these external repositories or connectors.

All connectors can be configured with the network address and credential information needed to access the underlying tool or system. Therefore, you do not need to explicitly list properties for such basic details. However, the names of any additional configuration properties that may be useful to a specific type of connector can be described through the recognizedConfigurationProperties of the connector type.

Implementation pattern

From the two examples (DataStageConnectorProvider and IGCOMRSRepositoryConnectorProvider), you will see that writing a connector provider follows a simple pattern:

  • Extend a connector provider base class specific to your connector's interface.
  • Define static final class members for the GUID, name, description and the names of any additional configuration properties.
  • Write a single public constructor, with no parameters, that:
    • Calls super.setConnectorClassName() with the name of your connector class.
    • Creates a new ConnectorType object, sets its characteristics to the static final class members, and uses .setConnectorProviderClassName() to set the name of the connector provider class itself.
    • (Optional) Creates a list of additional configuration properties from the static final class members, and uses .setRecognizedConfigurationProperties() to add these to the connector type.
    • Sets super.connectorTypeBean = connectorType.
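
The pattern above can be sketched in a few lines. To keep the example self-contained, SketchConnectorTypeBean and SketchConnectorProviderBase are minimal stand-ins for the real bean and base class, and the WeatherFeed names, the GUID and the pageSize property are all hypothetical values chosen for illustration.

```java
import java.util.Arrays;
import java.util.List;

/** Minimal stand-in for the ConnectorType bean; only the fields used in this sketch. */
class SketchConnectorTypeBean {
    private String guid;
    private String displayName;
    private String description;
    private String connectorProviderClassName;
    private List<String> recognizedConfigurationProperties;

    void setGUID(String guid) { this.guid = guid; }
    void setDisplayName(String displayName) { this.displayName = displayName; }
    void setDescription(String description) { this.description = description; }
    void setConnectorProviderClassName(String name) { this.connectorProviderClassName = name; }
    void setRecognizedConfigurationProperties(List<String> names) { this.recognizedConfigurationProperties = names; }

    String getGUID() { return guid; }
    String getDisplayName() { return displayName; }
    String getDescription() { return description; }
    String getConnectorProviderClassName() { return connectorProviderClassName; }
    List<String> getRecognizedConfigurationProperties() { return recognizedConfigurationProperties; }
}

/** Minimal stand-in for a connector provider base class. */
abstract class SketchConnectorProviderBase {
    protected SketchConnectorTypeBean connectorTypeBean;
    private String connectorClassName;

    protected void setConnectorClassName(String connectorClassName) { this.connectorClassName = connectorClassName; }
    String getConnectorClassName() { return connectorClassName; }
    SketchConnectorTypeBean getConnectorType() { return connectorTypeBean; }
}

/** Hypothetical connector that the provider below instantiates. */
class WeatherFeedConnector { }

/** Hypothetical provider following the pattern: static members plus one no-argument constructor. */
class WeatherFeedConnectorProvider extends SketchConnectorProviderBase {
    static final String CONNECTOR_TYPE_GUID = "00000000-0000-0000-0000-000000000001"; // placeholder GUID
    static final String CONNECTOR_TYPE_NAME = "Weather Feed Connector";
    static final String CONNECTOR_TYPE_DESCRIPTION = "Connector to a hypothetical weather feed.";
    static final String PAGE_SIZE_PROPERTY = "pageSize";

    public WeatherFeedConnectorProvider() {
        super.setConnectorClassName(WeatherFeedConnector.class.getName());

        SketchConnectorTypeBean connectorType = new SketchConnectorTypeBean();
        connectorType.setGUID(CONNECTOR_TYPE_GUID);
        connectorType.setDisplayName(CONNECTOR_TYPE_NAME);
        connectorType.setDescription(CONNECTOR_TYPE_DESCRIPTION);
        connectorType.setConnectorProviderClassName(this.getClass().getName());
        connectorType.setRecognizedConfigurationProperties(Arrays.asList(PAGE_SIZE_PROPERTY));

        super.connectorTypeBean = connectorType;
    }
}
```

Keeping the GUID, names and property names as static final members gives the connector and its documentation a single place to refer to them.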

Understand the connector interface

Now that you have the connector provider to instantiate your connector, you need to understand what your connector actually needs to do. For a service to use your connector, the connector must provide a set of methods that are relevant to that service.

Example: data engine proxy connector interface

For example, the data engine proxy services integrate metadata from data engines with Egeria. To integrate DataStage with Egeria, we want our DataStageConnector to be used by the data engine proxy services. Therefore, the connector needs to extend DataEngineConnectorBase, because this defines the methods needed by the data engine proxy services.

Example: OMRS repository connector interface

Likewise, we want our IGCOMRSRepositoryConnector to integrate IGC with Egeria as a metadata repository. Therefore, the connector needs to extend OMRSRepositoryConnector, because this defines the methods needed to integrate with Open Metadata Repository Services (OMRS).

How would you know to extend these base classes? The connector provider implementations in the previous step each extended a base class specific to the type of connector they provide (DataEngineConnectorProviderBase and OMRSRepositoryConnectorProviderBase). These connector base classes (DataEngineConnectorBase and OMRSRepositoryConnector) are in the same package structure as those connector provider base classes.

In both cases, by extending the abstract classes (DataEngineConnectorBase and OMRSRepositoryConnector) your connector must implement the methods these abstract classes define. These general methods implement your services (data engine proxy services and OMRS), without needing to know anything about the underlying technology. Therefore, you can simply "plug-in" the underlying technology: any technology with a connector that implements these methods can run your service. Furthermore, each technology-specific connector can decide how best to implement those methods for itself.
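
The plug-in idea can be illustrated with a small sketch: a service written only against an abstract base class, and two technology-specific connectors that implement it differently. Every name here is hypothetical, including the listProcessNames() method, which is not taken from DataEngineConnectorBase or OMRSRepositoryConnector.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Stand-in abstract base class: it defines what the service needs, not how to obtain it. */
abstract class SketchEngineConnectorBase {
    /** Hypothetical method; the real base classes define their own methods. */
    abstract List<String> listProcessNames();
}

/** Technology-specific connector for a hypothetical engine that has two processes. */
class InMemoryEngineConnector extends SketchEngineConnectorBase {
    @Override
    List<String> listProcessNames() {
        return Arrays.asList("load-customers", "load-orders");
    }
}

/** Technology-specific connector for a hypothetical engine with nothing to report. */
class EmptyEngineConnector extends SketchEngineConnectorBase {
    @Override
    List<String> listProcessNames() {
        return Collections.emptyList();
    }
}

/** Stand-in for a service that depends only on the base class. */
final class SketchProxyService {
    private SketchProxyService() {}

    static List<String> catalogProcesses(SketchEngineConnectorBase connector) {
        List<String> catalogued = new ArrayList<>();
        for (String name : connector.listProcessNames()) {
            catalogued.add("Process::" + name);
        }
        return catalogued;
    }
}
```

SketchProxyService never refers to either concrete connector, so adding support for a new technology means writing one more subclass and nothing else.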

Code the connector itself

This brings you to writing the connector itself. Now that you understand the interface your connector must provide, you need to implement the methods defined by that interface.

Implement the connector by:

  1. Retrieving connection information provided by the configuration. The default method for initialize() saves the connection object used to create the connector. If your connector needs to override the initialize() method, it should call super.initialize() to capture the connection properties for the base classes.
  2. Implementing the start() method, where the main logic for your connector runs. Use the configuration details from the connection object to connect to your underlying technology. If the connector is long-running, this may be the time to start up a separate thread. However, this has to conform to the rules laid down for the category of connector you are implementing.
  3. Using pre-existing, technology-specific clients and APIs to talk to your underlying technology.
  4. Translating the underlying technology's representation of information into the open metadata representation used by the connector interface itself.

For the first point, you can retrieve general connection information like:

  • the server address and protocol, by first retrieving the embedded EndpointProperties with getEndpoint():
    • retrieving the protocol by calling getProtocol() on the EndpointProperties
    • retrieving the address by calling getAddress() on the EndpointProperties
  • the user Id, by calling getUserId() on the ConnectionProperties
  • the password, by calling either getClearPassword() or getEncryptedPassword() on the ConnectionProperties, depending on what your underlying technology can handle

Use these details to connect to and authenticate against your underlying technology, even when it is running on a different system from the connector itself. Check for null objects (such as the EndpointProperties) before operating on them.
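
A defensive way to read these values might look like the sketch below. The two properties classes are simplified stand-ins that only mirror the getter names mentioned above (getEndpoint(), getProtocol(), getAddress(), getUserId(), getClearPassword()); the ConnectionDetails helper and the fallback to "https" when no protocol is set are assumptions made for this example.

```java
/** Simplified stand-in for the endpoint properties. */
final class SketchEndpointProperties {
    private final String protocol;
    private final String address;

    SketchEndpointProperties(String protocol, String address) {
        this.protocol = protocol;
        this.address = address;
    }

    String getProtocol() { return protocol; }
    String getAddress() { return address; }
}

/** Simplified stand-in for the connection properties. */
final class SketchConnectionProperties {
    private final SketchEndpointProperties endpoint;
    private final String userId;
    private final String clearPassword;

    SketchConnectionProperties(SketchEndpointProperties endpoint, String userId, String clearPassword) {
        this.endpoint = endpoint;
        this.userId = userId;
        this.clearPassword = clearPassword;
    }

    SketchEndpointProperties getEndpoint() { return endpoint; }
    String getUserId() { return userId; }
    String getClearPassword() { return clearPassword; }
}

/** Hypothetical helper that extracts connection details with null checks. */
final class ConnectionDetails {
    private ConnectionDetails() {}

    static String baseUrl(SketchConnectionProperties connection) {
        if (connection == null) {
            throw new IllegalArgumentException("No connection supplied");
        }
        SketchEndpointProperties endpoint = connection.getEndpoint();
        if (endpoint == null || endpoint.getAddress() == null) {
            throw new IllegalArgumentException("Connection has no endpoint address");
        }
        // Assumption for this sketch: fall back to https when no protocol is configured.
        String protocol = (endpoint.getProtocol() == null) ? "https" : endpoint.getProtocol();
        return protocol + "://" + endpoint.getAddress();
    }

    static boolean hasCredentials(SketchConnectionProperties connection) {
        return connection != null
                && connection.getUserId() != null
                && connection.getClearPassword() != null;
    }
}
```

Failing with a descriptive exception when the endpoint is missing is usually easier to diagnose than a NullPointerException raised later inside a third party client.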

Retrieve additional properties by:

  • calling getConfigurationProperties() on the ConnectionProperties, which returns a Map<String, Object>
  • calling get(name) against that Map<> with the name of each additional property of interest
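
Because the map values are typed as Object, it helps to convert them in one place. The helper below is a sketch with hypothetical names (ConfigProperties, getString, getInt) that works on any Map<String, Object>; it tolerates a missing map, a missing key, and numbers supplied either as numeric values or as strings.

```java
import java.util.Map;

/** Hypothetical helper for reading typed values from a configuration properties map. */
final class ConfigProperties {
    private ConfigProperties() {}

    /** Returns the property as a String, or the default when the map or the value is absent. */
    static String getString(Map<String, Object> properties, String name, String defaultValue) {
        Object value = (properties == null) ? null : properties.get(name);
        return (value == null) ? defaultValue : value.toString();
    }

    /** Returns the property as an int, accepting Number or numeric String values. */
    static int getInt(Map<String, Object> properties, String name, int defaultValue) {
        Object value = (properties == null) ? null : properties.get(name);
        if (value instanceof Number) {
            return ((Number) value).intValue();
        }
        if (value instanceof String) {
            try {
                return Integer.parseInt(((String) value).trim());
            } catch (NumberFormatException notANumber) {
                return defaultValue;   // unparseable text falls back to the default
            }
        }
        return defaultValue;
    }
}
```

For example, a connector could call ConfigProperties.getInt(configurationProperties, "pageSize", 100) once during start-up and keep the result in a field, where "pageSize" is an illustrative property name.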

Implementation of the remaining points (2-4) will vary widely depending on the specific technology being used. See the examples previously linked to delve deeper.

Extending Egeria using connectors

Egeria has extended the basic concept of the OCF connector and created specialized connectors for different purposes. The following types of connectors are supported by the Egeria subsystems with links to the documentation and implementation examples.

Type of Connector | Description | Documentation | Implementation Examples
--- | --- | --- | ---
Integration Connector | Implements metadata exchange with third party tools. | Building Integration Connectors | integration-connectors
Open Discovery Service | Implements automated metadata discovery. | Open Discovery Services | discovery-service-connectors
Governance Action Service | Implements automated governance. | Governance Action Services | governance-action-connectors
Configuration Document Store | Persists the configuration document for an OMAG Server. | Configuration Document Store Connectors | configuration-store-connectors
Platform Security Connector | Manages service authorization for the OMAG Server Platform. | Metadata Security Connectors | open-metadata-security-samples
Server Security Connector | Manages service and metadata instance authorization for an OMAG Server. | Metadata Security Connectors | open-metadata-security-samples
Metadata Collection (repository) Store | Interfaces with a metadata repository API for retrieving and storing metadata. | OMRS Repository Connectors | open-metadata-collection-store-connectors
Metadata Collection (repository) Event Mapper | Maps events from a third party metadata repository to open metadata events. | OMRS Event Mappers | none
Open Metadata Archive Store | Reads an open metadata archive from a particular type of store. | OMRS Open Metadata Archive Store Connector | open-metadata-archive-connectors
Audit Log Store | Audit logging destination. | OMRS Audit Log Store Connector | audit-log-connectors
Cohort Registry Store | Local store of membership of an open metadata repository cohort. | OMRS Cohort Registry Store | cohort-registry-store-connectors
Open Metadata Topic Connector | Connects to a topic on an external event bus such as Apache Kafka. | Open Metadata Topic Connectors | open-metadata-topic-connectors

You can write your own connectors to integrate additional types of technology or extend the capabilities of Egeria - and if you think your connector is more generally useful, you could consider contributing it to the Egeria project.

Building open metadata archives

Working with the open metadata and governance APIs

Adding registered services

Registered services are optional services that plug into Egeria's OMAG Server Platform. There are four types, abbreviated OMAS, OMES, OMIS and OMVS in the module table below.

There are many choices of registered services within the Egeria project. However, you may add your own. The recommended modules for registered services (required if it is to be contributed to the Egeria project) are shown in the table below:

Module naming | Description | OMAS | OMES | OMIS | OMVS
--- | --- | --- | --- | --- | ---
moduleName-api | Client Java interface(s), property beans and REST beans. | CP | CP | CP | P
moduleName-client | Java client implementation. | CP | C | C | N
moduleName-topic-connectors | Java connectors for sending and receiving events. | OCP | N | N | N
moduleName-server | Server-side REST and event management implementation. | P | P | P | P
moduleName-spring | Server-side REST API. | P | P | P | P

Key:

  • CP - Required and runs in external clients plus in the OMAG Server Platform.
  • C - Required and runs in external clients.
  • P - Required and runs in the OMAG Server Platform.
  • OCP - Optional and when provided runs in external clients plus in the OMAG Server Platform.
  • N - Not implemented/needed.

The modules for each registered service that need to run in the OMAG Server Platform are delivered in their own jar that is available to the OMAG Server Platform via the CLASSPATH. Inside the registered service's spring jar are one or more REST APIs implemented using Spring Annotations. On start up, the OMAG Server Platform issues a Component Scan to gather details of its REST APIs. This process loads the spring module which in turn loads the server and api modules of registered services it finds and they are initialized as part of the platform's capabilities and are callable via the platform's root URL and port. The client module of an OMAS is loaded by an OMES, OMIS or OMVS registered service that is dependent on the OMAS to get access to open metadata.

The best guides for building registered services are the existing implementations found in egeria.git. You can see the way the code is organized and the services that they depend on.

Summary

Egeria is designed to simplify the effort necessary to integrate different technologies so that they can actively share and consume metadata from each other.

It focuses on providing five types of integration interfaces.

  • Connectors that translate between third party APIs and open metadata APIs. These connectors are hosted in the Egeria servers and support the active exchange of metadata with these technologies.
  • Connectors for accessing popular types of data sources that also retrieve open metadata about the data source. This allows applications and tools to understand the structure, meaning, profile, quality and lineage of the data they are using.
  • Java clients for applications to call the Open Metadata Access Service (OMAS) interfaces, each of which is crafted for particular types of technology. These interfaces support synchronous APIs, inbound event notifications and outbound asynchronous events.
  • REST APIs for the Egeria Services. These include the access services, admin services and platform services.
  • Kafka topics with JSON payloads for asynchronous communication (both in and out) with the open metadata ecosystem.