IGC Adapter

Integrating with Egeria

The IBM Information Governance Catalog (IGC) connector is a repository proxy that runs on the Egeria OMAG Platform (server chassis). It is implemented as an Open Metadata Repository Connector and implements the repository-level methods defined by the Metadata Collection interface.

Integrating with IGC

The connector communicates with an existing IBM Information Server environment through its REST API. Note that the connector only supports reading metadata from Information Server and cannot create or update any metadata. Write operations can only be performed by a user directly, through IBM Information Server's supported web UIs or thick clients.

Capabilities

Read-only open metadata repository connector

Conformant to the mandatory CTS profile (metadata sharing)

including some additional search operations

Common metadata entities are pre-mapped

  • Database information (host, database, schema, table, column)
  • File information (host, folder, file, record, field)
  • Glossary information (category, term)

Common relationships are pre-mapped

  • Technical metadata containment (for the entities above)
  • Technical / business metadata relationships (i.e. semantic assignment)

Some classifications are pre-defined

These are implemented through specific code and implementation choices, and may not match what is desired, depending on how IGC is already used in an organization.

Designed to be extensible

Without needing to fork the connector and modify code inline: extend with only what you need, re-using the base connector as-is (including any future enhancements, bug fixes, additions, etc).

Limitations

Does not handle any create, update or delete operations from the rest of the cohort

Does not provide an event mapper

No notifications of metadata creation, changes or deletions made via IGC's API or user interface are propagated out to the cohort: the connector only supports federated queries and retrieval through the metadata collection interface (APIs).
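
The read-only behaviour described above can be pictured with a minimal Python sketch in which reads are served and any write request is rejected. The class `ReadOnlyRepositoryProxy`, its methods and the exception `WriteNotSupportedError` are hypothetical names chosen for illustration; they are not the connector's or Egeria's actual API.

```python
class WriteNotSupportedError(Exception):
    """Hypothetical error raised for any create, update or delete request."""


class ReadOnlyRepositoryProxy:
    """Hypothetical proxy that serves reads from a backing store and rejects writes."""

    def __init__(self, store):
        # 'store' stands in for the metadata held in IGC, keyed by identifier.
        self._store = dict(store)

    def get_entity(self, guid):
        """Return the entity with this identifier, or None if it is not known."""
        return self._store.get(guid)

    def add_entity(self, guid, entity):
        raise WriteNotSupportedError("this repository is read-only")

    def update_entity(self, guid, entity):
        raise WriteNotSupportedError("this repository is read-only")

    def delete_entity(self, guid):
        raise WriteNotSupportedError("this repository is read-only")
```

In this sketch a caller elsewhere in the cohort could still retrieve an entity by identifier, but any attempt to add, update or delete one would fail immediately rather than being silently ignored.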

Implemented mappings

The following types are currently mapped from IGC to OMRS. Note that there are currently no mappings from OMRS types to IGC types as this connector is entirely read-only (not capable of adding metadata to IGC).

Hoping for a mapping that isn't there?

  • Submit an issue, or
  • Check out any of the linked code below for examples of what's needed to create a mapping, and create your own (and feel free to submit a PR with the result!)

Mapped entities

IGC type(s) | OMRS type(s)
category | Glossary [1], GlossaryCategory
connector | ConnectorType
data_class | DataClass
data_connection | Connection
data_file | DataFile
data_file_field | TabularColumn
data_file_folder | FileFolder
data_file_record | TabularSchemaType
database | Database
database_column | RelationalColumn
database_schema | DeployedDatabaseSchema, RelationalDBSchemaType
database_table | RelationalTable
host, host_(engine) | Endpoint
information_asset | NoteLog [2]
information_governance_policy | GovernancePolicy
label | InformalTag
note | NoteEntry [3]
term | GlossaryTerm
user, group | ContactDetails, Team
user, steward_user, non_steward_user | Person
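
As a rough illustration of how the table above can be read, the following minimal Python sketch stores a few of its rows as a lookup from IGC type to OMRS type(s). The names `ENTITY_MAPPINGS`, `omrs_types_for` and `igc_types_for` are hypothetical and are not part of the connector, which implements its mappings in its own code; only a subset of the rows is included.

```python
# Hypothetical lookup built from a few rows of the mapped-entities table.
ENTITY_MAPPINGS = {
    "category": ["Glossary", "GlossaryCategory"],
    "database_schema": ["DeployedDatabaseSchema", "RelationalDBSchemaType"],
    "database_table": ["RelationalTable"],
    "database_column": ["RelationalColumn"],
    "term": ["GlossaryTerm"],
}


def omrs_types_for(igc_type):
    """Return the OMRS types an IGC type maps to, or an empty list if unmapped."""
    return list(ENTITY_MAPPINGS.get(igc_type, []))


def igc_types_for(omrs_type):
    """Return the IGC types that map to the given OMRS type (reverse lookup)."""
    return sorted(igc for igc, omrs in ENTITY_MAPPINGS.items() if omrs_type in omrs)
```

The one-to-many rows (for example `category` and `database_schema`) are why each value is a list: a single IGC object can surface as more than one OMRS entity.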

Mapped relationships

IGC types and properties | OMRS type
database_schema - database_schema | AssetSchemaType [4]
data_file.data_file_records - data_file_record.data_file | AssetSchemaType
information_asset - information_asset | AttachedNoteLog [5]
information_asset.notes - note.belonging_to | AttachedNoteLogEntry [6]
information_asset.labels - label.labeled_assets | AttachedTag
data_file_record.data_file_fields - data_file_field.data_file_record | AttributeForSchema
database_table.database_columns - database_column.database_table_or_view | NestedSchemaAttribute
database_schema.database_tables - database_table.database_schema | AttributeForSchema
category.subcategories - category.parent_category | CategoryAnchor [7]
category.subcategories - category.parent_category | CategoryHierarchyLink
data_connection.data_connectors - connector.data_connections | ConnectionConnectorType
host.data_connections - connector.data_connections - data_connection.data_connectors | ConnectionEndpoint [8]
data_connection.imports_database - database.data_connections | ConnectionToAsset
data_connection - data_file_folder.data_connection | ConnectionToAsset
user - user | ContactThrough [9]
group - group | ContactThrough [9]
information_asset.detected_classifications / information_asset.selected_classification - classification - data_class.classified_assets_detected / data_class.classifications_selected | DataClassAssignment [10]
data_class.contains_data_classes - data_class.parent_data_class | DataClassHierarchy
database.database_schemas - database_schema.database | DataContentForDataSet
data_file_folder.data_file_folders - data_file_folder.parent_folder | FolderHierarchy
database_column.defined_foreign_key_referenced / database_column.selected_foreign_key_referenced - database_column.defined_foreign_key_references / database_column.selected_foreign_key_references | ForeignKey
information_governance_policy.subpolicies - information_governance_policy.parent_policy | GovernancePolicyLink
data_file_folder.data_files - data_file.parent_folder | NestedFile
term.related_terms - term.related_terms | RelatedTerm
term.replaced_by - term.replaces | ReplacementTerm
information_asset.assigned_to_terms - term.assigned_assets | SemanticAssignment
term.synonyms - term.synonyms | Synonym
category.terms - term.parent_category | TermAnchor [11]
category.terms - term.parent_category | TermCategorization
term.has_a_term - term.is_of | TermHASARelationship
term.has_types - term.is_a_type_of | TermISATypeOFRelationship
term.translations - term.translations | Translation

Mapped classifications

Because IGC has no "Classification" concept, the following are suggested ways of implementing Classifications within IGC by overloading other concepts. To use an alternative implementation, update the linked mapping code to match your desired implementation of the concept.

OMRS type | IGC mapping logic
AssetZoneMembership | The provided implementation assigns the zones listed under default.zones in the connector's configuration to each Database, DataFile, DeployedDatabaseSchema and FileFolder in the IGC repository any time these assets are retrieved.
Confidentiality | The provided implementation assigns a Confidentiality classification to a GlossaryTerm (only) using the assigned_to_term relationship from one term to any term within the Classifications/Confidentiality parent category. The terms contained within this Confidentiality category in essence represent the ConfidentialityLevel enumeration in OMRS. With this implementation, any assigned_to_term relationship on a term, where the assigned term is within this Confidentiality category in IGC, will be mapped to a Confidentiality classification in OMRS.
PrimaryKey | The provided implementation looks for the presence of either the defined_primary_key or selected_primary_key property on a database_column and, if either is present, adds a PrimaryKey classification to that RelationalColumn.
SpineObject | The provided implementation assigns a SpineObject classification to a GlossaryTerm based on the referencing_categories of the term in IGC. Specifically, when the term has a referencing_categories link to Classifications/SpineObject, the GlossaryTerm to which that term maps will be assigned the SpineObject classification.
SubjectArea | The provided implementation assigns a SubjectArea classification to a GlossaryCategory based on the assigned_to_term relationship of the category in IGC. Specifically, when the category has an assigned_to_term relationship to the IGC term Classifications/SubjectArea, the GlossaryCategory will be assigned a SubjectArea classification whose name will match the name of the IGC category.
TypeEmbeddedAttribute | The provided implementation adds this classification to all RelationalTable, RelationalColumn and TabularColumn instances to simplify the management of type information without requiring additional SchemaType subclasses (like RelationalTableType, RelationalColumnType and TabularColumnType) and their additional relationships.
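
The AssetZoneMembership and PrimaryKey rules described above can be sketched as simple functions. This is only an illustration under stated assumptions: the function names and the dictionary representation of a `database_column` are hypothetical, and "present" is assumed here to mean a non-empty value. Only the property names (`defined_primary_key`, `selected_primary_key`), the OMRS type names and the `default.zones` setting come from the descriptions above.

```python
# OMRS types that receive the default zones, per the AssetZoneMembership rule.
ZONED_TYPES = {"Database", "DataFile", "DeployedDatabaseSchema", "FileFolder"}


def zone_membership(omrs_type, default_zones):
    """Return the zones to attach to an asset of this OMRS type, or None if not zoned."""
    if omrs_type in ZONED_TYPES:
        # 'default_zones' stands in for the connector's default.zones setting.
        return list(default_zones)
    return None


def has_primary_key(database_column):
    """True when either primary-key property holds a non-empty value (an assumption)."""
    return bool(
        database_column.get("defined_primary_key")
        or database_column.get("selected_primary_key")
    )
```

Changing how a classification is derived would then amount to replacing one small rule like these, which is the kind of change the mapping code is meant to accommodate.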

  1. All top-level categories in IGC that are not named Classifications are considered a Glossary; all categories whose parent_category is not null and not Classifications are considered a GlossaryCategory. 

  2. NoteLog is a generated entity, present only for those IGC objects that support notes, and it cannot be searched. 

  3. NoteEntry instances cannot be searched; they can only be retrieved from the NoteLog to which they are related. 

  4. AssetSchemaType between database_schema objects is a generated relationship (all properties for both endpoints are on a single entity instance in IGC). 

  5. AttachedNoteLog between information_asset objects is a generated relationship (all properties for both endpoints are on a single entity instance in IGC), and is only present on those IGC objects that support notes. 

  6. AttachedNoteLogEntry relationship cannot be searched. 

  7. CategoryAnchor relationship is between the ultimate parent IGC category (Glossary) and any offspring. 

  8. ConnectionEndpoint is linked through IGC's connector object in the middle. 

  9. ContactThrough is a generated relationship (all properties for both endpoints are on a single entity instance in IGC). 

  10. DataClassAssignment is linked through IGC's classification object in the middle, which has some relationship-specific properties. 

  11. TermAnchor creates a relationship between the ultimate parent IGC category (Glossary) and an IGC term.
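
Notes 1, 7 and 11 can be illustrated with a minimal Python sketch. It assumes a hypothetical, simplified representation in which each category is a dictionary with a `name` and a `parent_category` (another such dictionary, or `None` for a top-level category). The connector itself works against IGC's REST API, and the function names here are illustrative only.

```python
CLASSIFICATIONS = "Classifications"


def omrs_type_for_category(category):
    """Apply note 1: decide whether an IGC category is a Glossary or a GlossaryCategory."""
    parent = category.get("parent_category")
    if parent is None:
        # Top-level category: a Glossary unless it is the one named Classifications.
        return None if category["name"] == CLASSIFICATIONS else "Glossary"
    if parent["name"] == CLASSIFICATIONS:
        # Direct children of Classifications are excluded by note 1.
        return None
    return "GlossaryCategory"


def ultimate_parent(category):
    """Walk parent_category links up to the top-level category (the Glossary in notes 7 and 11)."""
    current = category
    while current.get("parent_category") is not None:
        current = current["parent_category"]
    return current
```

Under these assumptions, the anchor relationships (CategoryAnchor, TermAnchor) would link to whatever `ultimate_parent` returns, while the hierarchy relationships (CategoryHierarchyLink, TermCategorization) link only to the immediate parent.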