IGC Adapter¶
Integrating with Egeria¶
The IBM Information Governance Catalog (IGC) connector is implemented as a repository proxy that runs on the Egeria OMAG Platform (server chassis). It is an Open Metadata Repository Connector and implements the repository-level methods defined by the Metadata Collection interface.
Integrating with IGC¶
The connector communicates with an existing IBM Information Server environment through its REST API. Note that the connector only supports reading metadata from Information Server and is unable to create or update any metadata. Write operations can only be done by a user directly through IBM Information Server's supported web UIs or thick clients.
Capabilities¶
- Read-only open metadata repository connector
- Conformant to the mandatory CTS profile (metadata sharing)
    - Including some additional search operations
- Common metadata entities are pre-mapped
    - Database information (host, database, schema, table, column)
    - File information (host, folder, file, record, field)
    - Glossary information (category, term)
- Common relationships are pre-mapped
    - Technical metadata containment (for the entities above)
    - Technical / business metadata relationships (i.e. semantic assignment)
- Some classifications are pre-defined
    - These are implemented through specific code and implementation choices, and may not match the way IGC is already used in your organization.
- Designed to be extensible
    - There is no need to fork the connector and modify code inline: extend with only what you need, re-using the base connector as-is (including any future enhancements, bug fixes, additions, etc).
Limitations¶
- Does not handle any create, update or delete operations from the rest of the cohort
- Does not provide an event mapper
    - No notifications of metadata creation, changes or deletions made via IGC's API or user interface are propagated out to the cohort: the connector only supports federated queries and retrieval through the metadata collection interface (APIs).
Implemented mappings¶
The following types are currently mapped from IGC to OMRS. Note that there are currently no mappings from OMRS types to IGC types as this connector is entirely read-only (not capable of adding metadata to IGC).
Hoping for a mapping that isn't there?
- Submit an issue, or
- Check out any of the linked code below for examples of what's needed to create a mapping, and create your own (and feel free to submit a PR with the result!)
Mapped entities¶
IGC type(s) | OMRS type(s) |
---|---|
category | Glossary[^1], GlossaryCategory |
connector | ConnectorType |
data_class | DataClass |
data_connection | Connection |
data_file | DataFile |
data_file_field | TabularColumn |
data_file_folder | FileFolder |
data_file_record | TabularSchemaType |
database | Database |
database_column | RelationalColumn |
database_schema | DeployedDatabaseSchema, RelationalDBSchemaType |
database_table | RelationalTable |
host, host_(engine) | Endpoint |
information_asset | NoteLog[^2] |
information_governance_policy | GovernancePolicy |
label | InformalTag |
note | NoteEntry[^3] |
term | GlossaryTerm |
user, group | ContactDetails, Team |
user, steward_user, non_steward_user | Person |
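
The way a mapping table like the one above might be consulted, together with the category rule described in footnote 1, can be sketched in a few lines of Python. This is an illustration only, not the connector's own code: the `ENTITY_MAPPINGS` dictionary, both function names, and the choice to return `None` for categories the rule does not cover are assumptions made for the example.

```python
from typing import List, Optional

# A subset of the entity mappings from the table above (IGC type -> OMRS types).
ENTITY_MAPPINGS = {
    "database": ["Database"],
    "database_schema": ["DeployedDatabaseSchema", "RelationalDBSchemaType"],
    "database_table": ["RelationalTable"],
    "database_column": ["RelationalColumn"],
    "term": ["GlossaryTerm"],
}


def omrs_types_for(igc_type: str) -> List[str]:
    """Return the OMRS types an IGC type maps to, or an empty list if unmapped."""
    return ENTITY_MAPPINGS.get(igc_type, [])


def omrs_type_for_category(name: str, parent_name: Optional[str]) -> Optional[str]:
    """Apply the footnote 1 rule to an IGC category.

    parent_name is None for a top-level category (hypothetical representation).
    Returns None when the rule does not treat the category as either type.
    """
    if parent_name is None:
        # Top-level categories are glossaries unless named "Classifications".
        return None if name == "Classifications" else "Glossary"
    # Nested categories are glossary categories unless they sit under "Classifications".
    return None if parent_name == "Classifications" else "GlossaryCategory"
```

A single IGC type can map to more than one OMRS type, which is why the lookup returns a list rather than a single value.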
Mapped relationships¶
IGC types and properties | OMRS type |
---|---|
database_schema - database_schema | AssetSchemaType[^4] |
data_file.data_file_records - data_file_record.data_file | AssetSchemaType |
information_asset - information_asset | AttachedNoteLog[^5] |
information_asset.notes - note.belonging_to | AttachedNoteLogEntry[^6] |
information_asset.labels - label.labeled_assets | AttachedTag |
data_file_record.data_file_fields - data_file_field.data_file_record | AttributeForSchema |
database_table.database_columns - database_column.database_table_or_view | NestedSchemaAttribute |
database_schema.database_tables - database_table.database_schema | AttributeForSchema |
category.subcategories - category.parent_category | CategoryAnchor[^7] |
category.subcategories - category.parent_category | CategoryHierarchyLink |
data_connection.data_connectors - connector.data_connections | ConnectionConnectorType |
host.data_connections - connector.data_connections - data_connection.data_connectors | ConnectionEndpoint[^8] |
data_connection.imports_database - database.data_connections | ConnectionToAsset |
data_connection - data_file_folder.data_connection | ConnectionToAsset |
user - user | ContactThrough[^9] |
group - group | ContactThrough[^9] |
information_asset.detected_classifications / information_asset.selected_classification - classification - data_class.classified_assets_detected / data_class.classifications_selected | DataClassAssignment[^10] |
data_class.contains_data_classes - data_class.parent_data_class | DataClassHierarchy |
database.database_schemas - database_schema.database | DataContentForDataSet |
data_file_folder.data_file_folders - data_file_folder.parent_folder | FolderHierarchy |
database_column.defined_foreign_key_referenced / database_column.selected_foreign_key_referenced - database_column.defined_foreign_key_references / database_column.selected_foreign_key_references | ForeignKey |
information_governance_policy.subpolicies - information_governance_policy.parent_policy | GovernancePolicyLink |
data_file_folder.data_files - data_file.parent_folder | NestedFile |
term.related_terms - term.related_terms | RelatedTerm |
term.replaced_by - term.replaces | ReplacementTerm |
information_asset.assigned_to_terms - term.assigned_assets | SemanticAssignment |
term.synonyms - term.synonyms | Synonym |
category.terms - term.parent_category | TermAnchor[^11] |
category.terms - term.parent_category | TermCategorization |
term.has_a_term - term.is_of | TermHASARelationship |
term.has_types - term.is_a_type_of | TermISATypeOFRelationship |
term.translations - term.translations | Translation |
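
Several rows above show the same pair of IGC properties feeding two OMRS relationship types (for example, `category.terms` and `term.parent_category` produce both TermAnchor and TermCategorization). A minimal Python sketch of a lookup that allows for this is shown below. The tuple layout and the function name `omrs_relationships_for` are hypothetical and chosen only for illustration; the connector's own mapping code is organized differently.

```python
from typing import List

# A subset of the rows above: (IGC property at one end, IGC property at the other end, OMRS type).
RELATIONSHIP_MAPPINGS = [
    ("category.terms", "term.parent_category", "TermAnchor"),
    ("category.terms", "term.parent_category", "TermCategorization"),
    ("term.synonyms", "term.synonyms", "Synonym"),
    ("data_file_folder.data_files", "data_file.parent_folder", "NestedFile"),
    ("database.database_schemas", "database_schema.database", "DataContentForDataSet"),
]


def omrs_relationships_for(igc_property: str) -> List[str]:
    """Return every OMRS relationship type that uses the given IGC property at either end."""
    found: List[str] = []
    for end_one, end_two, omrs_type in RELATIONSHIP_MAPPINGS:
        if igc_property in (end_one, end_two) and omrs_type not in found:
            found.append(omrs_type)
    return found
```

Returning a list keeps the one-to-many cases explicit: a caller asking about `term.parent_category` gets both relationship types back, in table order.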
Mapped classifications¶
Because IGC has no "Classification" concept, the following classifications are implemented by overloading the use of other IGC concepts. These are suggested implementations only: to use an alternative, update the linked mapping code to match your desired implementation of the concept.
OMRS type | IGC mapping logic |
---|---|
AssetZoneMembership | The provided implementation assigns the zones listed under default.zones in the connector's configuration to every Database, DataFile, DeployedDatabaseSchema and FileFolder in the IGC repository whenever these assets are retrieved. |
Confidentiality | The provided implementation assigns a Confidentiality classification to a GlossaryTerm (only), using the assigned_to_term relationship from one term to any term within the Classifications/Confidentiality parent category. The terms contained within this Confidentiality category in essence represent the ConfidentialityLevel enumeration in OMRS. With this implementation, any assigned_to_term relationship on a term, where the assigned term is within this Confidentiality category in IGC, will be mapped to a Confidentiality classification in OMRS. |
PrimaryKey | The provided implementation looks for the presence of either the defined_primary_key or selected_primary_key property on a database_column and, if either is present, adds a PrimaryKey classification to that RelationalColumn. |
SpineObject | The provided implementation assigns a SpineObject classification to a GlossaryTerm based on the referencing_categories of the term in IGC. Specifically, when the term has a referencing_categories link to Classifications/SpineObject, the GlossaryTerm to which that term maps will be assigned the SpineObject classification. |
SubjectArea | The provided implementation assigns a SubjectArea classification to a GlossaryCategory based on the assigned_to_term relationship of the category in IGC. Specifically, when the category has an assigned_to_term relationship to the IGC term Classifications/SubjectArea, the GlossaryCategory will be assigned a SubjectArea classification whose name will match the name of the IGC category. |
TypeEmbeddedAttribute | The provided implementation adds this classification to all RelationalTable, RelationalColumn and TabularColumn instances to simplify the management of type information without requiring additional SchemaType subclasses (like RelationalTableType, RelationalColumnType and TabularColumnType) and their additional relationships. |
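
The PrimaryKey, Confidentiality, SpineObject and AssetZoneMembership rules in the table above can be sketched as small Python functions. This is a hedged illustration rather than the connector's implementation: the plain-dictionary shape of an IGC object, the use of path strings such as `Classifications/Confidentiality` to identify categories, and all function names are assumptions made for the example.

```python
from typing import List

# OMRS types that receive the configured default zones, per the table above.
ZONED_OMRS_TYPES = {"Database", "DataFile", "DeployedDatabaseSchema", "FileFolder"}


def classify_column(column: dict) -> List[str]:
    """PrimaryKey applies when either primary-key property is present (non-empty)."""
    has_pk = bool(column.get("defined_primary_key")) or bool(column.get("selected_primary_key"))
    return ["PrimaryKey"] if has_pk else []


def classify_term(term: dict) -> List[str]:
    """Derive Confidentiality and SpineObject classifications for an IGC term."""
    found: List[str] = []
    # Hypothetical shape: each assigned term carries the path of its parent category.
    assigned = term.get("assigned_to_term", [])
    if any(t.get("parent_category") == "Classifications/Confidentiality" for t in assigned):
        found.append("Confidentiality")
    # Hypothetical shape: referencing categories are listed by path.
    if "Classifications/SpineObject" in term.get("referencing_categories", []):
        found.append("SpineObject")
    return found


def zones_for(omrs_type: str, default_zones: List[str]) -> List[str]:
    """Return the default zones for zoned asset types, and no zones otherwise."""
    return list(default_zones) if omrs_type in ZONED_OMRS_TYPES else []
```

Because each rule is a separate function in this sketch, swapping in an alternative implementation of one classification leaves the others untouched.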
[^1]: All top-level categories in IGC that are not named Classifications are considered a Glossary; all categories whose parent_category is not null and not Classifications are considered a GlossaryCategory.

[^2]: NoteLog is a generated entity, present only for those IGC objects that support notes, and it cannot be searched.

[^3]: NoteEntry cannot be searched, only retrieved from the NoteLog to which they are related.

[^4]: AssetSchemaType between database_schema objects is a generated relationship (all properties for both endpoints are on a single entity instance in IGC).

[^5]: AttachedNoteLog between information_asset objects is a generated relationship (all properties for both endpoints are on a single entity instance in IGC), and is only present on those IGC objects that support notes.

[^6]: AttachedNoteLogEntry relationship cannot be searched.

[^7]: CategoryAnchor relationship is between the ultimate parent IGC category (Glossary) and any offspring.

[^8]: ConnectionEndpoint is linked through IGC's connector object in the middle.

[^9]: ContactThrough is a generated relationship (all properties for both endpoints are on a single entity instance in IGC).

[^10]: DataClassAssignment is linked through IGC's classification object in the middle, which has some relationship-specific properties.

[^11]: TermAnchor creates a relationship between the ultimate parent IGC category (Glossary) and an IGC term.