What is Metadata? Is Metadata Needed in a Data Warehouse?

Tram Ho

Metadata is data about data: documentation that describes the information users request. In a Data Warehouse, metadata is one of the essential components.

Metadata includes the following:

  • Location and description of the warehouse system and components.
  • The name, definition, structure, and content of the Data Warehouse and end-user views.
  • Identification of authoritative data sources.
  • Integration and transformation rules used to populate the data.
  • Integration and transformation rules used to deliver information to end-user analytical tools.
  • Subscription information for delivering information to analysis subscribers.
  • Metrics used to analyze warehouse usage and performance.
  • Security authorization, access control lists, etc.

Metadata is used to build, maintain, manage, and use the Data Warehouse. It helps users understand the content and find the data they need.

Some examples of metadata are:

  • A library catalog can be thought of as metadata. The catalog consists of a number of predefined elements that represent specific attributes of a resource, and each element can have one or more values. These elements can be the author's name, the document title, the publisher's name, the publication date, and the categories to which the document belongs.
  • Tables of contents and indexes in books can be thought of as metadata for books.
  • Suppose a data item about a person is 80. The value only becomes meaningful once we note that it is the person's weight and that the unit is kilograms. Therefore, (weight, kilogram) is the metadata about the data item 80.
  • Another example of metadata is data about the tables and figures in a report like this one. A table (which is a record) has a name (e.g. the table title) and column names, all of which can be considered metadata. Figures also have titles or names.
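The weight example above can be sketched in a few lines of Python. This is only an illustration: the `Measurement` class and its field names are hypothetical and chosen for this sketch, not taken from any warehouse tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    value: float    # the raw data item
    attribute: str  # metadata: what the value measures
    unit: str       # metadata: the unit the value is expressed in

    def describe(self) -> str:
        # Combine the data with its metadata into a readable statement.
        return f"{self.attribute} = {self.value:g} {self.unit}"


raw = 80  # on its own, this number is ambiguous
weight = Measurement(value=raw, attribute="weight", unit="kilogram")
print(weight.describe())
```

Without the `attribute` and `unit` fields, nothing in the program could tell whether 80 is a weight, an age, or a score.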

Why is metadata needed in a Data Warehouse?

  • First, it acts as the glue that binds all parts of the Data Warehouse together.
  • Next, it provides information about content and structure to developers.
  • Finally, it opens the door for end users and makes the content recognizable in their own terms.

Metadata acts as the nerve center of the Data Warehouse. The various processes involved in building and managing the Data Warehouse each generate portions of its metadata, and the metadata generated by one process is used by another. In this way, metadata allows the different processes to communicate with one another.

Figure: the location of metadata in the Data Warehouse.

Types of metadata

The metadata in a Data Warehouse falls into three main categories:

  • Operational Metadata
  • Extraction and Transformation Metadata
  • End-User Metadata

Operational Metadata

Data for the Data Warehouse comes from many different operational systems of the enterprise. These source systems use different data structures, and the data elements selected for the Data Warehouse have different field lengths and data types.

When selecting information from the source systems for the Data Warehouse, we split records, combine parts of records from different source files, and deal with multiple encoding schemes and field lengths. When we make information available to end users, we must be able to link that information back to the source data sets. Operational metadata contains all of this information about the operational data sources.
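As a rough sketch of how operational metadata lets us link warehouse data back to its sources, the following Python snippet keeps a small lineage catalog. All system, file, and field names here are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceField:
    system: str     # operational system the field comes from
    file: str       # source file or table
    field: str      # field name in the source
    data_type: str  # data type in the source
    length: int     # field length in the source
    encoding: str   # encoding scheme used by the source


# Hypothetical lineage catalog: warehouse column -> source fields it was built from.
LINEAGE = {
    "customer.full_name": [
        SourceField("billing", "CUSTMAST", "FNAME", "char", 20, "EBCDIC"),
        SourceField("billing", "CUSTMAST", "LNAME", "char", 30, "EBCDIC"),
    ],
    "customer.country": [
        SourceField("crm", "accounts.csv", "country_code", "char", 2, "UTF-8"),
    ],
}


def trace(column: str) -> list[str]:
    """Return the source locations that feed a warehouse column."""
    return [f"{s.system}/{s.file}:{s.field}" for s in LINEAGE.get(column, [])]


print(trace("customer.full_name"))
```

Here `customer.full_name` combines two source fields with different lengths, which is exactly the kind of detail the operational metadata has to record.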

Extraction and Transformation Metadata

Extraction and transformation metadata includes data about how data is extracted from the source systems, specifically the extraction frequency, the extraction method, and the business rules for data extraction. In addition, this category of metadata contains information about all data transformations that take place in the data staging area.
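The items listed above (frequency, method, business rules, transformations) could be captured in a simple record. The sketch below is one possible shape, with a hypothetical `ExtractionSpec` class and example values that are assumptions rather than a standard.

```python
from dataclasses import dataclass, field


@dataclass
class ExtractionSpec:
    source: str         # source system or table being extracted
    frequency: str      # how often the extraction runs, e.g. "daily"
    method: str         # how data is pulled, e.g. "full" or "incremental"
    business_rule: str  # rule that decides which records are extracted
    transformations: list[str] = field(default_factory=list)  # staging-area steps


def describe(spec: ExtractionSpec) -> str:
    """Render the extraction metadata as a one-line summary."""
    steps = "; ".join(spec.transformations) or "none"
    return (
        f"{spec.source}: {spec.method} extract, {spec.frequency}; "
        f"rule: {spec.business_rule}; transformations: {steps}"
    )


spec = ExtractionSpec(
    source="orders",
    frequency="daily",
    method="incremental",
    business_rule="only completed orders",
    transformations=["trim customer_name", "convert amount to USD"],
)
print(describe(spec))
```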

End-User Metadata

End-user metadata is the navigation map of the Data Warehouse. It allows end users to find data in the Data Warehouse using their own business terminology, and to look for information in the ways they normally think about the business.
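One simple way to picture this navigation map is a glossary that translates business terms into warehouse locations. The table and column names below are invented for the sketch.

```python
from typing import Optional

# Hypothetical business glossary: business term -> (warehouse table, column).
GLOSSARY = {
    "revenue": ("fact_sales", "net_amount"),
    "customer": ("dim_customer", "customer_name"),
    "order date": ("fact_sales", "order_dt"),
}


def locate(term: str) -> Optional[tuple[str, str]]:
    """Look up a business term, ignoring case and extra whitespace."""
    key = " ".join(term.lower().split())
    return GLOSSARY.get(key)


print(locate("Revenue"))
print(locate("  Order   Date "))
print(locate("profit"))  # not in the glossary, so the lookup returns None
```

The end user asks for "revenue" and never needs to know that the value is stored in a column called `net_amount`.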

Metadata Exchange Initiative

The proposed Metadata Exchange Initiative aims to bring industry providers and users together to solve many of the critical issues and problems associated with metadata exchange, sharing, and management. The goal of the metadata exchange standard is to define an extensible mechanism that will allow vendors to exchange standard metadata as well as carry “proprietary” metadata. The founding members agreed on the following initial goals:

  • Create vendor-independent, industry-defined and industry-maintained standard access mechanisms and application programming interfaces (APIs) for metadata.
  • Allow users to control and manage the access and manipulation of metadata in their unique environments through the use of exchange standards-compliant tools.
  • Allow users to build tools that meet their needs and to adjust those tool configurations accordingly.
  • Allow individual tools to satisfy their metadata requirements freely and efficiently within the context of the exchange model.
  • Describe a clean, simple implementation infrastructure that will facilitate compliance and speed adoption by minimizing the amount of modification required.
  • Create a procedure and process not only to establish and maintain the exchange standard specification but also to update and extend it over time.

Metadata exchange standard framework

The standard implementation of the exchange metadata model assumes that the metadata itself can be stored in a storage format of any kind: ASCII file, relational table, fixed or custom format, etc.

The standard is built on a framework that translates an access request into the standard exchange format.

Several approaches have been proposed by the metadata exchange alliance:

  • Procedural Approach
  • ASCII Batch Approach
  • Hybrid Approach

In the procedural approach, communication with the API is built into the tool. It allows the highest degree of flexibility.

The ASCII Batch approach instead relies on an ASCII file format that contains descriptions of the various metadata items and the standardized access requirements that make up the exchange standard metadata model.

The Hybrid approach follows a data-driven model.
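To make the ASCII Batch idea more concrete, here is a small sketch that reads metadata items from a plain-text file. The source does not specify the file layout, so the `item | attribute | value` line format used here is purely an assumption for illustration.

```python
# Hypothetical ASCII layout: one "item | attribute | value" triple per line.
SAMPLE = """\
# item | attribute | value   (hypothetical layout)
customer_name | type   | CHAR
customer_name | length | 50
order_amount  | type   | DECIMAL
"""


def parse_batch(text: str) -> dict[str, dict[str, str]]:
    """Parse the ASCII batch text into {item: {attribute: value}}."""
    items: dict[str, dict[str, str]] = {}
    for line_no, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split("|")
        if len(parts) != 3:
            raise ValueError(f"line {line_no}: expected 3 fields, got {len(parts)}")
        item, attribute, value = (p.strip() for p in parts)
        items.setdefault(item, {})[attribute] = value
    return items


print(parse_batch(SAMPLE))
```

Because the file is plain text, any tool that can read and write this layout can take part in the exchange without calling a shared API.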

Components of the Metadata Exchange Framework

Standard metadata model: refers to the ASCII file format used to represent the metadata being exchanged.

Standard access framework: describes the minimum number of API functions.

Tool profile: provided by each tool vendor.

User configuration: a file that describes the legal exchange paths for metadata in the user's environment.
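As an illustration of what "legal exchange paths" could mean in practice, the sketch below stores the allowed paths as pairs of tools and checks a requested exchange against them. The tool names and the pair-based representation are assumptions, not part of the standard.

```python
# Hypothetical user configuration: allowed (source tool, target tool) pairs.
LEGAL_PATHS = {
    ("modeling_tool", "etl_tool"),
    ("etl_tool", "repository"),
    ("repository", "reporting_tool"),
}


def can_exchange(source: str, target: str) -> bool:
    """Return True if metadata may flow from source to target."""
    return (source, target) in LEGAL_PATHS


print(can_exchange("etl_tool", "repository"))     # allowed path
print(can_exchange("reporting_tool", "etl_tool")) # not listed, so rejected
```

Paths are directional in this sketch: allowing metadata to flow from the repository to the reporting tool does not allow the reverse.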

Metadata Repository

The metadata itself is stored in and managed by the metadata repository. Metadata repository management software can be used to map source data to the target database, integrate and transform data, generate code for data transformation, and move data into the warehouse.
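The idea of generating transformation code from stored mappings can be sketched as follows. The mapping format, the column names, and the generated SQL are illustrative assumptions, not the output of any particular repository product.

```python
# Hypothetical source-to-target mappings as they might be stored in a repository.
MAPPINGS = [
    {"source": "cust_nm", "target": "customer_name", "transform": "TRIM({col})"},
    {"source": "ord_amt", "target": "order_amount", "transform": "CAST({col} AS DECIMAL(12, 2))"},
    {"source": "ord_dt", "target": "order_date", "transform": "{col}"},
]


def generate_insert(source_table: str, target_table: str, mappings: list[dict]) -> str:
    """Generate an INSERT ... SELECT statement from the mapping metadata."""
    targets = ", ".join(m["target"] for m in mappings)
    exprs = ", ".join(m["transform"].format(col=m["source"]) for m in mappings)
    return f"INSERT INTO {target_table} ({targets}) SELECT {exprs} FROM {source_table};"


print(generate_insert("staging.orders", "dw.orders", MAPPINGS))
```

Changing a mapping in the repository changes the generated statement, so the load logic stays consistent with the documented metadata.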

Benefits of Metadata Repository

  • It provides a set of tools for enterprise-wide metadata management.
  • It reduces or eliminates inconsistencies, redundancies, and underutilization of information.
  • It simplifies management and improves an organization's control of and accounting for its information assets.
  • It increases the coordination, understanding, identification, and use of information assets.
  • It enforces CASE development standards with the ability to share and reuse metadata.
  • It leverages investment in legacy systems by making use of existing applications.
  • It provides a relational model for heterogeneous RDBMSs to share information.
  • It provides a useful data governance tool for managing a company’s information assets with a data dictionary.
  • It increases the reliability, controllability and flexibility of the application development process.

Source: Internet

To learn more about data, you can visit https://indaacademy.vn/blog/ . Thank you very much, and see you in the next post.

Source : Viblo