In many projects, we observe that a lot of resources are invested in creating data models and reports. At the end of the project, however, little budget remains for training the people who will use these reports. When the consultants leave after the hypercare phase, or the knowledge holders leave through normal staff turnover, many companies find themselves in a tight spot. Even many years after the implementation, the definition and true meaning of certain key figures are still being debated.
With the SAP Datasphere catalog, SAP aims to solve these and similar problems. End users from the business departments should be able to find and evaluate the data they need quickly. The data catalog also shows where the data comes from, explains the dimensions used and documents how the key figures are calculated. This is intended to improve data quality within the company and to create a common understanding of the business language used.
In this article we provide an overview of the functions of the data catalog within SAP Datasphere, highlight the possible areas of application and provide an outlook on future developments.
Definition, objectives and requirements
What is a data catalog anyway? It is software that enables the cataloguer to document meta information on data products such as tables, views and stories in an efficient and structured way. Based on this documentation, the consumer can evaluate a data product quickly and intuitively.
The business know-how should be documented, stored centrally and linked to relevant data products. The aim is to ensure that consumers can reliably find the right data products.
This results in the following requirements. For the cataloguer, efficient documentation through reusability, templates and links from documents to data products is essential. For the consumer, on the other hand, the focus is on intuitive and detailed searching as well as reliable and fast evaluation of the required data products. For example, a user who is looking for a specific key figure must be able to quickly find a data product that contains the relevant information.
Data catalog elements
This is where the SAP Datasphere data catalog comes into play. It provides extended information about data products (assets) from SAP Datasphere and SAP Analytics Cloud (SAC). An asset is a technical object such as a remote or local table, view, data flow or a story.
These data products are enriched with elements such as KPIs, glossary terms and tags for better understanding. Together with the properties, description and origin, these elements provide helpful meta information about the respective asset.
All elements are managed using hierarchies. For example, the KPIs can be grouped into different categories such as financial and sustainability key figures. The creation of KPIs and glossary terms is facilitated by templates. It is also possible to create user-defined tabs and attributes that can be filled in as part of the definition of an associated KPI or term. For example, a mandatory, user-defined attribute "Approved By" could be defined in a glossary, which contains a list of business know-how providers. When creating an associated term, the cataloguer would have to select an entry (approver) from the list. In this way, numerous additional attributes can be documented based on the company's individual requirements.
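The "Approved By" example can be sketched as a small data model. This is only an illustration of the idea of a mandatory, list-based attribute. The class names `AttributeDefinition` and `Glossary`, the method `create_term` and the approver names are assumptions made for this sketch and are not part of any SAP Datasphere API.

```python
from dataclasses import dataclass, field


@dataclass
class AttributeDefinition:
    """A user-defined attribute of a glossary (hypothetical model)."""
    name: str
    mandatory: bool
    allowed_values: list[str]


@dataclass
class Glossary:
    """A glossary whose terms must satisfy its attribute definitions."""
    name: str
    attributes: list[AttributeDefinition] = field(default_factory=list)

    def create_term(self, term: str, definition: str, values: dict[str, str]) -> dict:
        """Validate the attribute values and return the new term as a dict."""
        for attr in self.attributes:
            value = values.get(attr.name)
            if value is None:
                if attr.mandatory:
                    raise ValueError(f"Missing mandatory attribute: {attr.name}")
                continue
            if value not in attr.allowed_values:
                raise ValueError(f"{value!r} is not a valid entry for {attr.name}")
        return {"term": term, "definition": definition, "attributes": dict(values)}


# Hypothetical glossary with a mandatory list attribute "Approved By".
finance = Glossary(
    name="Finance",
    attributes=[
        AttributeDefinition("Approved By", True, ["Controlling", "Sales Operations"]),
    ],
)

net_revenue = finance.create_term(
    "Net Revenue",
    "Revenue after discounts and returns.",
    {"Approved By": "Controlling"},
)
```

Creating a term without an approver, or with an approver that is not on the list, raises a `ValueError` in this sketch. That mirrors the behaviour described above: the cataloguer has to pick an entry from the list before the term can be saved.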
KPIs are the technical definition of the respective key figures. In addition to a detailed explanation, the formula used for the calculation is displayed. Threshold values can also be defined. This makes it immediately clear which values count as good target achievement and at which point a critical value has been passed and a review is necessary.
This creates a common understanding of what the respective KPI means and whether the business objective has been achieved, which in turn increases confidence in the KPI.
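How a formula and threshold values work together can be shown with a minimal sketch. The KPI (gross margin), the threshold values and all names in the code are assumptions chosen for illustration, and the sketch assumes a KPI where higher values are better. It does not reproduce how the catalog stores or evaluates thresholds.

```python
from dataclasses import dataclass


@dataclass
class KpiThresholds:
    """Hypothetical thresholds for a KPI where higher values are better."""
    target: float    # at or above this value the target is achieved
    critical: float  # below this value a review is necessary


def gross_margin(revenue: float, cost_of_goods_sold: float) -> float:
    """Documented formula: (revenue - cost of goods sold) / revenue."""
    return (revenue - cost_of_goods_sold) / revenue


def assess(value: float, thresholds: KpiThresholds) -> str:
    """Classify a KPI value against its thresholds."""
    if value >= thresholds.target:
        return "on target"
    if value >= thresholds.critical:
        return "warning"
    return "critical"


margin_thresholds = KpiThresholds(target=0.30, critical=0.20)
status = assess(gross_margin(200.0, 150.0), margin_thresholds)
```

With a revenue of 200 and costs of 150, the margin is 0.25, which lies between the two thresholds. Because formula and thresholds are documented together, every reader arrives at the same classification.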
In addition to the key figures, terms are used to explain the terminology used in the organization. A centrally defined business glossary contributes to a uniform understanding of business terms. In addition to the technical definition of the respective terms, keywords and synonyms can also be added to facilitate the search.
Tags help users to understand the data better and show relationships to other assets. Users can easily find all assets to which a specific tag is assigned. In addition, tags can indicate the data quality of an asset.
All these elements are assigned to the respective data product. Together with the properties and descriptions, they give users a better basis for informed decisions.
The combination of terms, tags and KPIs helps to understand the data at hand and provides additional context. This helps to assess whether this asset meets the user's requirements or not.
In this context, the option to display the origin of the data is another helpful function. In the Lineage tab, the data origin of the respective asset is visualized across multiple layers and systems. You can also see where this asset is used.
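Conceptually, the display of data origin and usage is a walk through a dependency graph. The following sketch illustrates that idea with a hand-written graph. All asset names are hypothetical, and the code does not reflect how SAP Datasphere stores or computes this information.

```python
from collections import deque

# Each key is an asset; its value lists the assets it reads from directly.
# All asset names are hypothetical.
SOURCES = {
    "Sales Story": ["V_SALES_REPORTING"],
    "V_SALES_REPORTING": ["V_SALES_HARMONIZED"],
    "V_SALES_HARMONIZED": ["T_SALES_ORDERS", "T_CUSTOMERS"],
    "T_SALES_ORDERS": [],
    "T_CUSTOMERS": [],
}


def upstream(asset: str) -> list[str]:
    """Return all direct and indirect sources of an asset, nearest first."""
    seen: list[str] = []
    queue = deque(SOURCES.get(asset, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.append(current)
        queue.extend(SOURCES.get(current, []))
    return seen


def used_by(asset: str) -> list[str]:
    """Return the assets that read directly from the given asset."""
    return sorted(name for name, sources in SOURCES.items() if asset in sources)


origin = upstream("Sales Story")
consumers = used_by("V_SALES_HARMONIZED")
```

`upstream` answers the question "where does this data come from?", while `used_by` answers "what is affected if this asset changes?".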
Data governance lifecycle
Now that the basic principles of the data catalog are clear, we can take a look at a typical workflow. This workflow comprises five steps. First, you connect the data catalog to the source system to automatically extract the metadata available there and create the assets. This is a one-off effort. Subsequent updates to the metadata are carried out without any action on your part. Currently, only SAP Datasphere and SAP Analytics Cloud systems are supported. In the future, other systems such as SAP BW, SAP HANA and SAP S/4HANA as well as ECC will be added.
The knowledge available in the company about the business processes is then collected and catalogued using KPI definitions, glossary terms and tags. This is the most time-consuming part of the workflow, as all of it is done manually. The data products (assets) from the first step are then enriched with the previously created metadata. These steps are carried out by the catalog administrators.
The catalog users can then search for the desired data based on their business expertise using defined terms, tags and KPIs. Since these have previously been linked to the data products by the administrators, users are more likely to find the right views and stories. The search is easier and more precise. Users can use the metadata to decide whether they have found the right asset and start their analysis.
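The effect of this linking on the search can be sketched in a few lines. The `Asset` class, the `search` function and the sample assets are assumptions made for this illustration. The real catalog search is more capable and is not exposed through this interface.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """A data product enriched with catalog metadata (hypothetical model)."""
    name: str
    kind: str
    tags: set[str] = field(default_factory=set)
    terms: set[str] = field(default_factory=set)
    kpis: set[str] = field(default_factory=set)


def search(assets: list[Asset], keyword: str) -> list[str]:
    """Return the names of assets linked to a tag, term or KPI of that name."""
    needle = keyword.casefold()
    return [
        asset.name
        for asset in assets
        if any(needle == label.casefold()
               for label in asset.tags | asset.terms | asset.kpis)
    ]


catalog = [
    Asset("V_SALES_REPORTING", "view",
          tags={"Certified"}, terms={"Net Revenue"}, kpis={"Gross Margin"}),
    Asset("Sales Story", "story", tags={"Certified"}, kpis={"Gross Margin"}),
    Asset("T_CUSTOMERS", "local table", terms={"Customer"}),
]

hits = search(catalog, "gross margin")
```

A user who only knows the business name of the key figure finds both the view and the story, without knowing any technical object names.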
SAP is investing in the data catalog for the long term, so continuous further development can be expected. In the short term, the aim is to deliver key functions that are still missing, such as authorizations and a transport system. In the medium term, more source systems will be supported: SAP BW, SAP HANA and the ERP systems SAP S/4HANA and ECC. In the long term, the integration between Data Catalog and Data Marketplace is the main area of work from SAP's perspective.
SAP Datasphere catalog - Our Summary
The Datasphere data catalog offers solutions to many problems that we observe in the practice of data management. It creates an efficient, centralized repository for company-wide definition of terms and KPIs. By linking these definitions to data products, it enables end users (based on their business know-how) to find the correct models and reports that are relevant to their current use case. Technical knowledge of the architecture of the data models is no longer necessary. As a result, the Datasphere data catalog makes a significant contribution to breaking down data silos.