Data Modeling & Metadata Management
Donna Burbank
Global Data Strategy Ltd.
Lessons in Data Modeling DATAVERSITY Series
October 27th, 2016
Donna Burbank
Donna is a recognized industry expert in information management with over 20 years of experience in data strategy, information management, data modeling, metadata management, and enterprise architecture.
She is currently the Managing Director at Global Data Strategy, Ltd., an international information management consulting company that specialises in the alignment of business drivers with data-centric technology. She has previously served in a number of roles related to data modeling & metadata:
• Metadata consultant (US, Europe, Asia, Africa)
• Product Manager, PLATINUM Metadata Repository
• Director of Product Management, ER/Studio
• VP of Product Marketing, Erwin
• Data modeling & data strategy implementation & consulting
• Author of 2 books on data modeling & contributor to 1 book on metadata management, plus numerous articles
• OMG committee member of the Information Management Metamodel (IMM)
As an active contributor to the data management community, she is a long-time DAMA International member and is the President of the DAMA Rocky Mountain chapter. She has worked with dozens of Fortune 500 companies worldwide in the Americas, Europe, Asia, and Africa and speaks regularly at industry conferences. She has co-authored two books, Data Modeling for the Business and Data Modeling Made Simple with ERwin Data Modeler, and is a regular contributor to industry publications such as DATAVERSITY, EM360, & TDAN. She can be reached at donna.burbank@globaldatastrategy.com
Donna is based in Boulder, Colorado, USA.
Follow on Twitter @donnaburbank
Today’s hashtag: #LessonsDM
Global Data Strategy, Ltd. 2016
Lessons in Data Modeling Series
This Year’s Line Up
• July 28th
Why a Data Model is an Important Part of your Data Strategy
• August 25th
Data Modeling for Big Data
• September 22nd
UML for Data Modeling – When Does it Make Sense?
• October 27th
Data Modeling & Metadata Management
• December 6th
Data Modeling for XML and JSON
Agenda
What we’ll cover today
• How data modeling fits within a larger metadata management landscape
• When can data modeling provide “just enough” metadata management
• Key data modeling artifacts for metadata
• Organization, roles & implementation considerations
• Summary & questions
Metadata is Hotter than ever
A Growing Trend
In a recent DATAVERSITY survey, over 80% of
respondents stated that:
Metadata is as important, if not more
important, than in the past.
What is Metadata?
Metadata is Data In Context
Metadata is the “Who, What, Where, Why, When & How” of Data
Who
• Who created this data?
• Who is the Steward of this data?
• Who is using this data?
• Who “owns” this data?
• Who is regulating or auditing this data?
What
• What is the business definition of this data element?
• What are the business rules for this data?
• What is the security level or privacy level of this data?
• What is the abbreviation or acronym for this data element?
• What are the technical naming standards for database implementation?
Where
• Where is this data stored?
• Where did this data come from?
• Where is this data used & shared?
• Where is the backup for this data?
• Are there regional privacy or security policies that regulate this data?
Why
• Why are we storing this data?
• What is its usage & purpose?
• What are the business drivers for using this data?
When
• When was this data created?
• When was this data last updated?
• How long should it be stored?
• When does it need to be purged/deleted?
How
• How is this data formatted? (character, numeric, etc.)
• How many databases or data sources store this data?
Metadata is Part of a Larger Enterprise Landscape
A Successful Data Strategy Requires Many Inter-related Disciplines
“Top-Down” alignment with
business priorities
Managing the people, process,
policies & culture around data
Leveraging & managing data for
strategic advantage
Coordinating & integrating
disparate data sources
“Bottom-Up” management &
inventory of data sources
Metadata Across & Beyond the Organization
• Metadata exists in many sources across & beyond the organization.
[Diagram of metadata sources: COBOL / JCL, Legacy Systems, Data Models, Spreadsheets, Media, Databases, Data In Motion, Documents, Social Media, Open Data, IoT]
Types of Metadata
• The DATAVERSITY Emerging Trends in Metadata survey revealed some interesting findings about
what types of metadata organizations will be managing now and in the future.
[Chart: types of metadata managed now vs. in the future, with a marker for the types supported by most data modeling tools]
Data Models are a Good Source of Metadata
• Data Models are another good source of both business & technical metadata for relational
databases.
• They store structural metadata as well as business rules & definitions.
• Key relationships are also stored to provide lineage & impact analysis.
Customer
Attribute (Business Metadata)   Data Type & Nullability (Technical Metadata)
Customer_ID                     CHAR(18) NOT NULL
First Name                      CHAR(18) NOT NULL
Last Name                       CHAR(18) NOT NULL
City                            CHAR(18) NULL
Date Purchased                  CHAR(18) NULL
Data vs. Metadata
Customer
Metadata (column headings):
First Name   Last Name   Company             City       Year Purchased
Data (rows):
Joe          Smith       Komputers R Us      New York   1970
Mary         Jones       The Lord’s Store    London     1999
Proful       Bishwal     The Lady’s Store    Mumbai     1998
Ming         Lee         My Favorite Store   Beijing    2001
Data vs. Metadata
Customer
Metadata? (column headings):
STR01        STR02       TXT123              TXT127     DT01
Data (rows):
Joe          Smith       Komputers R Us      New York   1970
Mary         Jones       The Lord’s Store    London     1999
Proful       Bishwal     The Lady’s Store    Mumbai     1998
Ming         Lee         My Favorite Store   Beijing    2001
Metadata adds Context & Definition
Customer
First Name   Last Name   Company             City       Year Purchased
Joe          Smith       Komputers R Us      New York   1970
Mary         Jones       The Lord’s Store    London     1999
Proful       Bishwal     The Lady’s Store    Mumbai     1998
Ming         Lee         My Favorite Store   Beijing    2001
City: Is this the city where the customer lives or where the store is located?
Last Name:
• Definition: Last Name represents the surname or family name of an individual.
• Business Rules: In the Chinese market, family name is listed first in salutations.
• Format: VARCHAR(30)
• Abbreviation: LNAME
• Required: YES
• Etc.: Numerous technical & business metadata including security, privacy, nullability, primary key, etc.
Technical & Business Metadata
• Technical Metadata describes the structure, format, and rules for storing data
• Business Metadata describes the business definitions, rules, and context for data.
• Data represents actual instances (e.g. John Smith)
Technical Metadata
CREATE TABLE EMPLOYEE (
    employee_id      INTEGER NOT NULL,
    department_id    INTEGER NOT NULL,
    employee_fname   VARCHAR(50) NULL,
    employee_lname   VARCHAR(50) NULL,
    employee_ssn     CHAR(9) NULL);
CREATE TABLE CUSTOMER (
    customer_id      INTEGER NOT NULL,
    customer_name    VARCHAR(50) NULL,
    customer_address VARCHAR(150) NULL,
    customer_city    VARCHAR(50) NULL,
    customer_state   CHAR(2) NULL,
    customer_zip     CHAR(9) NULL);
Business Metadata
Term: Employee
Definition: An employee is an individual who currently works for the organization or who has been recently employed within the past 6 months.
Term: Customer
Definition: A customer is a person or organization who has purchased from the organization within the past 2 years and has an active loyalty card or maintenance contract.
Data
John Smith
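As a rough illustration of technical metadata, the sketch below loads the CUSTOMER table into an in-memory SQLite database and reads the column names, data types, and nullability back from the database catalog. SQLite, the helper name technical_metadata, and the dropped explicit NULL keywords are assumptions made for the example; the slides do not name a specific database.

```python
import sqlite3

# DDL adapted from the CUSTOMER table above. Columns are nullable unless
# NOT NULL is stated, so the explicit NULL keywords are left out here.
DDL = """
CREATE TABLE CUSTOMER (
    customer_id      INTEGER NOT NULL,
    customer_name    VARCHAR(50),
    customer_address VARCHAR(150),
    customer_city    VARCHAR(50),
    customer_state   CHAR(2),
    customer_zip     CHAR(9)
);
"""


def technical_metadata(conn, table):
    """Return one dict per column: name, declared type, and nullability."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [
        {"column": name, "data_type": col_type, "nullable": not notnull}
        for _cid, name, col_type, notnull, _default, _pk in rows
    ]


conn = sqlite3.connect(":memory:")
conn.execute(DDL)
columns = technical_metadata(conn, "CUSTOMER")
for col in columns:
    print(col)
```

Nothing here captures the business definition of a Customer; that part still has to come from a glossary or a data model.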
Business vs. Technical Metadata
• The following are examples of types of business & technical metadata.
Business Metadata
• Definitions & Glossary
• Data Steward
• Organization
• Privacy Level
• Security Level
• Acronyms & Abbreviations
• Business Rules
• Etc.
Technical Metadata
• Column structure of a database table
• Data Type & Length (e.g. VARCHAR(20))
• Domains
• Standard abbreviations (e.g. CUSTOMER -> CUST)
• Nullability
• Keys (primary, foreign, alternate, etc.)
• Validation Rules
• Data Movement Rules
• Permissions
• Etc.
Levels of Data Modeling
• Conceptual: Business Concepts (Business Metadata)
• Logical: Data Entities
• Physical: Physical Tables (Technical Metadata)
Business Definitions
From Data Modeling for the Business by
Hoberman, Burbank, Bradley, Technics
Publications, 2009
Non-Traditional Sources
Not all metadata is in a
relational database
Human Metadata
Avoid the dreaded “I just know”
• Much business metadata and the history of the business exists in employees’ heads.
• It is important to capture this metadata in an electronic format for sharing with others.
• Avoid the dreaded “I just know”
Example: “Part Number is what used to be called Component Number before the acquisition.”
Capture in: Business Glossary, Metadata Repository, Data Models, etc.
Data Modeling in the Big Data Ecosystem
[Diagram: data sources (structured data, semi-structured data such as JSON / XML, and unstructured data) feeding the Hadoop framework: HDFS file system, Hive (HQL), HBase, and MapReduce / analytics]
COBOL Copybook Metadata
• What is a COBOL copybook? In COBOL, a copybook file is used to define data elements that can be referenced by many programs.
• What is COBOL copybook metadata? The structure and definition of those data elements; it describes the structure & format of the data.
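To make the idea concrete, here is a minimal sketch that pulls structural metadata (level number, field name, PIC clause) out of a copybook. The copybook fragment, the regular expression, and the function name parse_copybook are all hypothetical and only cover simple one-line entries; real copybooks also use OCCURS, REDEFINES, and multi-line clauses that this does not handle.

```python
import re

# Hypothetical copybook fragment, invented for illustration.
COPYBOOK = """\
       01  CUSTOMER-RECORD.
           05  CUST-ID        PIC 9(8).
           05  CUST-NAME      PIC X(30).
           05  CUST-CITY      PIC X(20).
"""

# Level number, data name, optional PIC clause, terminating period.
FIELD = re.compile(r"^\s*(\d{2})\s+([A-Z0-9-]+)(?:\s+PIC\s+([^\s.]+))?\.\s*$")


def parse_copybook(text):
    """Return a list of dicts describing each data element found."""
    fields = []
    for line in text.splitlines():
        match = FIELD.match(line)
        if match:
            level, name, picture = match.groups()
            fields.append({"level": int(level), "name": name, "picture": picture})
    return fields


fields = parse_copybook(COPYBOOK)
for field in fields:
    print(field)
```

Group items such as CUSTOMER-RECORD carry no PIC clause, so their picture is None.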
ERP/CRM and Packaged Application Metadata
• Packaged applications such as CRM and ERP systems (e.g. Salesforce, PeopleSoft, etc.) are
typically based on a relational database system.
• Therefore, there is important metadata about both the physical table structures as well as the
business names & definitions.
Technical Metadata
Business Metadata
Relationship Metadata
Showing How
Information Interrelates
Data Lineage - Data Warehousing Example
• In the data warehouse example below, metadata for CUSTOMER exists in a
number of tools & data stores.
• This lineage can be tracked in most data modeling tools.
[Diagram: CUSTOMER metadata traced across a Business Glossary, a Logical Data Model, Physical Data Models, a Dimensional Data Model, ETL tools, database tables (named CUSTOMER, CUST, and TBL_C1), and a BI tool Sales Report]
Metadata Discovery Tools
• Metadata Discovery Tools extract metadata from source systems and rationalize
it to a common metamodel and storage facility.
[Diagram: Metadata Discovery Tools, Metadata Population, Metamodel(s), Metadata Storage (Repository), Metadata Storage (Database)]
Impact Analysis & Where Used
• Impact Analysis shows the relationship between a piece of metadata and other sources that rely
on that metadata to assess the impact of a potential change.
• For example, if I change the length & name of a field, what other systems that are referencing
that field will be affected?
“What happens if I change the name & length of the “Brand” field?”
[Diagram: Customer Database (Oracle), ETL Staging Area, Sales Database (DB2), and Sales Application, with the field shown as Brand CHAR(10) and MyBrand VARCHAR(30)]
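Impact analysis can be thought of as walking a dependency graph downstream from the item being changed. The sketch below assumes a hypothetical set of "is read by" links between the systems named on the slide; the exact flow between them, the dotted names, and the function impacted_by are assumptions for illustration, not the actual lineage from the diagram.

```python
from collections import deque

# Hypothetical "is read by" edges: each key feeds the items in its list.
DEPENDENCIES = {
    "CustomerDB.Brand": ["ETL.StagingArea.Brand"],
    "ETL.StagingArea.Brand": ["SalesDB.MyBrand"],
    "SalesDB.MyBrand": ["SalesApplication.BrandField"],
}


def impacted_by(changed, dependencies):
    """Breadth-first walk returning everything downstream of `changed`."""
    seen = []
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for consumer in dependencies.get(current, []):
            if consumer not in seen:
                seen.append(consumer)
                queue.append(consumer)
    return seen


affected = impacted_by("CustomerDB.Brand", DEPENDENCIES)
print(affected)
```

Reversing the edges and running the same walk would answer the lineage question instead: where did this field come from?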
Design Layer Relationships
• In a data model there are several design layers that describe a given data
concept.
Organization, Roles &
Implementation Considerations
Ensuring that metadata is used effectively
across the organization
Who Uses Metadata?
• In addition to sharing metadata between tools and via export,
many users across both IT & the business want to view the metadata through
reports, portals, etc.
• Developer: “If I change this field, what else will be affected?”
• Business Person (e.g. Finance): “What’s the definition of “Regional Sales”?”
• Data Architect: “What is the approved data structure for storing customer data?”
• Auditor: “How was “Total Sales” calculated? Show me the lineage.”
• Data Warehouse Architect: “What are the source-to-target mappings for the DW?”
• Business Person (e.g. HR): “How can I get new staff up-to-speed on our company’s business terminology?”
Metadata is Needed by Business Stakeholders
Making business decisions on accurate and well-understood data
80% of users of metadata are from
the business, according to the
recent DATAVERSITY survey.
Metadata Publication & Reporting – Business Glossary
• A Business Glossary is a common way to publish business terms & their definitions.
• When sourced from a common repository, these terms are integrated with the wider data
landscape.
• Most data modeling tools can take the definitions from Logical and/or Conceptual data models
and publish them to a Glossary-style format, via web portals or reports.
Business Term: BFPO Number
  Abbreviation: BFPO Num
  Definition: BFPO Number is for British Forces Postal Office. It can be used in UK and overseas addresses.
  Data Steward: Accounting
  Security Level: Unclassified
Business Term: Interest
  Abbreviation: Int
  Definition: The growth in capital of a monetary investment
  Data Steward: Finance
  Security Level: Unclassified
Business Term: PO Box
  Abbreviation: POB
  Definition: A numbered box in a post office assigned to a person or organization, where mail for them is kept until collected
  Data Steward: Accounting
  Security Level: Unclassified
A feedback mechanism is important to gather valuable input & updates from users.
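Publishing a glossary from model definitions can be sketched in a few lines. The example below uses the three terms from the slide; the GlossaryTerm class, its field names, and the publish function are hypothetical and do not represent the export format of any particular data modeling tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GlossaryTerm:
    """One glossary entry; field names are illustrative only."""
    term: str
    abbreviation: str
    definition: str
    steward: str
    security_level: str


TERMS = [
    GlossaryTerm("PO Box", "POB",
                 "A numbered box in a post office assigned to a person or "
                 "organization, where mail for them is kept until collected",
                 "Accounting", "Unclassified"),
    GlossaryTerm("Interest", "Int",
                 "The growth in capital of a monetary investment",
                 "Finance", "Unclassified"),
    GlossaryTerm("BFPO Number", "BFPO Num",
                 "BFPO Number is for British Forces Postal Office. It can be "
                 "used in UK and overseas addresses.",
                 "Accounting", "Unclassified"),
]


def publish(terms):
    """Render terms alphabetically as one glossary line each."""
    lines = []
    for t in sorted(terms, key=lambda t: t.term.lower()):
        lines.append(
            f"{t.term} ({t.abbreviation}): {t.definition} "
            f"[Steward: {t.steward}; Security: {t.security_level}]"
        )
    return "\n".join(lines)


glossary_text = publish(TERMS)
print(glossary_text)
```

In practice the same records would more likely feed a web portal or report than a plain-text listing.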
Metadata Publication & Reporting – Lineage
• Data Lineage can be visualized through a web portal or reports.
• With web-based reporting, users can drill-down into each data source and investigate further
lineage.
Metadata Publication & Reporting – Data Structures
• Having a common view of standard data structures is helpful for data architects, developers, etc.
• This can all be sourced from a data model.
Table Name: CUSTOMER
  Column Name: CUST_ID
    Attribute Name: Customer Identifier
    Data Type: VARCHAR(20)
    Nullability: NOT NULL
    Primary Key: Yes
    Definition: Customer ID is the unique identifier that locates a customer
  Column Name: F_NAME
    Attribute Name: First Name
    Data Type: VARCHAR(30)
    Nullability: NOT NULL
    Primary Key: No
    Definition: The given name of an individual
  Column Name: L_NAME
    Attribute Name: Last Name
    Data Type: VARCHAR(40)
    Nullability: NOT NULL
    Primary Key: No
    Definition: The family name of an individual
Table Name: ORDER
  Column Name: ORDER_ID
    Attribute Name: Order Identifier
    Data Type: VARCHAR(10)
    Nullability: NOT NULL
    Primary Key: Yes
    Definition: The number assigned to an order from the FIX10 system that locates a unique order.
Etc.
Data Models can provide “Just Enough” Metadata Management
• While data modeling tools are not metadata repositories, nor designed to be, they offer many features shared with these repository solutions:
• Metadata storage, data lineage visualization, Business Glossary, integration with BI tools, ETL tools, etc.
• Metadata repositories have a broader range of metadata sources & dedicated metadata management support.
• And Data Modeling tools, of course, have the added benefit of doing data modeling!
• And the benefit is that much of the needed metadata is in these data models.
[Comparison matrix]
Tool types compared:
• Data Modeling Tools (e.g. Erwin, SAP PowerDesigner, Idera ER/Studio)
• Metadata Repositories (e.g. ASG, Adaptive)
• Data Governance Tools (e.g. Collibra, Diaku)
• Spreadsheets
Capabilities compared:
• Metadata Storage
• Metadata Lifecycle & Versioning
• Data Modeling
• Metadata Discovery & Integration w/ Other Tools
• Customizable Metamodel
• Data Lineage Visualization
• Business Glossary
Key Components of Metadata Management
Metadata Strategy
• Alignment with business goals & strategy
• Identification of & feedback from key stakeholders
• Prioritization of key activities aligned with business needs & technical capabilities
• Prioritization of key data elements/subject areas
• Communication Plan developed
Metadata Capture & Storage
• Identification of all internal & external metadata sources
• Population/import mechanism for all identified sources
• Identification of existing metadata storage
• Definition of enterprise metadata storage strategy
Metadata Integration & Publication
• Identification of all technical metadata sources
• Identification of key stakeholders & audiences (internal & external)
• Integration mechanism for key technologies (direct integration, export, etc.)
• Publication mechanism for each audience
• Feedback mechanism for each audience
Metadata Management & Governance
• Metadata roles & responsibilities defined
• Metadata standards created
• Metadata lifecycle management defined & implemented
• Metadata quality statistics defined & monitored
• Metadata integrated into operational activities & related data management projects
Implementing a Metadata Strategy
• A successful metadata strategy requires input from multiple factors.
Business Drivers & Motivation
Stakeholders & Audience
Metadata
Strategy
Metadata Management Maturity
Metadata Sources & Technology
Stakeholder Feedback
• Determine key business issues & drivers through direct feedback.
• “There is limited ownership or enforcement of common practices and standards across the projects.”
• “I didn’t know we had any documented data standards.”
• “We have 15 customer databases – with many duplications.”
• “I just joined the company and don’t understand all of the acronyms!”
• “I need a central, accurate view of all my customers worldwide.”
• “There was an error in reporting products by customer & region that was noticed by upper management.”
• “$12m has been spent on projects to clean up the data over the past 2-3 years.”
• “Where do I go to get the definition of “default banking standard”?”
• “Key subject matter experts are relied upon to review detailed data from various systems to ensure accuracy.”
• “What are the data structures used in the application?”
Mapping Business Drivers to Metadata Management Capabilities
Business Drivers (external & internal)
• Digital Self Service
• Online Community & Social Media
• Increasing Regulatory Pressures
• Community Building
• Brand Reputation
• 360 View of Customer
• Targeted Marketing
Stakeholder Challenges
• Lack of Business Alignment: data spend not aligned to Business Plans; business users not involved with data
• Integrating Data: siloed systems; no common view of key information
• Efficient IT: system redundancy; no reuse or standards
• No Audit Trails: no lineage of changes; fines had been levied in past for lack of compliance
• Data Quality Issues: bad customer info causing Brand damage; completeness & accuracy needed
• Cost of Data Management: manual entry increases costs
• Big Data Exploitation: exploiting unstructured data; access to external & social data
Metadata Capabilities
• Metadata Strategy
• Metadata Capture & Storage
• Metadata Integration & Publication
• Metadata Management & Governance
The numbered challenges are mapped against each capability to show a “heat map” of priorities.
Inventory & Usage Mapping
• It’s also important to determine which teams are using these technologies to
create a “heat map” of usage & priority.
[Usage matrix]
Metadata sources (rows):
• Relational Databases: MySQL, Oracle, SQL Server, Sybase, etc.
• BI Tools: Tableau, Qlik, etc.
• Open Data: Data.gov – agricultural data, etc.
Teams (columns): Leadership, Sales, Finance, Marketing, Support, R&D, HR, Legal, Compliance
An X marks each team that uses a given source.
Metadata Roles & Responsibilities
• It’s important to establish formal roles & responsibilities for your metadata effort.
• Some may be part-time, and some full-time, but they should be clearly defined and
communicated so that staff has understanding of and accountability for their roles.
• Executive Sponsor/Champion: Understands & communicates the importance of metadata
management across the organization.
• Steering Group: As part of a metadata management effort, or part of a larger data governance effort,
the steering group prioritizes & sets direction for key activities.
• Data Stewards: Responsible for business definitions & rules for key data elements.
• Metadata Repository Administrator: Manages the administration, population, and interfaces of a
metadata repository.
• Metadata Publicist: Establishes reports & publication methods to end users.
• Metadata Consumers: Actively use metadata as part of their daily jobs, and are held accountable for
using published standards.
• Data Modelers
• Developers
• Business Users
• Report Developers
• Etc.
Monitoring Metadata Quality & Metrics
• Metadata is a key driver of data quality, and to support this, the metadata itself must be of high
quality.
• In order to ensure that quality metadata is maintained, it must be actively managed and
monitored. Dashboards & Reports can be used to monitor key quality indicators.
• Key metadata quality indicators include:
• Completeness: e.g. Do definitions exist for all key data elements?
• Accuracy: e.g. Are current definitions correct? Do data types accurately represent currently
implemented standards?
• Currency/ Timeliness: e.g. Are metadata definitions current or outdated?
• Consistency: e.g. Are metadata standards defined, published & implemented consistently across the
organization?
• Accountability: e.g. Are data stewards or owners defined?
• Integrity: e.g. Are linkages and relationships established between critical metadata items?
• Privacy: e.g. Is any metadata subject to privacy restrictions?
• Usability: e.g. Are people actually using this metadata?
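Two of the indicators above, completeness and accountability, lend themselves to a simple coverage calculation that a dashboard could track over time. The sketch below is a rough illustration: the element records reuse column names and definitions from the data structures slide, while the steward assignments, the missing L_NAME definition, and the coverage function are hypothetical.

```python
# Hypothetical catalog extract: one record per key data element.
ELEMENTS = [
    {"name": "CUST_ID",
     "definition": "Customer ID is the unique identifier that locates a customer",
     "steward": "Accounting"},
    {"name": "F_NAME",
     "definition": "The given name of an individual",
     "steward": None},
    {"name": "L_NAME",
     "definition": "",          # definition not yet captured (assumed)
     "steward": None},
    {"name": "ORDER_ID",
     "definition": "The number assigned to an order from the FIX10 system "
                   "that locates a unique order.",
     "steward": "Finance"},
]


def coverage(elements, field):
    """Fraction of elements whose `field` is present and non-blank."""
    if not elements:
        return 0.0
    filled = sum(1 for e in elements if (e.get(field) or "").strip())
    return filled / len(elements)


completeness = coverage(ELEMENTS, "definition")   # do definitions exist?
accountability = coverage(ELEMENTS, "steward")    # are stewards defined?
print(f"Completeness: {completeness:.0%}, Accountability: {accountability:.0%}")
```

Indicators such as accuracy or usability need human review or usage logs and cannot be derived from a count like this.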
Summary
• Metadata is more important than ever
• Data models are a rich source of metadata
• While metadata repositories are valuable, data models & associated functionality can often
provide “just enough” metadata management
• Business definitions
• Technical data structures
• Data lineage & impact analysis
• Visual models
• Organizational considerations are critical to achieve success
• Understanding business drivers
• Defining roles & responsibilities
• Monitoring metadata quality & metrics
• Have fun! Metadata is for the cool kids.
About Global Data Strategy, Ltd
Data-Driven Business Transformation
• Global Data Strategy is an international information management consulting company that specializes
in the alignment of business drivers with data-centric technology.
• Our passion is data, and helping organizations enrich their business opportunities through data and
information.
• Our core values center around providing solutions that are:
• Business-Driven: We put the needs of your business first, before we look at any technology solution.
• Clear & Relevant: We provide clear explanations using real-world examples.
• Customized & Right-Sized: Our implementations are based on the unique needs of your organization’s
size, corporate culture, and geography.
• High Quality & Technically Precise: We pride ourselves in excellence of execution, with years of
technical expertise in the industry.
Business Strategy
Aligned With
Data Strategy
Visit www.globaldatastrategy.com for more information
Contact Info
• Email:
donna.burbank@globaldatastrategy.com
• Twitter:
@donnaburbank
@GlobalDataStrat
• Website:
www.globaldatastrategy.com
• Company Linkedin:
https://www.linkedin.com/company/global-data-strategy-ltd
• Personal Linkedin:
https://www.linkedin.com/in/donnaburbank
White Paper: Emerging Trends in Metadata Management
Free Download
• Download from www.dataversity.net
• Under ‘Whitepapers’
DATAVERSITY Training Center
Online Training Courses
New Metadata Management Course
• Learn the basics of Metadata Management and practical tips on how to apply metadata
management in the real world. This online course hosted by DATAVERSITY provides a series of six
courses including:
• What is Metadata
• The Business Value of Metadata
• Sources of Metadata
• Metamodels and Metadata Standards
• Metadata Architecture, Integration, and Storage
• Metadata Strategy and Implementation
• Purchase all six courses for $399 or individually at $79 each.
• Other courses available on Data Governance & Data Quality
Visit: http://training.dataversity.net/lms/
Lessons in Data Modeling Series
Join us next time
• July 28th
Why a Data Model is an Important Part of your Data Strategy
• August 25th
Data Modeling for Big Data
• September 22nd
UML for Data Modeling – When Does it Make Sense?
• October 27th
Data Modeling & Metadata Management
• December 6th
Data Modeling for XML and JSON
Questions?
Thoughts? Ideas?