Data Governance Interview - Alex Leigh


Alex has worked with over fifty UK universities, most of the sector agencies, including the Universities and Colleges Admissions Service (UCAS), the Higher Education Statistics Agency (HESA) and the Quality Assurance Agency for Higher Education (QAA), and a host of practitioners in the Higher Education (HE) sector. Alex designed and developed the HEDIIP data capability framework, led the team that created the HESA in-year collection model, and designed both a sector-level and a HESA instance of a best practice data governance approach. Alex is currently working with universities to develop and implement their Data Governance frameworks.

How long have you been working in Data Governance?

I was working in DG before anyone really called it DG! Around fifteen years via data architecture and running data management teams.

Some people view Data Governance as an unusual career choice, would you mind sharing how you got into this area of work?

My route in was via developing and implementing Enterprise Architecture frameworks while I was working in Deloitte UK’s consulting practice. There was a lot of maturity around the business, infrastructure and application architecture, but data felt very fragmented between the technology of building warehouses and databases and the link to the business objectives.

So I focused my efforts on creating frameworks and approaches to pragmatically align how data was managed to how it was used. That was fifteen years ago and I’m still trying to change the perception around data.

What characteristics do you have that make you successful at Data Governance and why?

Passion for the right outcomes, resilience when I can’t achieve them and an ethos for openness, transparency and showing everyone why doing things differently isn’t just about the organisation, it’s about helping them.

Are there any particular books or resources that you would recommend as useful support for those starting out in Data Governance?

I’m a big fan of John Ladley and use his books as my primary source of reference. Bob Seiner has some great ideas about implementing DG and I’m an avid reader of TDAN. I also carry around the latest DMBOK, which is a big improvement on the original in terms of applicability.

There’s so much good stuff online now. I’d encourage any new practitioners to read and research lots of different ideas, as no one size fits all.

What is the biggest challenge you have ever faced in a Data Governance implementation?

Believing everyone was as passionate, sold and committed as I was to doing things differently, and then losing support once the initial enthusiasm had lapsed, both from those working with data every day and from those who had the budget, resource and will to help change the way data was governed. Now I am very careful to make sure everyone is starting in the same place, and I make far fewer assumptions around priorities.

Is there a company or industry you would particularly like to help implement Data Governance for and why?

I love working in Higher Education (HE). It’s collaborative, desperately in need of the professionalization of its data assets, and – finally – ready to make some of these changes. I’ve worked in 7 industry sectors, but I’ll never move out of HE. If I can help make a difference to this sector, it feels like a really important thing to do. Having kids about to enter the HE environment makes this personal as well as important.

What single piece of advice would you give someone just starting out in Data Governance?

Don’t try to fix everything at once. At the heart of good DG is changing the behaviours of many people. This is not a simple thing. Talking about frameworks or technology and the like early in a DG initiative is not helpful. Find some people who share your passion for data, look for quick wins where doing things differently makes a measurable change, and communicate way more than you think you need to.

Finally, I wondered if you could share a memorable data governance experience (either humorous or challenging)?

I once asked an attendee at a workshop why she looked so glum. Her reply was: "This is how I feel about data. I work in planning and I know at the top of the ‘data mountain’ it is sunny and lovely and all the data is perfect. The data gets to me in this dark valley at the bottom of the mountain via a stream that all the sheep have wee’d in. That’s why I look so sad."

It was an amusing anecdote but made me realise that you really have to show people how things can be better, before actually asking them to do something about it. 

Data Governance Interview - Dr. Irina Steenbeek


Irina is a dedicated Senior Data Management, Financial and IT Professional with 15+ years of extensive experience in data management, software implementation, financial and business control, and project management. During the past year, she has also immersed herself in data science. She sees that big data and data science have a huge potential for business development. These areas are crucial for the development of data management, including data governance.

How long have you been working in Data Governance?

It has been over 7 years now, since I first started working in data management. During this time, I have also engaged a lot in Data Governance, mostly through hands-on experience in SMEs and large international organizations.

Some people view Data Governance as an unusual career choice, would you mind sharing how you got into this area of work?

For me, this career path started quite organically as a result of several challenges in related areas. The first challenge was to create an automated solution for management accounting reporting within an organization. During the development of an organization-wide reporting system, I came across various issues, such as massive numbers of non-reconcilable Excel reports and the presence of three separate reporting platforms. This led to the implementation of a data warehouse in this company. This implementation also highlighted data quality issues. This was not all, but it started a long journey towards the management of the data.

What characteristics do you have that make you successful at Data Governance and why?

I think that well-developed communication skills always come in handy. Establishing a common business language between your colleagues and partners, as well as clear communication with the stakeholders, helps a lot if you want to set up a good data management structure. A data management function connects various departments of a company, and it is great if you can get the practical support of top management, or work in an environment and culture that enables you to execute your function in the most effective way possible.

A while ago I wrote a blog post on this topic: ‘Am I a successful Data Manager’. Feel free to consult it for more tips that I have to share.

Are there any particular books or resources that you would recommend as useful support for those starting out in Data Governance?

Recently I came across The Practitioner’s Guide to Data Quality Improvement by David Loshin, published by Morgan Kaufmann in 2010. I think this book is quite useful for DQ professionals.

But in general, from my practical experience of working closely with business stakeholders, I have reached one conclusion: people are not interested in knowing what they need to do. They are eager to know how to do what is expected of them.

At an early stage of my professional development I used DAMA DM-BOK as my main reference point. And of course, I have done a fair share of internet browsing and analyzed the materials I could find, originating from numerous sources. So far, I haven’t been able to find any practical guide that could help data management professionals who are just starting out.

This experience gave me the idea to write my own ‘The Data Management Cookbook’, a brief summary book on Data Management, which is here. The extended ‘Do It Yourself’ Data Management guide will be available in June 2018.

What is the biggest challenge you have ever faced in a Data Governance implementation?

There were several; it is difficult to say which one was the biggest or most important. The first one is finding your own way to implement data management, and figuring out where to start. The second one is convincing data owners that they have to take on responsibility for their data. And the third one is more technical: the investigation of data quality issues caused by applications.

Is there a company or industry you would particularly like to help implement Data Governance for and why?

In my career, I have worked in all kinds of companies, large and small. All in all, I think I prefer small and medium businesses to large and famous multinationals.  Why? I know it doesn’t sound very ambitious. Smaller companies feel more like a ‘family’. And I appreciate this a lot, as this feeling creates a great work environment. You know everyone and you can get things done quickly and efficiently. It is easier to involve your colleagues in all the processes, which is crucial for successful data management. Even if the resources are limited, you can still deliver tangible results in a short period of time.

What single piece of advice would you give someone just starting out in Data Governance?

You do not set up data management just for fun. You do it because you have certain needs; a certain goal will drive you to do it. GDPR or data quality are the best examples. My advice is to first get a clear picture of what your goal, your driver, is. As soon as you know the direction you should be working in, it will be easier to figure out which components will need most of your attention and in which order you should proceed.

Finally, I wondered if you could share a memorable data governance experience (either humorous or challenging)?

When implementing data management, you have to ensure collaboration between various stakeholders. These are people with different interests, needs and tempers, and you are the one who has to combine them all into one effective working system. One of the most memorable moments in my career was a question from one such stakeholder: ‘Where do you get so much patience to permanently stay calm, have good relationships with all of us and still keep going?’. I guess only a data manager would know. :-)

Data Governance Interview - Matt Becker


 Currently, Matt serves as the Managing Director of Sullexis’ Enterprise Data Strategy and Solutions practice. He has spent 18+ years creating and implementing strategies that drive client performance through technology adaptation in areas ranging from big data to enterprise data management to business intelligence and analytics.  He enjoys delivering the value gained by implementing solid information management principles, thereby reducing inefficiencies and gaining insight into overall operational performance.

How long have you been working in Data Governance?

I spent the early part of my career helping to design and implement various data warehouses and analytics for the Energy Trading & Risk Management (ETRM) industry. Because of the regulations involved in ETRM projects, starting around 2005, I began incorporating data management procedures and reporting requirements to identify, verify, and track adherence to the data regulations associated with those DW initiatives. In fact, every data project that I now oversee has some sort of DG set of deliverables.

Some people view Data Governance as an unusual career choice, would you mind sharing how you got into this area of work?

It was a combination of accident and necessity. Because of the reporting and visualization work I was doing on various BI / Data Warehouse implementations for the Energy industry early in my career, I started seeing a pattern emerge. The data quality after the go-live of the data warehouse was usually very high but quickly degraded. Many times, this was a result of the tools being used and the people involved not following a standard approach or adhering to an agreed set of guidelines to maintain the availability, usability, integrity and security of the data.

So, I typically became the person who would build that data quality discipline into the project, usually in the form of a roles and accountability matrix. This matrix would define the data requirements and standards needed for ongoing support and maintenance. Over time this evolved into working with a number of clients on specific DG initiatives, helping to provide a framework and a playbook for the overall management of data to drive quality, consistency, and usable insights.

What characteristics do you have that make you successful at Data Governance and why?

Good blend of having been a business analyst, developer, tester, architect, and project manager. Having done all of these roles throughout my career has given me valuable exposure to all aspects of a solution lifecycle (planning, design, architecture, code development, testing, implementation, and deployment).  Understanding of this framework for a typical project enables you to apply a similar framework and approach for Data Governance methodology (i.e. planning of roles needed, design of the roles and processes to be used, active communication and coordination across both IT and Business functions).

In addition, deep domain knowledge is a plus in helping to shape the data governance priorities between the IT systems and the business operations. For example, I have spent quite a few years in the Upstream Energy sector, where Well Data mastering is a significant challenge. Too many times, companies focus on just throwing technology at the problem to try to solve their data availability and reliability issues. However, the issue was that the data in system A did not match system B because the data wasn’t properly defined (i.e. a Well Legal Name vs. Well Short Name, or Spud Date vs. Drill Date), resulting in more expense and time than was really needed. In these situations, the company in question has not prioritized its efforts to first develop a common Well language, using DG standards, to ensure a foundational understanding of the data. Often, when this step is done first, portions of the existing technology and solutions available in-house can be repurposed, saving on the overall expense and allocating those funds to the more prescriptive technological solutions needed.

In fact, at Sullexis, we have a section of blogs on our website (http://sullexis.com/blog/) that talk about combining the right data practices with technology to improve such things as data migrations and on-going data quality through practical DG practices.  One of our most recent blogs focuses on creating a common data language to ensure there is a mutual understanding of core concepts central to the company’s operations.

Are there any particular books or resources that you would recommend as useful support for those starting out in Data Governance?

  • DAMA’s DMBOK v2 – A fundamental guide to data governance
  • Manager Tools.com (https://www.manager-tools.com/) – you have to understand how to manage teams, individuals, projects, and processes in order to be effective in implementing data governance.
  • https://www.nicolaaskham.com/ - your website has a lot of great articles and blogs that serve as a very good aggregator of data governance knowledge from a broad and varied set of sources.
  • Practical Data Migration (Johnny Morris) – great book for working data migration efforts, which is many times how Data Governance gets introduced into an organization.
  • Visualization and Reporting Tools: Tableau, Spotfire, Business Objects, MS Power BI – understanding how reporting and analytic tools function, how they are implemented, and how end users utilize them is important for knowing how to combine their use with DG methodology.

What is the biggest challenge you have ever faced in a Data Governance implementation?

The big challenge is the same one I face at almost every client I work with where Data Governance is a new concept…understanding.  There are many companies that don’t realize that one of their most valuable and key assets is their data.  If you think of data as a garden, one needs to take the time to remove the weeds (bad data), clean out the clutter and debris (duplication), till and care for the soil (managing your technology stack), and properly feed and water (define and execute your roles and procedures).  These on-going activities (data governance process) result in the garden yielding a good crop (high quality, reliable data).   I like to take the time to engage and educate the right levels in the organization to solidify their DG understanding and ultimately gain their support.

Is there a company or industry you would particularly like to help implement Data Governance for and why?

Oil and Gas/Upstream Energy – first this is where I have spent quite a bit of my career, but secondly, many of the newer technologies (Big Data, Cloud, IoT, etc.) are just now being implemented, and there is an explosion of data and the need to better govern its use.  Executives are realizing the need to treat data as an asset and with that you must have the right governance and coordination between people, process, and technology to keep your employees safe, your solutions effective, and your operations competitive.

What single piece of advice would you give someone just starting out in Data Governance?

One of the best bits of advice that I received from my very first project manager about a year after I graduated college was "I want you to be the 'Go-To team member'."

Be the one who is willing to do the task nobody else will. Dig in to the details of the data processes and the business users that use them on a daily basis.   

Be the one who may not know the answer but will go do the work to find out. Get into the weeds of the data issues or challenges so that you understand the root cause. This will help you identify the data governance approach to employ to account for those issues.

Be the one that others will look to for a good attitude, a positive outlook, and assurance that the job will get done and done well. You will be amazed how well people will respond to you if you are positive to them…and many times when dealing with data issues, you need to stay positive!

Finally, I wondered if you could share a memorable data governance experience (either humorous or challenging)?

I was at a recent data visualization conference and many of the sessions were focused on technology improvements like better leveraging IoT technologies in daily operations, NoSQL solutions for data aggregation and consolidation, and new visual modeling techniques using R-based tools. But the most heavily attended session was about a company’s data governance journey and how they changed their culture and their methodologies to focus on data as the cornerstone of their operations. The questions that I heard after that session warmed my heart. So many times, it takes a large undertaking just to get people to understand why data governance is so important, but now the conversations were focused on the how and not the why or the what. Everyone wanted to know how they could go about implementing data governance in their organization. I think that is representative of a movement in the global corporate community…this increased focus on data. I think this is just the beginning of a data governance revolution!

 

Data Governance Interview - Suzanne Coumbaros


Suzanne is a fellow DAMA UK committee member and a data management professional with many years of experience in data management including governance, architecture, data warehousing, business intelligence, data quality, data development and data strategy. Originally a computer programmer, statistician and mathematician from Cumbria, UK, she has worked for many government-led organisations and well-known public and privately owned companies across both the UK and Africa. Her background comes from having created data management teams in different organisations and countries.

How long have you been working in Data Governance?

I have worked in data management roles for the last 20 years and specifically in Governance lead roles for nearly 10 years.

Some people view Data Governance as an unusual career choice, would you mind sharing how you got into this area of work?

Having spent many years solving data issues for corporate and law enforcement agencies, I designed a number of data warehouses and began to investigate best practice data management. I came across DAMA and soon their DMBOK became all I read. It resonated with all the areas of data management I had already worked in, and I was intrigued by the central function of Data Governance which binds these all together. As I learnt more about this area I worked hard to put what I had learnt into practice with my first governance role. It was a great learning curve as I quickly understood that changes in an organisation’s people, processes and technology, not to mention the regulatory changes, mean this role never stops. I then secured a Data Officer role in financial services where I was able to quickly implement data governance and begin working with the development team to use this to enable the development of a single client view.

What characteristics do you have that make you successful at Data Governance and why?

Patience is key. Knowing that data governance may not be the organisation’s number one priority means that you may have to wait your turn to be heard. Following on from that, having empathy for the management and executives in the organisation will help you appreciate their responsibilities and other commitments. This will ensure governance is not forced, but is ready when they are. Finally having sales skills will be essential. Governance is not for everyone and will not sell itself; regulation has helped make it a topic for organisations but it should be driven as an enabler and not just a tick in a box for the regulators.

Are there any particular books or resources that you would recommend as useful support for those starting out in Data Governance?

Without a doubt the best and most important book you should have by your desk is the DMBOK from DAMA. They recently published the second edition and it is my “go to” resource for all data management queries.

What is the biggest challenge you have ever faced in a Data Governance implementation?

I joined one organisation because I was told of the huge investment they were making in data management, only to find out soon afterwards that the executive sponsor was leaving and the governance function was now only a ‘nice to have’ rather than the key focus for the organisation.

Is there a company or industry you would particularly like to help implement Data Governance for and why?

I really enjoy helping others and so working for an organisation that does this is important to me. I have had the privilege of working in a variety of organisations dealing with different types of data. By far the most rewarding was helping law enforcement and government agencies manage their data to ultimately help solve crimes. I also enjoy the enormous challenges of financial services and the huge importance of governance that the regulations place on these organisations.

What single piece of advice would you give someone just starting out in Data Governance?

I would suggest getting a mentor. Someone who has experience in governance and is able to help you.

Finally, I wondered if you could share a memorable data governance experience (either humorous or challenging)?

Discovering that South Africa had 11 valid address types and 11 official languages… a data governance dream or nightmare :-)

Data Governance Interview - Jan Lenders


Jan started his professional career as a bookseller and made the obvious switch to IT in 1986, working as an application programmer for a data centric application for the financial industry. From 1990 onwards, he focused on database design and later moved into data integration. In 2007, he decided to switch to a non-profit organisation and specialise in data integration. Since then, he has been working for a university of applied sciences in Arnhem, Netherlands.

In 2015, he obtained an MSc degree in IT at the University of Liverpool. For his dissertation project he researched the mutual effects of choices in Data Integration and Propagation areas on Master Data Management (MDM) and Data Governance (DG).

How long have you been working in Data Governance?

Although I do not have Data Governance as part of my official job title, I have helped to initiate DG initiatives and projects in our organisation. I learned from you that DG should not be an IT-led initiative. However, since our university did not have any official DG policies, IT, as a provider of master data interfaces, was confronted with virtually all data issues. The only way to provide data with a reasonable level of quality was for IT to lead the initiative. As I have now learned from you, this is a textbook example of Mistake #1 of the 9 biggest DG mistakes, but we had to make a start.

Some people view Data Governance as an unusual career choice, would you mind sharing how you got into this area of work?

Throughout my career in IT I have been involved in managing data. While data has gained a dominant position in the IT landscape and data volumes are growing rapidly, I do not think DG has grown at the same pace, and its importance is being underestimated. So as an IT guy who tries to deliver good-quality data, my involvement in DG arose from the lack of it in the organisational units where it should actually sit.

Are there any particular books or resources that you would recommend as useful support for those starting out in Data Governance?

Any IT worker involved in data management should visit IRM UK’s Data Governance conference and MDM summit at least once. For me, the books written by Alex Berson and Larry Dubov, as well as David Loshin’s books have been very helpful to understand DG. But I honestly learned a lot from your report "The 9 biggest mistakes companies make when implementing data governance". We seemed to make most of these mistakes in our organisation, but we currently are working on solving the worst of them.

What is the biggest challenge you have ever faced in a Data Governance implementation?

That would be the business units underestimating the importance of DG and overestimating the quality of their data. For instance, we did not have one dedicated source system for organisational units. As a result, codes, abbreviations and names for units were kept and maintained in virtually every system without consensus. This had never been a real problem until data was exchanged between these systems.
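The kind of mismatch described here can be made visible before data is exchanged. As a minimal sketch only (the system names, unit names, codes and the `find_code_conflicts` helper are all hypothetical and not taken from the interview), the following Python compares the code each system holds for the same organisational unit and reports where they disagree:

```python
from collections import defaultdict

# Hypothetical extracts: each system keeps its own code for the same unit.
SYSTEM_UNIT_CODES = {
    "hr": {"Faculty of Engineering": "ENG", "Library Services": "LIB"},
    "finance": {"Faculty of Engineering": "FENG", "Library Services": "LIB"},
    "student_records": {"Faculty of Engineering": "ENG"},
}


def find_code_conflicts(systems):
    """Return units whose code differs between systems, with the code per system."""
    codes_by_unit = defaultdict(dict)
    for system, units in systems.items():
        for unit, code in units.items():
            codes_by_unit[unit][system] = code
    # Keep only units where more than one distinct code is in use.
    return {
        unit: codes
        for unit, codes in codes_by_unit.items()
        if len(set(codes.values())) > 1
    }


conflicts = find_code_conflicts(SYSTEM_UNIT_CODES)
for unit, codes in conflicts.items():
    print(unit, codes)
```

A report like this does not decide which code is right; that still needs agreement between the business units that own the data.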

What single piece of advice would you give someone just starting out in Data Governance?

Spend half a day in each business unit with the people who are actually browsing, searching, entering and changing data to understand what is happening to their data and in particular why.

Finally, I wondered if you could share a memorable data governance experience (either humorous or challenging)?

Data quality issues are likely to become painfully visible when data is exchanged between systems. To illustrate the understanding and importance of DG in general and data quality in particular I often make a comparison with traffic and traffic rules:

In France everyone drives on the right side of the road. Because there are agreements that are maintained, there are relatively few problems in traffic.

In England everyone drives on the left side of the road and again there are few problems.

The real problems only manifest themselves when cars travel from the mainland to England or vice versa. This can only work if good agreements are defined and enforced.

 

 

Data Governance Interview - Stuart Squires


For this interview I was able to ask Stuart Squires some questions. Stuart is the EMEA Managing Director of Comma Group, a Data Management Consultancy specialising in the business, technical & change aspects of PIM, MDM and Data Governance initiatives. When asked to describe himself he said "He doesn’t get bored when talking about data: yes, he is a bit weird." I obviously don't think that is weird. In fact, I think it's great that all the people at Comma Group are as passionate about data as I am, as we are looking to work together this year.

How long have you been working in Data Governance?

It has been a gradual creeping interest from back in the day before it was called data governance. My career went on a data tangent in 2002 and back then all of your data issues could be solved by implementing ERP – or not as the case may be. As the industry matured from this “technology will solve everything” view, to a “we’ve spent loads of money and still don’t have a single view of X”, to an “ah, it is about the people isn’t it?” realization, I matured with it. 

Some people view Data Governance as an unusual career choice, would you mind sharing how you got into this area of work?

A series of (un)fortunate events. When you spend many evenings and weekends getting legacy data into a fit state for SAP you either go insane or develop a passion for making sure that your work is not going to go to waste once the data is loaded. The seemingly endless hours with a stapler on the enter key waiting for data to load were spent contemplating strategies for keeping the data in check, especially in the face of continual business change (my first 7 years were spent working at the same gas turbine plant, but for 5 different companies). I started sharing my thoughts (a little too passionately maybe) on a subject that no one was really thinking about or interested in. To find interested people I moved into consultancy, then advisory, and now to leading Comma Group’s EMEA business.

What characteristics do you have that make you successful at Data Governance and why?

I am naturally bad at following process. This means that I have to consciously force myself to think about controls, measures and the impacts of not following process. These controls are different for every situation, be they work life, home life, playing sport, or anything: there is not one formula of governance for all situations. This is the same for different businesses, departments and people. I really enjoy guiding people on the journey of discovering the appropriate levels of governance and understanding the changes to the day-to-day life of individuals and groups that are required to realise the benefits.

Are there any particular books or resources that you would recommend as useful support for those starting out in Data Governance?

I’m a big fan of Bob Seiner. While his book “Non-Invasive Data Governance” is not a detailed playbook of how to solve all the challenges you will face, it offers a set of simple principles that are undeniably critical to success. For detail, and for building muscle/developing a bad back, I often carry DAMA DMBOK 2 around with me.

What is the biggest challenge you have ever faced in a Data Governance implementation?

The challenge we face time and time again is one of convincing people of the value of DG. It is very important to start proving value as soon as possible; a good place to start is to map business outcomes to data quality improvements, start measuring and producing pretty graphs. People like pictures. If you see improvements then you can start attributing them to the fact that you are now actively policing the area and behaviours are improving accordingly; if there is no improvement then at least you have facts that form the basis of a case for change: “Fix this and you will move the needle on your business performance”.
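As one illustration of what "start measuring" might look like in practice, here is a minimal Python sketch that computes field completeness, a simple data quality measure. The records, field names and the `completeness` helper are hypothetical and chosen only for illustration; a real initiative would pick the measures that map to its own business outcomes.

```python
# Hypothetical customer records; None or "" marks a missing value.
RECORDS = [
    {"name": "Acme Ltd", "postcode": "AB1 2CD", "email": "info@acme.example"},
    {"name": "Widget Co", "postcode": "", "email": None},
    {"name": "Gadget plc", "postcode": "EF3 4GH", "email": ""},
    {"name": "", "postcode": "IJ5 6KL", "email": "hello@gadget.example"},
]

REQUIRED_FIELDS = ("name", "postcode", "email")


def completeness(records, fields):
    """Return, per field, the share of records holding a non-empty value."""
    total = len(records)
    if total == 0:
        return {field: 0.0 for field in fields}
    return {
        field: sum(1 for record in records if record.get(field)) / total
        for field in fields
    }


scores = completeness(RECORDS, REQUIRED_FIELDS)
for field, score in scores.items():
    print(f"{field}: {score:.0%}")
```

Running the same calculation at regular intervals and keeping the results gives the trend line for the graphs; whether a rise can be attributed to governance activity is still a judgement call.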

Is there a company or industry you would particularly like to help implement Data Governance for and why?

Nicola, you are asking me to reduce my potential client list!

I am interested in how data is collected and used in education; especially with the devolution of control to academy trusts, free schools and the like. Education is the lifeblood of society and teachers spend far too much time juggling spreadsheets rather than teaching. There is an irony that the organisations that need the most help are the ones who can afford it the least; putting together a compelling, affordable “Data Governance in a Box” for education would be a very worthwhile endeavour.  

What single piece of advice would you give someone just starting out in Data Governance?

Learn about the technologies, but consider them only an enabler. 

Finally, I wondered if you could share a memorable data governance experience (either humorous or challenging)?

Not humorous or challenging but surprisingly fulfilling. We are currently helping a client set up and embed a DG operation. We have been in attendance at the first 6 weeks of Data Governance Working Group meetings and the first Data Governance Council and Steering Group meetings. The DG Council meeting consisted of 2 hours of reviewing DG operation success criteria, agreeing ownership and approaches to DQ issues, signing off business term definitions, and making/escalating policy decisions. As we were leaving, someone said, “I really enjoyed that”, followed by a conversation that can be summarized as, “It is great to know that [steering group] will get sight of our views and be able to make decisions that will give us the authority to make a difference”. People enjoying data governance; perhaps I’m not the only weirdo!

Data Glossary or Data Dictionary?


A lot of people get confused about what a Data Glossary is and how it is different from a Data Dictionary.  IT people are generally happy that they understand what a data dictionary is and in my experience some business people also understand what one is (and on the rare occasion may even want to refer to one). But there is often a lack of clarity over what a data glossary is.

The increasing focus on data governance and slowly maturing levels of data governance mean that the term data glossary is being increasingly heard.  But there is a great deal of confusion as the terms data dictionary and data glossary are often used interchangeably.  To add to the confusion, a data glossary is often called a business glossary, but for clarity, I will use only the term data glossary from this point onward.

The term data dictionary has been in mainstream data management speak for much longer than data glossary, so let’s start by looking at that first. According to the DAMA Dictionary of Data Management, a data dictionary is:

“A place where business and/or technical terms and definitions are stored. Typically, data dictionaries are designed to store a limited set of meta-data concentrating on the names and definitions relating to the physical data and related objects.“

Experienced Data Analysts and Project Managers understand that building a data dictionary during a project should be a key part of your requirements development efforts. Indeed my first experience with a Data Dictionary was when I was a Project Manager for a data warehouse implementation, long before I had even heard of data governance!

You should definitely take the time to identify and define all of the data that is being used as part of your project, and a data dictionary should be created for every system that is built or implemented in your organization.  Sadly that does not always happen, and even when data dictionaries are created they are often forgotten. I have often come across instances where one was created as a project deliverable but not maintained, or even worse, lost or mislaid over time.

Data dictionaries should include a business definition of all terms and this should mean that business stakeholders have been involved in the creation of them.  However, because the people who are most likely to refer to a data dictionary are the IT and MI teams, they are often created without business input.  This is a pity because, for the reasons I stated above, developing these as part of a requirements gathering process is an excellent way to clarify the business requirements and ensure that your new system meets them.

The first difference between the data dictionary and the data glossary is that whilst the data dictionary is seen very much as an IT-owned document, data glossaries should be created and maintained by the business.

Data glossaries are the place to document business terms along with their definitions. At this stage, I’m sure you’re wondering how that makes it different from a data dictionary and I’m going to reinforce that thought by saying that although I said above that they should be created and maintained by the business, a good way to start a data glossary is to use an existing data dictionary.  If you are lucky enough to have an existing (and up-to-date) data dictionary for your data warehouse, that would be an excellent place to start.

What makes a data glossary different is that although it can and often does contain details of the systems that data is held on (including tables and columns), the main focus of the content in the data glossary is information designed to improve business understanding and use of data.  To that end, while you may have multiple data dictionaries, you should have only one data glossary for your organization.

A data glossary is a key deliverable in a data governance initiative, and because of that, alongside the terms and definitions, you should also be capturing the data owner and data steward for each term.  As your organization becomes more mature you may also wish to consider including things like the data quality rules (i.e. what makes it good enough to use).  I have even come across some organizations that include a field in their data glossary that flags if there are any data quality issues that any potential users of that data would need to be wary of.
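To make the fields described above concrete, here is a minimal sketch of what a single data glossary entry might look like if you held it in code rather than a spreadsheet. The class name, field names and example terms are all my own assumptions for illustration; they are not taken from any particular tool or standard.

```python
from dataclasses import dataclass, field


@dataclass
class GlossaryEntry:
    """One business term in a hypothetical data glossary."""
    term: str
    definition: str
    data_owner: str
    data_steward: str
    # Optional fields a more mature organization might add later.
    quality_rules: list[str] = field(default_factory=list)
    known_quality_issues: bool = False
    # Where the data is held, e.g. "warehouse.customer.status" (illustrative).
    system_locations: list[str] = field(default_factory=list)


def entries_to_warn_about(glossary):
    """Return the terms whose known-issue flag is set."""
    return [entry.term for entry in glossary if entry.known_quality_issues]


# Illustrative entries only; the terms and roles are invented.
glossary = [
    GlossaryEntry(
        term="Active Customer",
        definition="A customer with at least one order in the last 12 months.",
        data_owner="Head of Sales",
        data_steward="Sales Operations Analyst",
        quality_rules=["Last order date must be populated"],
        known_quality_issues=True,
        system_locations=["warehouse.customer.status"],
    ),
    GlossaryEntry(
        term="Supplier",
        definition="An organization we purchase goods or services from.",
        data_owner="Head of Procurement",
        data_steward="Procurement Analyst",
    ),
]

print(entries_to_warn_about(glossary))
```

Starting with just the term, definition, owner and steward and leaving the other fields empty mirrors the advice above: add the data quality rules and the issue flag only as your organization matures.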

Some people will tell you a data glossary should be used to create a ‘common’ set of definitions.  Now I agree that would be sensible in a utopian data world, however, the vast majority of organizations are not yet mature enough in data governance terms to dive straight into this.  Instead, I encourage my clients to use the development of a data glossary to identify where there are a number of differing definitions for the same term and conversely where a number of different terms have the same definition.  Only then are you in a position to analyze these occurrences and agree to move to standard definitions.  This may, of course, involve a high degree of negotiation!
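If you have collected terms and definitions from several departments, a first pass at spotting these two situations can be sketched as follows. This is only an assumption-laden illustration: the function names and sample records are invented, and it matches definitions on their exact wording (ignoring case and spacing), so it will only ever give you candidates for the human analysis and negotiation described above.

```python
from collections import defaultdict


def normalise(text):
    """Lower-case and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def find_conflicts(records):
    """records: iterable of (department, term, definition) tuples.

    Returns two dicts:
      - terms that have more than one distinct definition
      - definitions that are shared by more than one distinct term
    """
    definitions_by_term = defaultdict(set)
    terms_by_definition = defaultdict(set)
    for _department, term, definition in records:
        t, d = normalise(term), normalise(definition)
        definitions_by_term[t].add(d)
        terms_by_definition[d].add(t)

    same_term_many_definitions = {
        t: sorted(ds) for t, ds in definitions_by_term.items() if len(ds) > 1
    }
    same_definition_many_terms = {
        d: sorted(ts) for d, ts in terms_by_definition.items() if len(ts) > 1
    }
    return same_term_many_definitions, same_definition_many_terms


# Invented sample data for illustration.
records = [
    ("Sales", "Customer", "Anyone who has placed an order"),
    ("Finance", "Customer", "Anyone who has paid an invoice"),
    ("Marketing", "Client", "Anyone who has placed an order"),
]

term_conflicts, definition_conflicts = find_conflicts(records)
print(term_conflicts)
print(definition_conflicts)
```

In this sample, "Customer" has two competing definitions and "Client" shares a definition with "Customer", which are exactly the two kinds of occurrence you would then take to the business for discussion.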

Be aware that forcing everyone to move to standard definitions is not always the right answer. If your investigations conclude that although they are named the same, there are valid business requirements for the different definitions, I would recommend that a sensible alternative would be to re-name terms to make it clear that they are not the same thing.  This prevents or solves one of the biggest causes of data quality issues that I have come across which is a lack of understanding of what the data means.  This can cause issues in two ways:

  • The data producers do not understand what a field should be used for and enter something similar, but slightly different.
  • Data consumers can often believe that data in one field represents something that it does not.

To sum up, data dictionaries are more technical in nature and tend to be system specific.   A data dictionary defines data elements, their meanings, and their allowable values.  A data glossary is enterprise-wide and should be created to improve business understanding of the data they produce and use.  A data dictionary should be a project deliverable for all system-related projects and a data glossary is a key part of a successful Data Governance framework.

Finally, if you are currently developing or are about to start to build a data glossary, the tips in this blog  will help you devise a successful approach.

 

Originally published on www.TDAN.com

When is a Data Quality issue not a Data Quality Issue? Part II

When helping my clients implement a Data Quality Issue Management Process I always come across resistance to implementing it. Even when it is up and running some stakeholders come up with interesting reasons why they don't need to use it. So I decided to research this in the context of others' similar experiences and I asked for feedback from some data quality related LinkedIn groups on “when a data quality issue is not a data quality issue”.

I asked my assistant Liselle to help collate the responses as I have been very busy lately and her analysis and summary of the feedback was so good that I asked her to write the blog herself.  The reason I trusted her to do this, is that she is much more than my assistant, being a very experienced data professional herself (you can find more about Liselle on the Partners and Associates page).  So over to Liselle…

Thank you Nicola!

On analysing the responses, some of the key messages were:

  1. Providing a definition for a data quality issue
  2. Defining business boundaries
  3. Recognising the source of the data
  4. Using data quality tools

Let’s delve deeper into these topics.

A data quality issue can be defined as a matter that causes the high quality of the data to be in dispute.

Data quality is concerned with the accuracy and completeness of the data among other key factors, and it needs to be fit for its intended uses.  So a data quality issue would be anything that compromises a business’ ability to effectively operate, plan or make decisions.

In providing this definition, it should start to become clear that the definition of a data quality issue does not vary.  It should be shared across the whole organisation to support identifying issues, but it would not be specific to your organisation.  You may have to prioritise your issues, but they must all be identified as such.

What do you consider your boundaries to be?  Does it matter if you are obtaining data from an external source?  If you are using data within your systems, any quality issues with it are still data quality issues that need to be recorded and addressed, wherever the data was generated.  The data owner will need to be notified and the root cause of the issue determined and rectified.  If you want your organisation to make good decisions, operate effectively and be able to plan for the future, then the source of the data should not matter.  If the data comes from outside your organisation and there is an issue, you may not be able to resolve it, but at the very least the consumers of that data need to be aware of its shortcomings so that they can allow for them.

In defining the business boundaries, identifying your data owners, and therefore implementing Data Governance, supports how your data quality issues are processed and solved.  Implementing a Data Governance framework is vital to the long term sustainable improvement of data quality as it provides the mechanism to identify the root cause of issues and changes the culture to one of more proactive management of data quality.

In recognising the source of the data, it may be determined that although you have a data owner who is internally responsible for the data, accountability for its quality lies outside of the organisation.  This still needs to be addressed, and having relevant Data Governance Policies would help in determining what your next steps should be.  The issue should still be recorded and addressed.

Although data quality tools are useful, if the right infrastructure is not in place then the tools will not be able to identify data quality issues effectively or support finding solutions.  If you are not in a position to use data quality tools effectively, having Data Owners and Data Governance in place means that the right decisions can be made about how best to handle a data quality issue, and a more cohesive approach can be taken to widespread data cleansing, for example by using default values instead.

So there is never a situation where a Data Quality Issue is not a Data Quality Issue.  It may not have been identified, but it can still be impacting the quality of your data.  Having a Data Governance Framework is one of the first steps in supporting the resolution of issues.  You should also have a process to log, investigate and action data quality issues.  In situations where the Data Owner lies outside of the organisation, you need to be sure that consumers are aware of the shortcomings so that they can allow for them.

And remember that when you convince your stakeholders to tell you their data quality issues, you will need to log them in a central place.  You can click here to download a free Data Quality Issue log template.
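For anyone who prefers to see the idea in code, the log, investigate and action steps could be sketched roughly like this. It is only an illustration under my own assumptions: the class names, fields and statuses are invented and are not taken from the downloadable template.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class Status(Enum):
    LOGGED = "logged"
    INVESTIGATING = "investigating"
    ACTIONED = "actioned"


@dataclass
class DataQualityIssue:
    """One row in a hypothetical central data quality issue log."""
    issue_id: int
    description: str
    data_owner: str
    raised_on: date
    external_source: bool = False  # True if the data comes from outside the organisation
    root_cause: Optional[str] = None
    status: Status = Status.LOGGED


class IssueLog:
    """A single central place to record issues, as recommended above."""

    def __init__(self):
        self._issues = {}

    def log(self, description, data_owner, raised_on, external_source=False):
        issue_id = len(self._issues) + 1
        issue = DataQualityIssue(
            issue_id, description, data_owner, raised_on, external_source
        )
        self._issues[issue_id] = issue
        return issue

    def investigate(self, issue_id, root_cause):
        issue = self._issues[issue_id]
        issue.root_cause = root_cause
        issue.status = Status.INVESTIGATING

    def action(self, issue_id):
        issue = self._issues[issue_id]
        if issue.root_cause is None:
            raise ValueError("Investigate the issue before marking it as actioned")
        issue.status = Status.ACTIONED

    def open_issues(self):
        """Issues consumers of the data still need to be aware of."""
        return [i for i in self._issues.values() if i.status is not Status.ACTIONED]
```

Issues from external sources are logged in exactly the same way as internal ones; the flag simply records that you may not be able to fix the root cause yourself, and the list of open issues is what you would share with consumers so that they can allow for the shortcomings.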