Data Glossary or Data Dictionary?


A lot of people get confused about what a Data Glossary is and how it is different from a Data Dictionary.  IT people are generally happy that they understand what a data dictionary is and in my experience some business people also understand what one is (and on the rare occasion may even want to refer to one). But there is often a lack of clarity over what a data glossary is.

The increasing focus on data governance and slowly maturing levels of data governance mean that the term data glossary is being increasingly heard.  But there is a great deal of confusion as the terms data dictionary and data glossary are often used interchangeably.  To add to the confusion, a data glossary is often called a business glossary, but for clarity, I will use only the term data glossary from this point onward.

The term data dictionary has been in mainstream data management speak for much longer than data glossary, so let’s start by looking at that first. According to the DAMA Dictionary of Data Management, a data dictionary is:

“A place where business and/or technical terms and definitions are stored. Typically, data dictionaries are designed to store a limited set of meta-data concentrating on the names and definitions relating to the physical data and related objects.”

Experienced Data Analysts and Project Managers understand that building a data dictionary during a project should be a key part of your requirements development efforts. Indeed, my first experience with a Data Dictionary was when I was a Project Manager for a data warehouse implementation, long before I had even heard of data governance!

While it doesn’t always happen, you should definitely take the time to identify and define all of the data that is being used as part of your project and a data dictionary should be created for every system that is built or implemented in your organization.  Sadly that is not always the case and even when created they are often forgotten. I have often come across instances where it was created as a project deliverable but not maintained, or even worse, lost/mislaid over time.

Data dictionaries should include a business definition of all terms and this should mean that business stakeholders have been involved in the creation of them.  However, because the people who are most likely to refer to a data dictionary are the IT and MI Team, they are often created without business input.  This is a pity as for the reasons I stated above, developing these as part of a requirements gathering process is an excellent way to clarify the business requirements and ensure that your new system meets them.

The first difference between the data dictionary and the data glossary is that whilst the data dictionary is seen very much as an IT-owned document, data glossaries should be created and maintained by the business.

Data glossaries are the place to document business terms along with their definitions. At this stage, I’m sure you’re wondering how that makes it different from a data dictionary and I’m going to reinforce that thought by saying that although I said above that they should be created and maintained by the business, a good way to start a data glossary is to use an existing data dictionary.  If you are lucky enough to have an existing (and up-to-date) data dictionary for your data warehouse, that would be an excellent place to start.

What makes a data glossary different is that although it can and often does contain details of the systems that data is held on (including tables and columns), the main focus of the content in the data glossary is information designed to improve business understanding and use of data.  To that end, while you may have multiple data dictionaries, you should have only one data glossary for your organization.

A data glossary is a key deliverable in a data governance initiative, and because of that, alongside the terms and definitions, you should also be capturing the data owner and data steward for each term.  As your organization becomes more mature you may also wish to consider including things like the data quality rules (i.e. what makes it good enough to use).  I have even come across some organizations that include a field in their data glossary that flags if there are any data quality issues that any potential users of that data would need to be wary of.
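To make that list of fields concrete, here is a minimal sketch of what a single glossary entry might hold. The class name, field names and sample values are all assumptions made for illustration; they are not taken from any particular glossary tool or template.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GlossaryEntry:
    """One business term in the organisation's data glossary (hypothetical layout)."""
    term: str
    definition: str
    data_owner: str
    data_steward: str
    # Optional extras a more mature organisation might capture.
    quality_rules: List[str] = field(default_factory=list)
    has_known_quality_issue: bool = False

    def usage_note(self) -> str:
        """Short warning or all-clear for anyone thinking of using this data."""
        if self.has_known_quality_issue:
            return f"{self.term}: known data quality issue, contact {self.data_steward} before use."
        return f"{self.term}: no known data quality issues."


# Illustrative values only.
entry = GlossaryEntry(
    term="Customer",
    definition="A person or organisation that has bought at least one product.",
    data_owner="Head of Sales",
    data_steward="Sales Operations",
    quality_rules=["Must have a valid postal address"],
    has_known_quality_issue=True,
)
print(entry.usage_note())
```

Keeping the owner and steward on the entry itself means anyone reading a definition can also see who to ask about it.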

Some people will tell you a data glossary should be used to create a ‘common’ set of definitions.  Now I agree that would be sensible in a utopian data world, however, the vast majority of organizations are not yet mature enough in data governance terms to dive straight into this.  Instead, I encourage my clients to use the development of a data glossary to identify where there are a number of differing definitions for the same term and conversely where a number of different terms have the same definition.  Only then are you in a position to analyze these occurrences and agree to move to standard definitions.  This may, of course, involve a high degree of negotiation!
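If your collected terms are already in a spreadsheet or list, a first pass at spotting both situations can be sketched in a few lines. This is only an illustration under simple assumptions: the function name, the (term, definition, source) layout and the sample rows are hypothetical, and it only catches definitions that match exactly once case and spacing are tidied up. Definitions that mean the same thing but are worded differently still need a person to review them.

```python
from collections import defaultdict


def find_conflicts(rows):
    """rows: iterable of (term, definition, source) tuples collected from the business.

    Returns two dicts:
      - terms that have more than one definition
      - definitions that are shared by more than one term
    """
    definitions_by_term = defaultdict(set)
    terms_by_definition = defaultdict(set)
    for term, definition, _source in rows:
        tidy_term = term.strip().lower()
        tidy_definition = " ".join(definition.lower().split())
        definitions_by_term[tidy_term].add(tidy_definition)
        terms_by_definition[tidy_definition].add(tidy_term)

    same_term_different_definitions = {
        t: sorted(ds) for t, ds in definitions_by_term.items() if len(ds) > 1
    }
    different_terms_same_definition = {
        d: sorted(ts) for d, ts in terms_by_definition.items() if len(ts) > 1
    }
    return same_term_different_definitions, different_terms_same_definition


# Illustrative rows only.
rows = [
    ("Customer", "Anyone who has placed an order", "Sales"),
    ("Customer", "Anyone with an active account", "Finance"),
    ("Client", "Anyone with an active  account", "Marketing"),
    ("Policy", "A contract of insurance", "Underwriting"),
]
multi_defined, shared_definitions = find_conflicts(rows)
print(multi_defined)
print(shared_definitions)
```

The output is a starting list for the negotiation, not a decision: it tells you where to look, and the data owners still have to agree what to do about each case.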

Be aware that forcing everyone to move to standard definitions is not always the right answer. If your investigations conclude that although they are named the same, there are valid business requirements for the different definitions, I would recommend that a sensible alternative would be to re-name terms to make it clear that they are not the same thing.  This prevents or solves one of the biggest causes of data quality issues that I have come across which is a lack of understanding of what the data means.  This can cause issues in two ways:

  • The data producers do not understand what a field should be used for and enter something similar, but slightly different.
  • Data consumers can often believe that data in one field represents something that it does not.

To sum up, data dictionaries are more technical in nature and tend to be system specific.   A data dictionary defines data elements, their meanings, and their allowable values.  A data glossary is enterprise-wide and should be created to improve business understanding of the data they produce and use.  A data dictionary should be a project deliverable for all system-related projects and a data glossary is a key part of a successful Data Governance framework.

Finally, if you are currently developing or are about to start to build a data glossary, the tips in this blog  will help you devise a successful approach.

 

Originally published on www.TDAN.com

When is a Data Quality issue not a Data Quality Issue? Part II

When helping my clients implement a Data Quality Issue Management Process I always come across resistance to implementing the process. Even when it is up and running some stakeholders come up with interesting reasons why they don't need to use it. So I decided to research this in the context of others' similar experiences, and I asked for feedback from some data quality related LinkedIn groups on “when a data quality issue is not a data quality issue”.

I asked my assistant Liselle to help collate the responses as I have been very busy lately and her analysis and summary of the feedback was so good that I asked her to write the blog herself.  The reason I trusted her to do this, is that she is much more than my assistant, being a very experienced data professional herself (you can find more about Liselle on the Partners and Associates page).  So over to Liselle…

Thank you Nicola!

On analysing the responses, some of the key messages were:

  1. Providing a definition for a data quality issue
  2. Defining business boundaries
  3. Recognising the source of the data
  4. Using data quality tools

Let’s delve deeper into these topics.

A data quality issue can be defined as a matter that causes the high quality of the data to be in dispute.

Data quality is concerned with the accuracy and completeness of the data among other key factors, and it needs to be fit for its intended uses.  So a data quality issue would be anything that compromises a business’ ability to effectively operate, plan or make decisions.

In providing this definition, it should start to become clear that the definition of a data quality issue does not vary.  It should be shared with all the organisation to support identifying issues, but it would not be specific to your organisation.  You may have to prioritise your issues, but they must all be identified as such.

What do you consider your boundaries to be?  Does it matter if you are obtaining data from an external source?  If you are using data within your systems, the generation of the data and the potential data quality issues are still data quality issues that need to be recorded and addressed.  The data owner will need to be notified and the root cause of the issue determined and rectified.  If you want your organisation to make good decisions, operate effectively and be able to plan for the future, then the source of the data should not matter.  If you are using it and there is an issue, then even if the data comes from outside your organisation and you may not be able to resolve it, at the very least the consumers of that data need to be aware of its shortcomings so that they can allow for them.

When defining the business boundaries, identifying your data owners (and therefore implementing Data Governance) supports the way your data quality issues are processed and solved.  Implementing a Data Governance framework is vital to the long term sustainable improvement of data quality as it provides the mechanism to identify the root cause of issues and changes the culture to a more proactive management of data quality.

In recognising the source of the data, it may be determined that although you have a data owner who is internally responsible for the data, accountability for its quality lies outside of the organisation.  This still needs to be addressed, and having relevant Data Governance Policies would help in determining what your next steps should be.  It should still be recorded and addressed.

Although data quality tools are useful, if the right infrastructure is not in place, then the tools will not be able to effectively identify data quality issues to support finding solutions.  If you are not in a position to use data quality tools effectively, having Data Owners and Data Governance in place means that the right decisions can still be made about how best to handle a data quality issue, and a more cohesive approach can be taken to widespread data cleansing, for example by agreeing default values to be used instead.

So there is never a situation where a Data Quality Issue is not a Data Quality Issue.  It may not have been identified, but it can still be impacting the quality of your data.  Having a Data Governance Framework is one of the first steps in supporting the resolution of issues.  You should also have a process to log, investigate and action data quality issues.  In situations where the Data Owner lies outside of the organisation, you need to be sure that consumers are aware of the shortcomings so that they can allow for them.

And remember that when you convince your stakeholders to tell you their data quality issues, you will need to log them in a central place.  You can click here to download a free Data Quality Issue log template.

Data Governance Interview – Rutendo Urenje


Rutendo Urenje is an Experienced Data Governance Support Officer, with a demonstrated history of working in the international affairs industry. She is skilled in Nonprofit Organizations, Fundraising, Leadership, Project Management, and Customer Service. She is a strong International Law professional who graduated from University of Lund.

How long have you been working in Data Governance?

I have been working in Data Governance for a year now.

Some people view Data Governance as an unusual career choice, would you mind sharing how you got into this area of work?

I would say I didn’t choose Data Governance but Data Governance chose me and I am grateful it did. I first came across Data governance through research after I had been tasked with coming up with a proposal of what the organization I work for needed to improve its data management maturity level.  After some research, I realized that the private sector was buzzing with Data Governance but the Inter-Governmental Organizations had not quite gotten there yet. I then proposed for us to consider Data Governance as an option for managing our data better.

What characteristics do you have that make you successful at Data Governance and why?

I am blessed with the ability to come up with innovative approaches to problem solving. My new philosophy is, “Innovation is the linchpin of result”. I don’t back down until we have a solution that makes everyone feel as though they have won. I also make sure that in driving Data Governance forward, everyone knows and feels that they are an important part of the team and they are valued.

Are there any particular books or resources that you would recommend as useful support for those starting out in Data Governance?

When I started with Data Governance, I began at the Data Governance Institute website, I spent all my days there. Then I got into John Ladley’s Data Governance and Sunil Soares’ Big Data Governance.

What is the biggest challenge you have ever faced in your Data Governance implementation so far?

My biggest challenge has been getting everyone to make Data Governance an Institutional Priority. Also getting people to agree on what are the data governance imperatives we cannot do without.

What single piece of advice would you give someone just starting out in Data Governance?

I would say, learn your stakeholders, know what their interests are and help them understand why Data Governance is uniquely important to their business area.

Finally, I wondered if you could share a memorable data governance experience (either humorous or challenging)?

We were working on some Data Governance Guidelines and we had scheduled some meetings with different stakeholders. One meeting that stood out was with a stakeholder who had reviewed a section of our guidelines. When we started the meeting, I noticed that he was not particularly friendly or chatty and seemed irritated by the whole Data Governance Framework. During the meeting, he was hostile and unyielding. Our meeting ended with us not really agreeing with each other and simply just frustrated. However, while leaving the meeting room, I then noticed the draft guidelines in his hands and noticed comments written in red and realized that he had a problem with the diagrams we had included to represent my role as a support officer versus his role as a Data manager. Consequently, we went back to the drawing board and made sure that the Data Governance Support position was represented as a supportive structure to the already working system. This helped a lot in getting buy in from other stakeholders.

When is a Data Quality Issue not a Data Quality Issue?

When helping my clients implement a Data Quality Issue Management Process I always come across resistance to implementing the process. Even when it is up and running some stakeholders come up with interesting reasons why they don't need to use it.

The most obvious one of course being that all their data is fine so there is nothing to report. Now I am an optimist in most things but experience has taught me to be a little more realistic when discussing an organisation's data, and although in such circumstances I really want to believe them, experience tells me that that is unlikely to be the case.

An amusing situation I have come across is that the data was wrong but they didn't know exactly what was wrong with it, so they wouldn't report it as an issue! Others include things like: "but it's a process that needs to be fixed not the data, so I won't report it as a data quality issue" and "it's not that there's a problem with it, it's just not there"!  There have also been numerous occasions when I have been told "but we have a work around for that, so don't need to report it".

Now dealing with stakeholders and the obstacles they throw up when you are trying to implement a data quality issue resolution process is exactly the kind of thing I like to write blogs on.  I'd really appreciate your input in sharing similar such comments that you have experienced, so that I can identify the common themes and then share some tips and advice on how to deal with the most common push backs!

In the meantime you can read my free report which reveals why companies struggle to successfully implement data governance.  

Discover how to quickly get your data governance initiative on track by downloading this free report

7 Tips for Running a Successful Data Governance Forum


At Data Owner Forums, Data Governance Committees or whatever you call yours, you must make everything appear effortless and elegant giving the impression that all is progressing swimmingly and easily.  But beware, it may well be the former but never underestimate the amount of effort required in order to create the impression of the latter.

The elegance of a swan comes to mind as the best analogy for the situation, especially early on in your data governance initiative when a great deal of frantic paddling under the water is required (but to be honest a fair amount of effort is required on an ongoing basis).

As the Data Governance Committee is often set up fairly early on in a Data Governance initiative and may well be involved in steering the implementation of your data governance framework, it is vital that it functions successfully, so I thought I would share with you my top tips for running a successful forum:

Timing is Tantamount

Do not set it up until it has a function to serve. If it is the only part of your DG Framework which you have implemented, it is doomed to fail as it will have nothing to do. Everyone these days spends too much time in meetings, no one will thank you (and to be honest they probably won't agree to attend) if it doesn't yet have anything to do - it may be that you want it to steer the DG initiative but make that clear and include in its terms of reference how and when it will evolve into a business as usual body.

Casting the Chair

The chair of your committee needs to be a senior supporter of your data governance initiative, ideally your senior sponsor, to give it credibility and influence.  Do not think that as the Data Governance Manager you should chair this meeting - your role is to facilitate, provide expert advice and to ensure that all goes swimmingly, but never to chair the meeting.

Action Packed Agendas

Do not hold a meeting if there is not enough content on the agenda.  If there is nothing to discuss, debate or approve, what are you holding the meeting for? Straightforward updates on progress can always be circulated by email.  It is better to cancel or reschedule a meeting than to insist on holding it with a mediocre agenda - this will result in stakeholder disengagement.

Particular Preparation

Prepare, Prepare, Prepare! It sounds obvious, but is often overlooked.  The attendees of this meeting will be senior people who are pressed for time. Make sure that none of the time they have allocated to your meeting gets wasted: make sure everything is planned and organised and don't overlook the basic admin tasks such as booking meeting rooms, arranging conference calls and printing papers.  You do not want such minor issues to distract focus and eat into the valuable time you have available for important data decisions to be made.

Perfect Planning

A continuation of the previous tip I'll admit, but you really need to plan well for the meetings.  Agree the agenda well in advance of the meeting with the Chair and give plenty of notice to those required to submit papers or present at the meeting (and it doesn’t do any harm to remind them a week or so ahead of when you need their submissions.)

Send the meeting pack out in advance but not too soon or too late. Too soon and it will get lost, too late and your attendees won't have time to read and digest the documents before the meeting.  I have found that two or three working days in advance works well, but that will of course depend on the culture and working practices of your organisation.

No Surprises

Never ask for big decisions to be made without giving plenty of warning. Share significant proposals with key members (ideally all of them, if you have the time and calendars allow) so that they are on board ahead of the meeting.  This is particularly important before the inaugural meeting of your committee.  Simply inviting them and hoping that they will show up on the day just will not work. Before the first meeting you have to make the time and effort to meet with each of your members individually, to gain their support for the forum and to give them each an opportunity away from their peers to provide input into the terms of reference for the committee.

Stakeholder Support

You may well be presenting as Data Governance Manager and you need to ensure that you are well prepared for that, but do not forget your other stakeholders. Just because you are not presenting a topic doesn't mean that you can leave those agenda items to chance.  Support your presenters and stakeholders, both in advance of and at the meeting, so that their proposals and presentations achieve the desired outcome.

 

This list is certainly not exhaustive but by now you may well be exhausted from all that paddling!  But I hope that this has given you some insight into how to ensure that your Data Governance Council is successful and please do share your own tips for success by adding your comments to this blog.

 


Data Governance Interview - Robin Stielau

 

Robin Stielau is the IT Director with Brady Corporation, Milwaukee, Wisconsin. Robin has been with Brady for 33 years and is currently accountable for the enterprise global technologies that enable Sales, Marketing and Customer Services, global PIM (Product Information Management) capabilities and global master data governance.

How long have you been working in Data Governance?

I have been directly accountable for master data governance at Brady since 2015. 

Some people view Data Governance as an unusual career choice, would you mind sharing how you got into this area of work?

What motivated me to accept responsibility for master data governance was the obvious business need and my passion for improved data. I had a very clear vision of how to go about getting it.  Master data governance was the responsibility of a central team outside of IT. Changing circumstances at Brady presented me with the opportunity to get involved, and I requested that the data governance function be moved into IT under my direction, which it then was.

What characteristics do you have that make you successful at Data Governance and why?

I call it a lack of ego. That doesn’t mean that I don’t have one, but one of the most important characteristics of success in governing data is recognizing and staying true to the fact that the data governance team does not own the data or decisions surrounding it. The reason we exist is to facilitate and hold those who do accountable. We are servants to the methodology. Facilitation skills are critical. Bringing together owners of various data domains with data consumers on a regular cadence to address data requirements, what healthy data looks like and how to measure it, improvement roadmaps to correct or complete data and data processes and to eventually provide visibility to improved business outcomes based on more accurate, complete and timely data.  Passion, commitment and grit as governing data is never ending.

Are there any particular books or resources that you would recommend as useful support for those starting out in Data Governance?

  • Nicola Askham is a great resource. In a world of information overload, I found her communication on the topic to be most helpful in understanding what data governance was and wasn’t; her communication is clear and straightforward.
  • Talking with as many peers and experts as possible at conferences and other business meetings
  • Blogs, whitepapers.
  • Stibo was a great resource to me (our PIM is a Stibo product, STEP)

What is the biggest challenge you have ever faced in a Data Governance implementation?

My biggest challenge was to change the perception in Brady of what master data governance is and is not. Previous data governance management was more involved in making decisions regarding master data as well as changing data the governance team felt needed to change. This was damaging to the program as business data owners were not making decisions nor were they always in agreement with the changes. “Under new management” – we changed this.

We occasionally have challenges with overloaded data owners or wrong people identified as data owners. Data owners can’t be by name only or any person that has some capacity, they have to have some skin in the game and be committed. If not, we force a change.

What single piece of advice would you give someone just starting out in Data Governance?

It’s a long term journey, not a race. Having strong executive sponsorship is definitely helpful.

My most memorable experience with data governance was the “ah-ha” moment one of our key business leaders had when shown the negative customer experience on one of our websites resulting from bad product data.  In this case it was a size attribute for one of our key products: a set of signs.  The size had been entered on signs within the set in multiple (non-standard) ways: 7 x 10, 7 in x 10in, 7 in L x 10 in H, etc. Unfortunately all of the non-standard sizes appeared in the left hand navigation on the site. It presented a horrible experience for our customers and was the direct result of non-standard master data.

We now had a believer in the value of good master data. I use this example often when explaining the value of standard, accurate master data.
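To show how small the underlying fix can be once a standard is agreed, here is a minimal sketch that brings the three size formats quoted above into one form. The function name and the chosen standard ("7 in x 10 in") are assumptions made for illustration, not Brady's actual rule, and the sketch assumes every value is in inches with the first number as the length. Anything that does not contain exactly two numbers is returned as None so that a person can review it rather than the code guessing.

```python
import re

# Matches whole or decimal numbers such as "7" or "8.5".
_NUMBER = re.compile(r"\d+(?:\.\d+)?")


def normalise_size(raw):
    """Return a size in the assumed standard form, or None if it needs manual review."""
    numbers = _NUMBER.findall(raw)
    if len(numbers) != 2:
        return None
    return f"{numbers[0]} in x {numbers[1]} in"


raw_sizes = ["7 x 10", "7 in x 10in", "7 in L x 10 in H"]
standard_sizes = {normalise_size(s) for s in raw_sizes}
print(standard_sizes)
```

With one standard value instead of three variants, the site navigation would show a single size option for the whole set of signs.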


Data Governance Interview - Charles Joseph of Datazed

 

Charles Joseph has recently set up as a data governance consultant, Datazed, following fifteen years working in consulting, compliance and insurance.  Charles sees data governance as the foundation of being able to make good data-led decisions.

How did you start working in Data Governance?

It was a journey that looks logical in hindsight, but wasn’t at the time!  I started as a graduate consultant at Deloitte, got involved in a project to help the internal compliance team roll out a suite of new systems, and eventually focused on the data that supported those systems.  By this point, my three-week assignment had gone past nine years!

I was then headhunted for a role at Beazley, which is a specialist insurer in the Lloyd’s of London market, on their data workstream for Solvency II.  That was in 2011.  At the end of 2016, I decided that I wanted to work for myself and incorporated as Datazed Ltd.

What are your plans for Datazed?

Having spent my working career at just two organisations, I’m really excited to be getting out into the wider world and working with lots of different people to help them with their data challenges and opportunities.  This could range from a review of their existing approaches, all the way through to creating a data strategy and managing its execution. Anyone who could be interested in this should contact me via my website or LinkedIn.

I’m also partnering with some software vendors in the data space.  What’s really great is that these partnerships are not exclusive, as there is an appreciation that the solution has to be right for the client.

You can find out more about Datazed at www.datazed.co.uk

Are there any particular resources that you found useful support when you were starting out?

When researching for the Beazley role, I realised that I had done a lot of data quality work, but without the structure around it.  Ken O’Connor’s blog and Dylan Jones’ dataqualitypro.com were particularly helpful. 

What is the biggest Data Governance challenge you have faced so far?

Once we had delivered on the Solvency II work, we naively felt that we could strip off the gold plating of S2 and deliver some really useful data processes to the rest of the business.  The almost total lack of buy-in from those other departments made me realise that we had to work much harder on selling these ideas and benefits.

The other memorable challenge was the first time I presented to an audience on the subject of data governance.  It was nerve-wracking to expose my own opinions and ideas to people working in the same industry – but it’s an incredibly good way of making sure that you fully understand those ideas and that they make sense.

What advice would you give someone just starting out in Data Governance?

Pick your organisation very carefully.  There are many organisations with data teams, but few of these roles are similar when you actually see the job specification. 

Look for a manager or leader who, when you meet them, is passionate about what data can do.  Then be that passionate yourself.

I learned a lot about data governance from the material shared by experts in the field – I mentioned Ken and Dylan’s websites above, but there are also plenty of groups on LinkedIn in our area, some of which have some great insights and discussions.  This inspired me to go on and do a few Pulse articles myself.

What do you wish you had known or done differently when you were just starting out in Data Governance?

Periodically, there will be doors that can be pushed open to get good data practices in place.  This might be regulatory (e.g. Solvency II, GDPR) or project based (e.g. a Finance Transformation Programme).  These opportunities do not come along often, so exploit them as much as you can.


 

Data Governance Interview - Father Christmas

In my final blog post of the year, I have been extremely lucky to do a Data Governance interview with a very special person.  Known as Father Christmas to my English readers and Santa Claus to my international audience, this person is the personification of Christmas.   Although he may not be your “typical” Data Governance person, Data Governance does play a crucial part in the success of his endeavours. Read on to find out how…

hasloo / 123RF Stock Photo (http://www.123rf.com/profile_hasloo)

How long have you been working in Data Governance?

Well to be honest, it’s not what I consider my main role, but it has formed a vital part of my activities for as long as I can remember.

Some people view Data Governance as an unusual career choice, would you mind sharing how you got into this area of work?

To be fair I probably didn’t give it any thought for the first few hundred years, but as the population and the traditions of Christmas grew, it became obvious that I had to do something about my record keeping, so that I could continue to deliver the excellent service that children around the world have come to expect. So when I started noticing mistakes about who was on the Naughty and Nice Lists, I knew we had to do something to manage our data better.

I suppose you must create a lot of data?

I certainly do, there’s the main records like children’s names, their location, their behavior records and also their taste in toys.  History is very important as well – I don’t want to give a child the same present this year as last!

In the early days that was all we focused on, but then we realized lots of things need good data.  For example maintenance records and flying hours for the sleigh.  One year we found out to our cost what happens if you don’t replace key mechanical parts before they wear out!

Location data is also key in planning the optimal route for Christmas Eve.  It used to keep a whole team of elves occupied all year updating and maintaining our geodata, but these days we source that data externally and that team have been able to return to their preferred toy making roles.

What characteristics do you have that make you successful at Data Governance and why?

Now that would have to be my perfectionism and need for everything to be delivered seamlessly, and also to exceed my customers’ expectations every single time.  That makes me focus tirelessly on getting the data right so that everything happens just like magic on that one all important night of the year.

Are there any particular books or resources that you would recommend as useful support for those starting out in Data Governance?

To be honest there wasn’t anything available as I was working out how to manage my data better!  I think the first article I ever saw online, that mentioned Data Governance, was a scientific study of penguins in Antarctica.  That was about 15 years ago and it was not really of interest to me as I’m based in the North Pole, not the South and had pretty much got Data Governance sussed by then!*

What was the biggest challenge you faced implementing Data Governance?

That would be getting the elves to engage in the process in the first place – they really didn’t get what data had to do with making and delivering toys!  It took me several attempts before I worked out that I had to explain Data Governance in terms of what it meant for them. 

So when I stopped talking about the need to manage our data better in vague terms and started talking about making sure that the right toys were made and delivered on time, to the right children, it started making sense to them and the initiative really started to be accepted and make a difference. 

At that point it changed from mutterings about “he’s lost the plot” to elves clamoring to be Data Owners and Data Stewards – it was an amazing and heartwarming time!

What single piece of advice would you give someone just starting out in Data Governance?

Don’t expect miracles, it is going to take a long time and be hard work but believe me, it will be worth it in the end!

 

So think on for next year and then yule know what miracles Data Governance can truly help you achieve.

Yo Ho Ho!

 

*Interviewer’s note – If you think Father Christmas is making this up, I can assure you that that scientific study does indeed exist. I recall it being the first mention of Data Governance that I too ever found online!