It’s been several weeks now since we ran our first Show & Tell for the Business Glossaries GDAC subgroup and it took a bit of work to get there.
As one of four working ‘subgroups’ we’d been tasked with defining “Business Glossary” in order to help establish a common understanding of what we, as a community of data management professionals, are actually all talking about.
All we had to do was to “Define ‘Business Glossary’”.
Easy-peasy!
Surely this has already been done? Surely every data management professional pretty much agrees on what one of these is? Surely this would take mere seconds for a team of five data architects?
As it turns out, it wasn’t so easy. Who, or where, do we turn to? One or two well-known books about data management? The inter-web of confusion? There are no stone tablets that we could find that would tell us what we needed to know.
You’re just making (fluffy) stuff up!
So our approach was to research (yes, probably that popular search engine came into it), grab a handful of ideas from a mix of reputable and random sources, then hold them up next to each other and try to spot what seemed to overlap and what seemed interesting and useful.
There were “Business Glossaries” with 4 sides, some with 3, some with 9, blue ones, pink ones, and transparent ones. OK, not literally of course, but this is an abstract, conceptual, fuzzy thing.
Then came the job of boiling these sources of information down into a single block of definition text. We took the well-trodden path of describing terms (words) with other (and more) words. The odd graphic and a few more examples might be helpful at some future stage too.
We reviewed and tweaked these within our group until we settled on something that ‘felt’ coherent and covered the ‘features’ that we felt were important enough to include.
So, we’ve had to create, or perhaps build, something ourselves using clues, fragments and mere scraps of information. Reinventing the wheel would have been easy - it’s got 4 sides, right?
Hang on though, what about THAT?
Although we’d been tasked only with defining “Business Glossary” we’d ended up pulling together several definitions. Members of the group were aware of other “things” we’d come across that felt somewhat similar: “taxonomies”, “vocabularies” and “dictionaries”. We had no (definitive?) definitions of these things. But hang on, what exactly IS an “ontology”? It seems popular these days!
Oh, and don’t forget “lexicon”!
So, in order to define what a “business glossary” IS we felt it prudent to try to define what it is not (within sane limits; we were pretty sure it wasn’t what most people call a “carrot”).
We did ultimately dismiss the need to define a couple of generic terms too (“vocabulary” and “lexicon”) lest we start to reinvent the Oxford English Dictionary.
You can read some of the definitions we've come up with in this document we've put together.
From which dust clouds did these things form?
The full list of information sources that we considered is set out here.
What? | From Where? |
Business Glossary | DAMA DMBoK version 2; DATAVERSITY (Michelle Knight article) |
Taxonomy and variants (Flat, Hierarchical, Network, Facet) | DAMA DMBoK version 2; Marketingland (Shari Thurow article) |
Data Dictionary | DAMA DMBoK version 2 |
Ontology | DAMA DMBoK version 2 |
Data Catalogue | DATAVERSITY (Michelle Knight article) |
Please feel free to inspect and consider these. We’re very keen to have our initial definitions prodded, poked and stretched to see where they break.
But what else do we need to know?
We thought it important to, at least, declare our sources of information. But thinking ahead (at least as a first stab) about how we might work with the other subgroups, we’ve proposed some additional attributes of our terms. Are these metadata? Meta-metadata? Meta-definitions? Time will tell, hopefully. A complete term definition, then, is assembled from the following attributes:
- Name - The name of the term; what we’re calling this concept.
- Definition - More/other words that expand the meaning of the term.
- Acronyms / Abbreviations - Acronyms and/or abbreviated versions of the term name.
- Synonyms - Other names by which the term is known.
- Initial Term Identifier (Party, Role or Other Source) - The source, within or external to GDAC, of our definition.
- Last Updated Date - The date on which we last updated our term.
- Term Steward (Party, Role or Position) - A custodian/steward (TBC) for the term.
- Taxonomy Association(s) - Please see our definition of Taxonomy!
- Common Misunderstandings - Any frequent mistaken identity and confusion around the terms.
- Lineage (Or perhaps “Provenance” – see our Show and Tell Q&A) - From where the term has been defined or how it’s evolved.
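To make that list a little more concrete, here is a rough sketch of how one term might be held as a simple record. This is purely illustrative and not something the subgroup has built: the class name `GlossaryTerm`, the field names, the choice of which fields are optional and the `missing_attributes` helper are all our own assumptions, and the definition text is a placeholder rather than our actual wording.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class GlossaryTerm:
    """Hypothetical record for one term, mirroring the attribute list above."""

    name: str                                   # what we're calling the concept
    definition: str                             # words that expand its meaning
    abbreviations: List[str] = field(default_factory=list)
    synonyms: List[str] = field(default_factory=list)
    initial_term_identifier: Optional[str] = None   # party, role or other source
    last_updated: Optional[date] = None
    term_steward: Optional[str] = None              # party, role or position
    taxonomy_associations: List[str] = field(default_factory=list)
    common_misunderstandings: List[str] = field(default_factory=list)
    lineage: Optional[str] = None                   # or perhaps "provenance"

    def missing_attributes(self) -> List[str]:
        """Return the names of attributes that have not been filled in yet."""
        return [attr for attr, value in vars(self).items() if not value]


# Example entry; the definition is placeholder text (an assumption).
term = GlossaryTerm(
    name="Business Glossary",
    definition="(placeholder: see the definitions document)",
    initial_term_identifier="DAMA DMBoK version 2",
    last_updated=date.today(),
)

# Shows which attributes still need a steward's attention.
print(term.missing_attributes())
```

Only the name and definition are mandatory in this sketch, so a term can be captured early and the gaps listed for follow-up. Whether that is the right split is exactly the sort of question we'd like the other subgroups to weigh in on.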
To the Show and Tell!
Our first Show and Tell went ahead on 29th May and we were hopeful of getting a decent attendance.
Apart from creating awareness by showing off the work we’d done, and how we’d done it, we also really wanted to spark further engagement with the community on that work:
- What do you think of these definitions?
- Are they suitable, yet, for GDAC (and perhaps beyond…Government Departments)?
- The support we’re going to need from other GDAC subgroups.
…the work we’re intending to move onto next:
- Exploring SKOS-based tooling and building on the work of the Scottish Government.
- Creating a working example (as a home for our definitions as well as something concrete to prod and poke).
- Creating a business glossary for GDAC.
…but also, just getting more joined-up with how we’re all thinking about business glossaries:
- Putting our contact details ‘out there’ and inviting others to join up with us.
No subgroup is an island
We will certainly need the help of the other GDAC subgroups to create something robust in terms of how the contents of a GDAC business glossary evolve.
How do these concepts evolve or get refined?
How should these concepts be “governed” (whatever that means) for a community of data management professionals? For an entire government? Should they be governed at all? Command and Control, crowd-sourcing, or something in between?
What exactly should a “good” glossary term look like? Do we need a “standard” for them?
Is the content of a business glossary “metadata”? Only if it helps to describe some physical data asset? Hmmm… remind us, what is “metadata”?
You can find more about the subgroups in our older blog post "What has been going on in Government Data Architecture Community Group?"
Thank you
We were very happy with how our first Show and Tell went. Thank you to all of you who made it to the event!
We managed to say everything we wanted to and we had a great level of engagement through the Sli.do platform – thanks to all who submitted questions and observations.
We've answered those questions in our Q&A document which you can read now.
And if you have any more, we’d be very pleased to hear from you.
We’ve made some new contacts within GDAC, and beyond, and are continuing to work across organisational boundaries.
We’ve even recruited a new subgroup member!
Our next mission is to get that working example up and running and we look forward to reporting back to you soon on that.
Conclusion
We’ve taken just the first small step. We’ve moved from a blank canvas to a first, rough, sketch and we’re looking forward to painting in more of the details with the community.
We see definitions of concepts as fundamental to effective communication as well as to the effective production, use and management of data. Perhaps more accessible and maintainable than a conceptual data model, we think that the Business Glossary could perhaps be the tool that enables this, and we hope to help you find out soon with a working example.
So we hope we’ve piqued interest, lit up a few lightbulbs, and started a discussion around getting more joined up - not just in how we define the concepts that data management professionals, with a common purpose, are interested in - but also how we can perhaps bring this approach to everything that we do within our respective organisations to improve our effectiveness.
If you’d like to know more about what we’re doing or to get involved, then please reach out to us via the GDAC email inbox (Data.Architecture@ons.gov.uk).
NB: If you didn’t manage to catch the Show and Tell event, then we’ll be making the recording available soon – so stay tuned to this channel!
4 comments
Comment by Rob posted on
I dispute the statement in the linked definitions document "Objects are placed into one and only one classification category (i.e. related to a term)". This would depend on what the use of that taxonomy is. For example, this is true for a scientific classification of living things, or the periodic table, etc. However, if the taxonomy is a user-oriented mechanism to help people find content or data, this rule can have a negative impact. For example, a blackbird is definitely uniquely at one end of a hierarchy that includes animals, birds, and songbirds. However, it is also a "black-coloured animal" (alongside panthers, etc.) and an "animal featured in a Beatles song" (alongside walruses, etc.). Of course, while it is a valid aspiration from a data-management perspective to have clearly separated terms in a taxonomy, it doesn't mean that is the way the human mind works. While my example is a bit silly, the point is that different users have different contexts for what they are trying to find, and only allowing an object to be tagged against one node of a taxonomy can result in navigational dead-ends from a user's perspective.
Comment by mattdavies posted on
Thanks for getting in touch, that’s a really good point you’ve raised and something that we discussed during the development of these terms.
Our logic was that content can be attached to multiple taxonomies and so things like different hierarchies (or sets of terms) are potentially treated as different taxonomies in our definitions.
Please get in touch with us directly (mark.matten@nao.org.uk) if you’d like to work with the subgroup to improve our definitions.
Thanks for the question AND for getting blackbird stuck in my head this afternoon!
Comment by Rob posted on
Thanks Mark. When I read on to the definitions for polyhierarchical, network, facet taxonomies I think you have it covered. I just thought that in that context, the top level definition of "taxonomy" should be a little looser and more generic. I may well get involved soon but I won't explain why here!
Comment by Olgiv posted on
I am grateful to you mark. Great blog, would love to share this.