Why data curiosity is more important than ever
Nicky Pantland, Data Analyst at PBT Group
As we navigate the modern data-driven era, the concept of ‘data curiosity’ will become an increasingly integral facet of our lives. Although the term is often narrowly defined as a mere interest in end results, such as numbers or visualisations, I believe a more holistic approach that encompasses the entire data lifecycle is vital.
Today, many data tools function as “black boxes”. You feed them data and they produce the desired output. While this sounds convenient, and artificial intelligence is simplifying data analysis, there is a downside: it has become easy for users to overlook the essential task of understanding the data they input.
The age-old wisdom of “garbage in, garbage out” applies here. The emphasis, more than ever, is on the quality of inputs. With today’s focus on brilliant outputs, we are at risk of losing sight of where our data originates, how it was formed and, consequently, the very essence of data literacy.
But why is this origin story so important?
1. Data lineage (the ‘where’): Knowing where your data is sourced from can help you identify potential systemic issues, limitations, or discrepancies. This is especially the case when data is repurposed from its primary function, which makes it important to recognise any gaps or inconsistencies that could affect subsequent processes.
2. The creation process (the ‘how’): Is the data raw, or has it undergone any transformations? Understanding the processes, definitions, and rules that moulded your data can help align it better with your objectives. If these do not match your expectations, you must decide whether to redefine or recalculate to fit your context.
3. Purpose (the ‘why’): Recognising the original intent behind data creation ensures its appropriate usage. Without grasping why the data was created in the first place, there is a significant risk of misuse, leading to skewed outputs.
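For readers who work with data hands-on, the three questions above can be turned into a simple checklist. The sketch below is purely illustrative and assumes nothing about any particular tool: `DatasetProvenance`, `provenance_gaps`, and the example dataset are hypothetical names invented here to show one way of recording the ‘where’, ‘how’, and ‘why’ and flagging whichever of them is still unanswered.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetProvenance:
    """Hypothetical record of a dataset's where, how, and why."""
    name: str
    source: str = ""                                           # the 'where'
    is_raw: bool = False                                       # True if nothing was transformed
    transformations: list[str] = field(default_factory=list)   # the 'how'
    original_purpose: str = ""                                 # the 'why'


def provenance_gaps(record: DatasetProvenance) -> list[str]:
    """Return the where/how/why questions the record leaves unanswered."""
    gaps = []
    if not record.source.strip():
        gaps.append("where: source system is not recorded")
    if not record.is_raw and not record.transformations:
        gaps.append("how: data is not marked raw, but no transformations are listed")
    if not record.original_purpose.strip():
        gaps.append("why: original purpose is not recorded")
    return gaps


if __name__ == "__main__":
    # Illustrative example only; the dataset and its details are made up.
    sales = DatasetProvenance(
        name="monthly_sales",
        source="billing system export",
        transformations=["amounts converted to one reporting currency"],
    )
    for gap in provenance_gaps(sales):
        print(gap)
```

A check like this does not judge whether the recorded answers are correct; it only makes a missing answer visible before the data is fed into a tool.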
A broader perspective on data curiosity can be transformative. By investigating the origins of your data, you gain insights that enhance your data literacy and, in turn, refine how you apply that data throughout its lifecycle. Large organisations and data centres that rely on complex data layers can benefit from such an approach.
The modern era demands that we be more than just passive consumers of data. By seeking the ‘where,’ ‘how,’ and ‘why’ of our data, we can ensure more accurate results and reduce future discrepancies. Embracing a comprehensive approach to data curiosity is not just beneficial; it has become an essential skill.