In 1989 Tim Berners-Lee proposed what would become the World Wide Web, built on the Hypertext Transfer Protocol (HTTP). For the first time, this technology made it possible to publish and link websites over the internet. In effect, he democratized the internet, bringing the everyday Joe, Nneka and Song into the fold. Before this, the world wide web didn’t exist, and the internet was accessible only to a handful of commercial, government and research concerns.
Tech has evolved remarkably since those early days. The convergence of the web, the cloud, database technology and the internet of things (IoT) has created an environment where data is ubiquitous, thrusting us into an era of Big Data. This environment has given rise to Business Intelligence (BI) and data science: disciplines concerned with discovering insights by analyzing data in order to inform better decisions.
“Better” is the operative word here. It gives a crucial hint at where Big Data can best serve us. Whether the gains are marginal or monumental, data can help improve what’s already being done. But we must ask the right questions. Hence the BI sector is best placed to help us solve what engineers and mathematicians would call optimization problems. These typically involve resource allocation, the evaluation of competing options and predictive analytics. The goal is to optimize at each point of the value chain, one problem at a time. This is important to appreciate: the genesis of every solution lies in an accurate definition of the problem, and that seemingly simple task requires clarity (of both vision and thought) to locate and define the problems data can help us resolve.
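To make the idea concrete, here is a minimal sketch of one such optimization problem, written in Python with SciPy’s linear-programming solver and entirely made-up numbers: allocate limited labour and materials across two products so that profit is maximized.

```python
# A hypothetical resource-allocation problem with invented figures:
# choose how many units of products A and B to make so that profit is
# maximized without exceeding the labour and material budgets.
from scipy.optimize import linprog

profit = [-40, -30]            # profit per unit of A and B (negated: linprog minimizes)

usage = [[2, 1],               # labour hours consumed per unit of A and B
         [1, 1]]               # material units consumed per unit of A and B
limits = [100, 80]             # available labour hours and material units

result = linprog(c=profit, A_ub=usage, b_ub=limits,
                 bounds=[(0, None), (0, None)], method="highs")

print(result.x)                # optimal quantities of A and B
print(-result.fun)             # maximum achievable profit
```

The same pattern, with different variables and constraints, underpins questions such as which routes to serve, how to schedule staff, or where to spend a marketing budget.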
The Challenge
Reducing the adoption of data-based decision-making to one optimization problem at a time sounds reasonably straightforward, but several factors inhibit it in organizations today. Chief among them is the lack of real strategic impetus: embracing a new way of thinking often means dispensing with the old, so this hurdle is one of culture, as a new idea contends with the status quo.
Another major challenge facing the organization seeking to embrace data for better decision-making is the sheer state of existing data infrastructure. True, data is ubiquitous. But it is also true that this data is extremely messy. It’s not uncommon for organizations to have multiple information systems, each suited to a specific purpose. Since these systems need to talk to each other, it’s also not uncommon to have a set of integrations that facilitate critical information flows between them. Here’s the cherry on top: because most organizations’ IT infrastructure has been built incrementally over extended periods, it spans different technologies and skillsets. The resulting fragmentation means that no one has complete command of the data infrastructure, making it even harder to implement organization-wide change. In short, organizations are unwilling to alter the delicate balance in their infrastructure because the cost of disruption seems prohibitive.
The Opportunity
The changes needed to bring data into the decision room are understandably agitating – or downright disruptive – to many leaders. There’s a reason for this. It’s easier to appreciate the relevance of data from a safe distance without truly yielding to the idea that it can help us make better decisions.
That said, the opportunity is real and tangible. When we consider specific use cases, and the optimization problems within each one, the opportunities become a lot more visible. Here’s one example.
We are still in a global (COVID-19) pandemic, yet it is little known that some of the earliest warnings came from data science and predictive analytics. According to Wired, the Bluedot algorithm provided some of the earliest warning signs. It “correctly predicted that the virus would jump from Wuhan to Bangkok, Seoul, Taipei, and Tokyo in the days following its initial appearance.”
“Bluedot uses an AI-driven algorithm that scours foreign-language news reports, animal and plant disease networks, and official proclamations to give its clients warning to avoid danger zones like Wuhan.”
Here’s a business model built on data science and artificial intelligence, working at its best.
Frankly, this opportunity is not the same for every organization. It would be presumptuous to suggest all organizations should start scrambling to mature their data capabilities to gain advantages.
Also, not every problem is one of optimization. Hence the question “What is the data saying?” may not apply to problems rooted in ethical, social, humanitarian and political issues, to name a few. Those are a different beast. It’s important to appreciate that data is not a magic wand.
The Trends
As organizations seek to master their data, the strategic implications are noteworthy. The sheer volume of data today means there’s hardly a problem we cannot put it to. With insights becoming more pervasive, the search for new sources of differentiation and competitive advantage will fuel a quest for even deeper insights and more data. In theory, this should level the playing field between economic participants, reducing (or even eliminating) problems of informational asymmetry. This should also lead to more efficient markets. In practice, we are a long way from that point, as the existing data is not optimized for decisions (it’s a digital mess) and the capabilities for processing it are far from mature.
To make sense of data, its consistency (in both format and meaning) is important. We are already seeing the emergence of conventions that promote this idea for the greater good. An example is the IFRS Foundation, the body responsible for developing a single set of high-quality global accounting standards. The Foundation’s IFRS Taxonomy project (a way of tagging electronic IFRS-compliant reports with the disclosure elements defined in IFRS Standards) is a digital leap in financial reporting. Its stated goal is:
“to facilitate electronic reporting of financial statements prepared in accordance with IFRS Standards”
Given the wealth of data points from consistent electronic financial reporting, we can expect opportunities for insights that go beyond the reporting entity’s jurisdiction and industry. This is the promise of data consistency.
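To see what that promise looks like in practice, here is a hypothetical Python sketch. The XML below is a toy stand-in for an XBRL-style tagged report (the IFRS Taxonomy is delivered in XBRL), with invented figures; once two filings tag the same concept the same way, the same piece of code can read and compare them.

```python
# Two toy "filings" that tag revenue with the same concept name.
# The structure is a deliberately simplified stand-in for XBRL, not an
# actual IFRS Taxonomy document; the figures are invented.
import xml.etree.ElementTree as ET

filing_a = '<report><fact concept="ifrs-full:Revenue" unit="USD">1200000</fact></report>'
filing_b = '<report><fact concept="ifrs-full:Revenue" unit="USD">950000</fact></report>'

def read_revenue(xml_text: str) -> float:
    """Pull out the value of the Revenue concept from a tagged report."""
    root = ET.fromstring(xml_text)
    fact = root.find(".//fact[@concept='ifrs-full:Revenue']")
    return float(fact.text)

# Because both reports use the same tag, comparison across filers is trivial.
print(read_revenue(filing_a) - read_revenue(filing_b))
```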
Closely linked to the theme of consistency is data integrity, which speaks to data’s validity and accuracy – the attributes that qualify it for use in decision-making. Suffice it to say, not all data is created equal. A large amount of the work that happens with data is about giving it integrity and presenting it consistently to enable comparison. This due diligence is required to ensure that data is fit for purpose. It happens before we engage in analysis – the art and science of making data make sense, of helping it transition from messy to meaningful – and it forms the bulk of a typical data scientist’s work.
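As a small, hypothetical illustration of that due diligence, the Python (pandas) sketch below takes invented, messy records and deduplicates them, coerces inconsistent formats into one representation, and discards values that fail validation, all before any analysis is attempted.

```python
# Invented, messy source data: a duplicated row, inconsistent number
# formatting, and a value that should not survive validation.
import pandas as pd

raw = pd.DataFrame({
    "customer_id": ["001", "002", "002", "003"],
    "revenue":     ["1,200", "950", "950", "n/a"],
})

clean = (
    raw.drop_duplicates()                              # remove repeated records
       .assign(revenue=lambda d: pd.to_numeric(        # unify number formats
           d["revenue"].str.replace(",", "", regex=False),
           errors="coerce"))
       .dropna(subset=["revenue"])                     # drop values that fail validation
)

print(clean)    # two valid, comparable records remain
```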
Last, but certainly not least, among the focus areas is security. Digital information is inherently more susceptible to breach. Indeed, data security is a fully-fledged industry, complete with governance conventions, technical protocols, and competence certifications.
It would seem as if the trends in data are at odds with each other. For there is a trade-off between the benefits of consistency and the risk of breach. Having data consistently formatted scales its usefulness substantially. But this also exposes data to the risk of compromise as more participants adhere to a single convention for formatting, securing, and presenting it. Could this be yet another optimization problem?
In Conclusion
When Tim Berners-Lee presented the concept for what would later become the world wide web to his manager, Mike Sendall, the annotated feedback said it was “vague but exciting.” Now here we are in an era where we use data to solve real-world problems in real time. The cost of acquiring large volumes of data and turning them into actionable insights is at its lowest yet.
The possibilities are endless, but it’s hard to know for sure where they will lead us. In many ways, the outlook is still “vague but exciting.”