Keeping Analytics' Fundamental Promise And Obligation
Help people understand the information contained in the data that matters to them.
Do so as quickly, effectively, efficiently, economically, and securely as possible with a minimum of technical intervention, fuss, and bother.
Empowering people with information
Data contains information that people can benefit from understanding. The point of analytics is to help people understand the information content of the data that matters to them. Some data is in a form that people can see, and therefore analyze, directly. Electronically stored data requires technological support so that people can access, see, analyze, and understand it.
Fundamental data analysis, the foundation of everything data-analytical, is achievable, simple, and straightforward with modern tools when they’re used appropriately. It enables non-IT people to analyze their data themselves, and IT people to analyze data much more quickly and effectively than with legacy code-based (e.g. SQL query) approaches.
More complex analysis, with no real limit to its breadth, depth, and sophistication, is a necessarily technical undertaking requiring an appropriate combination of the analyst's analytical abilities, technology support for the analytical operations, and the technical expertise to use the technology for the intended purposes.
Rethinking Analytics
Conventional analytics has a legacy of underachievement. Even after more than sixty years of experience, far too many analytics efforts have fallen short of fulfilling analytics' fundamental promise and obligation. People are largely unable to access, analyze, and understand the information contained in the data that matters to them. There are real, substantial reasons why this is the case, but it doesn’t have to be—there are opportunities available to avoid conventional analytics’ problems, and achieve analytics success.
People, and organizations, have a choice, although they often aren’t aware of it, and sometimes dismiss it when it’s offered: continue to approach and conduct analytics as in the past, in which case the same problems and shortcomings can be expected; or rethink how analytics can be conducted and in doing so help people achieve the real benefits that come with understanding their data, and help organizations fulfill their obligations.
Rethinking analytics involves two main aspects: understanding how and why conventional analytics was shaped into its current form, and why in this form it’s been unable to provide the benefits it should have; and contemplating how analytics can be conducted so that it provides the necessary benefits while avoiding the pitfalls that have plagued conventional analytics.
There’s good news: it’s possible to conduct analytics so that people are well served, and organizations are able to honour their obligations. The even better news is that, done well, analytics can be conducted far more effectively and efficiently, i.e. much faster and less expensively, than with the conventional approach.
The common thread running through conventional analytics' failures is the reliance upon a paradigm that feels reasonable and natural, even prudent, to those who adopt it—to the degree that people are aware of the paradigm they're enmeshed in—but in practice has problems and deficiencies.
Successful analytics is achievable when it’s conducted within an appropriate paradigm, which requires rethinking what analytics is, who it’s for, and how to do it.
What Analytics Is
Analytics is in its essence the human cognitive, intellectual analysis of the information content of electronically-stored data in order to understand it. As humans are unable to directly perceive electronically-stored data, analytics is dependent upon, and supported by, computer-assisted data analysis. This definition raises some essential questions, including: what is information; what is data; what is analysis; what is data analysis; and how do computers assist in data analysis? Each of these is briefly addressed below, with links to additional content when available.
Analytics refers to both the activity of analyzing data and to the physical forms, e.g. charts, graphs, maps, dashboards, scorecards, stories, etc., and media, e.g. images, HTML pages, PDFs, etc., used to visualize, see, and comprehend the data, and to communicate information and insights to others.
Analytics is distinct from automated systems that operate flexibly in response to the available data. Automated data-responsive systems are dependent upon and employ technical aspects of data analysis, but to the degree that they operate autonomously they're outside the scope of analytics.
Analytics That Works
Is a way of doing analytics so that people are connected to their data as smoothly and intimately as possible, fulfilling analytics' fundamental promise.
It achieves this by concentrating on analytics' essential objectives and obligations:
- helping people understand the information in the data that matters to them quickly, effectively, efficiently, and economically;
- helping organizations support their people's analytics needs at all appropriate scales while ensuring that the organization's data and information are properly safeguarded.
Analytics that works is based upon several simple truths, including:
- Analytics is universal. Wherever there is data there is the opportunity for people to analyze it.
- Analysis of data is the beginning and end of analytics; the first and last activities. Analysis is required in order to accomplish any 'higher' level activities.
- Fundamental analytics (identifying and analyzing data's fundamental characteristics and relationships) is the foundation upon which analytics rests. Analytics begins with the understanding of data that fundamental analytics makes available; it ends with conveying the fundamental aspects of sophisticated analysis; and it pervades the entire spectrum of technical data processing and analysis activities.
- Technology is required, but only to the degree necessary to support people in their analysis of their data. Technologies that match human cognitive and intellectual abilities are better than those that require people to adapt themselves to the technology's design.
Analytics is for: everyone who needs to understand the data that matters to them.
Analysts
Everyone who does or could analyze the information content of data that matters to them. Ideally, everyone could fully and effectively analyze their data. In practice, there are limits to the degree to which people without highly specialized skills can conduct their own analyses of their data.
Conventional analytics imposes an artificial segregation upon the different roles or positions people occupy. The most egregious failing is that analytics is organized around delivering constructed artifacts (dashboards, scorecards, reports, etc.) to consumers who passively consume the information the artifacts contain.
Analytics that works explicitly recognizes that everyone who can benefit from knowing what’s in the data that matters to them is an analyst, and is best served when the information is readily accessible to them.
Knowledge Workers
Are Analysts in that they need data-based information in order to do their primary jobs. Individuals have different cognitive, analytical, and technical skills, and will therefore require different kinds and levels of technical support in order to access, analyze, and understand their data.
Conventional analytics frames knowledge workers as the proper audience for institutional analytics efforts, and assumes that they are incapable of ‘properly’ analyzing their data, and so must be restricted to accessing data through authorized channels and analyzing that data through authorized mechanisms.
Analytics that works explicitly recognizes the primacy of knowledge workers' information needs and provides each with the tools, technologies, and support required to satisfy these needs, with only as much friction and overhead as required, and as few barriers as necessary.
Technologists
Work with the data-analytical tools and technologies that require specific technological knowledge.
Managers
Project Managers
Administering the analytics efforts for discrete projects, PMs are instrumental. Effective PMs are also practitioners; non-practitioners often are more obstacles than facilitators.
Executives
Responsible for establishing and executing the organizational analytics strategy.
Media
Advisory media tend to be targeted at executives who can justify the expense of the media’s market research. The main drawback is that the markets being researched are dominated by executives, leading to a self-perpetuating feedback loop that entrenches the prevailing paradigm. In essence, executive advisory media tell executives what other executives have been spending money on.
Popular media can be more flexible and adaptive to covering and assessing innovations in analytics, so they are at least nominally better positioned than advisory media to recognize and cover analytics aspects that aren’t constrained to the status quo. However, they tend towards covering analytics tools in terms of how well they implement traditional analytics functionalities, rather than how they serve the foundational cognitive and intellectual needs and abilities of analysts.
Tech Vendors
Organize themselves around maximizing their revenues and minimizing their development and product support costs. This generally involves increasing the scope and complexity of their products, which, even if they started out as focused innovations, migrate towards becoming larger and more expansive platforms commanding ever-more expensive licensing costs.
Consultancies
Center their interests, particularly in the case of large consultancies, on maximizing the amount and predictability of their consulting fees. They are therefore vested in promulgating the concept of analytics as large-scale, expensive, labor-intensive initiatives with long lead times and a continuous expenditure of expensive consulting hours.
Analystics International is dedicated to helping people achieve the benefits analytics can provide. Our core belief is that the information content of data should be available to the people who need to comprehend it with as little friction as possible.
Analytics is:
In its broadest sense analytics covers everything related to analyzing data in order to understand the information it contains via machine-assisted data analysis. Its essential element is the human cognitive, intellectual activity of analysis. This raises the questions: what is data; what is analysis; what is analysis of data; and how do machines assist in data analysis? Each of these is briefly addressed below, with links to additional content when available.
Electronic computers are the dominant machines assisting data analysis in the modern world, but machine-assisted data analysis predates electronic computers by more than half a century. A signal event heralding the origins of analytics was the use of Herman Hollerith's tabulating machines for the compilation of statistics for the 1890 US census.
Modern analytics emerged with the widespread adoption of electronic computers by businesses, beginning in the 1960s, accelerating during the 1970s, and exploding with the advent of non-mainframe computers in the 1980s and 1990s. Previous incarnations include reporting, decision support systems, management information systems, and most recently, business intelligence. For more see: a brief history of analytics.
The principles of analytics are the same no matter which type of machine is used.
Analytics in its noun form refers to data analyses rendered in their various forms, e.g. charts, graphs, maps, dashboards, scorecards, stories, etc., and media, e.g. images, HTML pages, PDFs, etc.
Data is at heart quite simple: it's information recorded in some media. People have been recording information for many thousands of years. Until recently the information has been directly perceptible by, if not intelligible to, people; data in the modern world is overwhelmingly stored in media and encodings that require the use of technology to store, retrieve, perceive and analyze it.
Analysis is the active contemplation of information, usually with the objective of understanding the nature of the information. There are multiple levels of analysis, increasing in compositional complexity, beginning with an identification of the information's atomic components, followed by enumerations, aggregations (counting, summing, etc.), correlations, and so on. Analysis is a human intellectual skill that everyone possesses to some degree.
Data analysis is the analysis of the information content of data. It requires the analyst to exercise their cognitive and analytical abilities to perceive and contemplate the information. Except where the data is directly perceivable by the analyst, technological support is required, with complex data and analyses requiring advanced technological support.
Some analyses are so simple, straightforward, and natural that they may seem hardly worthy of being considered 'real' analytical operations, e.g. identifying objects by name, enumerating, sorting, and counting them. While these may seem trivial they are the basis of analytics in the same way arithmetic is the foundation of mathematics.
Similar to mathematics, analytics is massively broad and deep; there is no effective horizon limiting the scope of analyses that can be brought to bear for even small or moderately complex data sets. Basic statistics—counts, totals, averages, minimums, maximums, etc.—are simple, familiar to everyone, and easily obtained. More advanced statistical analysis quickly becomes a technical undertaking that requires specialized knowledge and skills to understand and undertake.
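As a minimal sketch of these basics, using a small invented dataset and nothing beyond Python's standard library, the progression from identification through enumeration to basic statistics looks like this:

```python
from statistics import mean

# Invented sample data for illustration: one record per observed item.
items = ["apple", "pear", "apple", "plum", "apple", "pear"]
weights = [152.0, 170.5, 148.2, 66.3, 160.1, 181.4]  # grams

# Identification: what distinct things are present?
distinct = sorted(set(items))
print("Distinct items:", distinct)

# Enumeration and counting: how many of each?
counts = {name: items.count(name) for name in distinct}
print("Counts:", counts)

# Basic statistics over a measure: count, total, average, min, max.
print("n =", len(weights))
print("total =", round(sum(weights), 1))
print("average =", round(mean(weights), 1))
print("min/max =", min(weights), "/", max(weights))
```

The point is not the code itself but that these operations (naming, counting, summing, ranging) are the atomic moves from which every deeper analysis is built.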
Machines assist human data analysis by supporting the analyst in the perception and analysis of data's information content. The machines make it possible to access, manipulate, and present the information in a form the analyst can perceive, e.g. showing digital data as characters composing words and numbers, often organized into structures such as lists, tables, etc. The abacus is an example of a data-analytical machine in that it represents numbers and provides the ability to manipulate them. There's practically no limit to the depth and degree of analysis that machines make possible; even in those cases where analyses are easy and reflexive for people but difficult for machines to achieve, e.g. speech, shape, or face recognition, once machines are able to conduct the analysis they're able to process volumes of analysis vastly beyond human abilities.
Rethinking Analytics
Considering its pedigree it's tempting to think that analytics has evolved, matured, and become proficient at delivering on its fundamental obligation: helping people understand the information contained in the data that matters to them. The general position of analytics luminaries and vendors is that this has been and is the case, with some usual caveats about how the availability of new data sources, and analytical techniques and technologies require the adoption of new approaches and tools.
The reality is that far too often analytics has been and continues to be done poorly, especially within enterprise contexts where Business Intelligence has been the dominant, nearly universal paradigm since the 1990s. Conventional BI has consistently failed to deliver success; searching the web for "business intelligence failure" will produce many resources covering BI's problems, often with suggestions for addressing them. Yet, even though BI's shortcomings have been known for quite a long time, it has continued to disappoint, and the often-proposed solutions don't solve the problems.
Are conventional BI's problems so deep that they cannot be overcome?
No, they are not.
It's possible to conduct analytics in a way that works, and works well. But first, it's helpful to identify the problems with conventional BI that prevent it from succeeding. A review of the results of a web search for "business intelligence failure" resources is informative. Please take your time, we'll wait.
The resources identify numerous problems with BI, with quite a lot of overlap among them. However, and this is the critical point: there is a very strong alignment among these references to a particular paradigm which is accepted as the unquestioned, unexamined, unrecognized framework within which all things exist (that's what paradigms are). The fundamental problem with conventional BI is not necessarily that particular elements of its set up and conduct are at fault, but that the paradigm itself is unsuited to the problem domain it's applied to.
In principle BI is capable of encompassing the full spectrum of data-analytical needs, within its realm of operation. However, as BI became embraced and adopted, particularly within large organizations, it was subjected to conventions—ideas, opinions, practices, etc.—that became entrenched and prescribed what BI was and how it should be conducted. These conventions are the root causes of much of how BI has strayed from its original promise.
Conventional BI's paradigm includes a constellation of misalignments with analytics' real goals, obligations, opportunities, and constraints. Each misalignment has its own sources including conceptual and operational biases and limitations. Individually they negatively impact BI's operation; in combination they interoperate in feedback loops, reinforcing one another and resulting in systemic, often substantial, even paralytic dysfunction.
A brief overview of conventional BI's paradigm:
- Scope: what data is a subject for analysis, from what sources, and what analyses are appropriate.
- Scale: how analytics is organized, managed, and conducted.
- Cost of entry: the get-started cost (the buy-in, or ante) required to participate.
One of these problems is the conception and practice of data analysis as a technical activity first, and as a human sense-making activity second (or worse). Another is the industrial production model of work, focused on building and employing factories that ingest inputs and process them through completely designed, managed, and operated processes to produce large numbers of similar or identical outputs. Yet another is the hierarchical business organization model that works hand-in-hand with the others to shape the presumptions of who needs to know what and what it takes to inform them.
Scale
In the conventional paradigm, analytics is valuable and worth pursuing only at the largest scale: addressing and answering the biggest questions for the most senior people.
There's very much a chicken-and-egg relationship between analytics' perceived purpose and value, and the forces that led to its framing in this manner.
On one hand there's the very real situation that business systems data has historically been undecipherable to non-technical people, largely due to the combination of relational data modelling and the technically-designed tooling required to access and analyze it, for example the near-absolute dominance of SQL-query based approaches.
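To illustrate the barrier, consider the kind of query a normalized relational model demands for even a simple business question. The schema and names below (orders, customers, regions) are entirely hypothetical, invented for this sketch:

```python
# Hypothetical, for illustration only: answering "what were total sales
# by region last quarter?" against a typical normalized schema requires
# knowing the tables, the join keys, and SQL itself.
QUARTERLY_SALES_BY_REGION = """
    SELECT r.region_name,
           SUM(o.amount) AS total_sales
    FROM   orders o
           JOIN customers c ON c.customer_id = o.customer_id
           JOIN regions   r ON r.region_id   = c.region_id
    WHERE  o.order_date >= '2024-01-01'
      AND  o.order_date <  '2024-04-01'
    GROUP  BY r.region_name
    ORDER  BY total_sales DESC;
"""
```

All of that is prerequisite just to ask the question; none of it is analysis in the human sense.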
This led to the acceptance of the necessity for technical support, along with the expenditure of substantial time, energy, money, and other resources required to obtain information from business data.
Once the need for substantial investments was accepted as the price of entry, it became simple to accept that the investments were best justified by providing answers to the biggest questions, those at the top of the executives' strategic concerns.
This fed back into the mix, increasing the demand for more and more data to be included in order to better answer the strategic questions, leading to an escalating spiral of demand for more data along with the ever-larger, more complex and expensive platforms and the staffs required to manage them.
One unfortunate consequence of enterprise analytics' conception and evolution is that the conduct of analytics is too often left up to technocrats, reinforcing the technology-centrism framing, building walls of technology and technocratic bureaucracy that impede rather than facilitate people's access to the information contained in the data that matters to them.
A common manifestation of these flaws is in the process of analytics expressed as some form of "first thing: prepare the data for analysis."
Implicit in this is the idea that analysis is an end-point activity done to make processed data intelligible to relatively passive information consumers.
This is completely the wrong way of thinking. There is no way to do anything meaningful and useful with data unless and until the data is understood, and the only way to understand it is through analyzing it. The follow-on effects of this mischaracterization are structural, pervasive, and once in place extremely difficult, but not impossible, to recover from.
Enterprise analytics' dominant paradigms are briefly described below; they are examined in detail, along with opportunities for remediating their flaws in auxiliary content. For historical reference, see A Brief History of Analytics.
Conventional Business Intelligence: too often disappointing.
Business intelligence is the dominant modern form of enterprise analytics. Once it gained traction in the 1990s, BI became the dominant analytics paradigm, owned, managed, and controlled as the exclusive domain of technocrats, with non-technical people dependent upon technocrat-provided services. Rather than actively supporting people, conventional BI became the place where data went to die. Conventional BI's failings are well documented; a web search of "why BI projects fail" will provide a multitude of resources. The main flaws in conventional BI stem from its conceptual and operational biases, primarily: identifying what data is suitable for analysis; how data is analyzed; and how data analysis is organized and managed. For more, see: conventional BI is broken.
Responses to data-analytical needs have emerged since BI became the standard approach. Multiple factors have influenced the new approaches, including: BI's shortcomings within its nominal realm of operation; the evolution of new data sources that don't fit neatly into BI's model; and the evolution and adoption of advanced technical analyses. These new analytics approaches can be grouped, roughly, into the following segments.
Data Science: the future of analytics? Maybe, but let's hope not.
In one framing, data science in its various incarnations—advanced statistical analysis, artificial intelligence, machine learning, etc.—has great promise in finally making it possible to analyze all the data available in order to answer critical questions and make important decisions. It seems that the effective application of data science's tools, technologies, and techniques can achieve the results claimed. However, there are a couple of things to bear in mind: these same claims were made by the advocates for BI, asserting that by properly building comprehensive data warehouses and fronting them with analytical platforms all of an organization's data-based questions could be answered; and, like BI, data science seems to be concerned with addressing and answering the big, deep, complex, strategic questions that are the province of serious, senior people who need to make the big decisions. Data science, like BI before it, seems not to recognize that fundamental data analysis is both the first and last element in any program of analytics that works.
What about Big Data, the Internet of Things (IoT), etc?
In a very real sense Big Data has been a marketing term used to imply that previous approaches to analytics, notably BI, were not up to the task of dealing with the increasing volume, velocity, and complexity of emerging data sources, and that new tools and technologies were needed to accommodate the new data world. There is a kernel of truth in this, but it misses the fundamental truth that conventional BI was incapable of dealing with the data in the pre-Big Data world due to its implementation paradigm (conventions) far more than due to the nature of the data or the capabilities and capacities of BI tools and technologies.
The IoT has accelerated the generation of data by internet-enabled devices; the amount of data being captured is phenomenal in its quantity and breadth. The reality is that for all practical purposes the IoT is of little matter to the vast majority of people who need to understand their data, and the fundamental techniques of analytics are as appropriate for IoT data as for all other data sources.
Fundamental Analytics is the cornerstone of Analytics that Works. When conducted effectively, fundamental analytics provides the opportunity to minimize the distances between people and data. This makes it possible for everyone to see and understand the data-based information that's relevant to them quickly, effectively, efficiently, and economically.
Major factors determining effectiveness:
- the person's cognitive and intellectual analytical abilities;
- the tool's support for those abilities.
Modern tools change fundamental data analysis from a technically oriented to a human-oriented activity. In doing so, fundamental data analysis moves front and centre in the spectrum of analytics' activities, and provides the opportunity to minimize the distance between people and their data's information content.
Fundamental analytics encompasses:
- the basic characteristics of a dataset;
- counting, enumerating, and cross-referencing;
- basic statistics: counts, averages, minimums/maximums, etc.
It can be conducted with a wide variety of tools and technologies.
Conventional BI employs a techno-centric paradigm: (SQL) query -> resultset -> manipulation -> output. Modern human-oriented active data analysis tools instead use a visual metaphor matching human cognitive and intellectual abilities, changing the analysis process to direct-action visualization.
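To make the contrast concrete, here is a minimal sketch of the conventional pipeline using Python's standard sqlite3 module; the table and values are invented for illustration:

```python
import sqlite3

# Set up a tiny invented dataset so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("East", 120.0), ("West", 80.0), ("East", 45.5), ("North", 60.0)],
)

# 1. Query: the analyst must express the question in SQL up front.
cursor = conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")

# 2. Resultset: opaque tuples, not yet intelligible information.
rows = cursor.fetchall()

# 3. Manipulation: reshape the resultset for human consumption.
rows.sort(key=lambda row: row[1], reverse=True)

# 4. Output: only now does the analyst have something to look at.
for region, total in rows:
    print(f"{region:>6}: {total:8.2f}")

conn.close()
```

Every new question restarts the cycle at step 1. A direct-action visualization tool collapses the cycle into immediate, reversible gestures over a visual representation of the data itself.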
Cognitive load is largely a property of the tools; intellectual ability is the analyst's own.
Analytics is for People
Analytics has value only inasmuch as it helps people understand the information contained in the data that matters to them. It absolutely must accomplish this, otherwise there's no point to it. People also need to be able to share the information and insights they obtain with others.
People need to be able to accomplish these things simply, effectively, efficiently, economically, and with a minimum of technical intervention. Each of these points is essential, and they are mutually interdependent.
Intimate analytics is the term used for the situation in which these ambitions are achieved and an intimate analytical relationship exists between people and the data that matters to them.
Analytics and Organizations
Organizations are made up of people, the great majority of whom can benefit from understanding the data that's relevant to their areas of responsibility and interest.
Organizations also have interests that are a superset of individual people's interests. The most obvious of these are the need to ensure that access to data and information is managed appropriately, and the need to analyze multiple data sets collectively to obtain larger-scale information and insights.
Enterprise analytics is the term used to describe this aspect of analytics.
The role of technology
Technology is necessary to store, access, analyze, and understand data. From an analytical perspective data storage technology is fundamentally uniform in its technical characteristics (storing bits/bytes/words in persistent media), although there is a very wide variety in the organizational schemes employed, e.g. relational, hierarchical, and graph/network structures. These schemes present distinct opportunities for enabling analysis of the data. They also impose constraints, notably in the degree of technical specialist skills required.
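As a toy sketch of those schemes (all records invented for illustration), the same facts can be organized in each of the three forms, and each form invites different analyses and demands different skills:

```python
# The same two facts (Ann placed order 1, Bob placed order 2)
# organized under three common schemes. All data here is invented.

# Relational: flat tables of rows linked by keys; analyzed via joins.
customers = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
orders = [{"id": 1, "customer_id": 1, "amount": 50.0},
          {"id": 2, "customer_id": 2, "amount": 75.0}]

# Hierarchical: records nested inside their parents; analyzed by traversal.
tree = {"customers": [
    {"name": "Ann", "orders": [{"id": 1, "amount": 50.0}]},
    {"name": "Bob", "orders": [{"id": 2, "amount": 75.0}]},
]}

# Graph/network: nodes connected by edges; analyzed by following relationships.
edges = [("Ann", "placed", "order:1"), ("Bob", "placed", "order:2")]

# "What did Ann order?" answered under each scheme:
ann_id = next(c["id"] for c in customers if c["name"] == "Ann")
ann_relational = [o["id"] for o in orders if o["customer_id"] == ann_id]
ann_hierarchical = [o["id"] for o in tree["customers"][0]["orders"]]
ann_graph = [dst for src, rel, dst in edges if src == "Ann" and rel == "placed"]
print(ann_relational, ann_hierarchical, ann_graph)
```

The relational answer requires knowing the join key, the hierarchical answer requires knowing the nesting, and the graph answer requires knowing the relationship name; the scheme shapes both what analysis is easy and what expertise it demands.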
In data-analytical terms it's helpful to consider the scope and scale of the various technologies available to assist people in analyzing data. We identify two primary technology classes.
Personal tools support direct, immediate connection to and analysis of data where it lives. The best personal tools are designed to support human cognitive and intellectual abilities with a minimum of fuss and bother. The perfect personal data analysis tool would be effectively invisible to the person using it. There is no perfect tool, but there are some very, very good ones. Lesser tools provide technological mechanisms for accessing, manipulating, and visualizing data; one common characteristic of many of these tools is their basis in SQL querying, limiting their effectiveness:
- operationally, to those people with adequate SQL skills, and
- functionally, putting querying in the first-action position.
Platforms are capable of supporting everything involved in enterprise-scale data acquisition, management, and analysis. Platforms emphasize the large-scale, industrial, mechanistic processes required to harvest, homogenize, store, manage, and control access to data and the analyses of it. The degree of support platforms provide for intimate analytics varies widely. Platforms are overkill for many or most of the real-world needs people face.
Tableau as a Keystone Technology
Tableau was created as a personal analytics tool to help people see and understand data. Its inventors' genius innovation was the creation of a human-oriented UI that makes it simple and easy to construct effective visualizations of whichever permutation of fields the analyst selects. Tableau made it possible to connect to and analyze data (almost) at the speed of thought.
Tableau was transformational; for the first time non-technical people could access and analyze their data using a tool that matched their cognitive and intellectual models, making it simple, easy, and straightforward for them to understand the information content of the data. When it was introduced Tableau was the best human-oriented personal analytical tool on the market. It remains unsurpassed in this space.
As it has matured Tableau has concentrated on implementing platform features to make it more palatable and attractive to enterprise customers. It has become a legitimate analytics platform, supporting the full range of traditional enterprise analytics needs.
The combination of support for highly effective personal analytics and robust enterprise analytics makes Tableau a superior analytics choice for individuals and organizations of all sizes.
So Much Potential
Data is everywhere, and it's increasing in reach, volume, breadth, depth, and complexity.
Tabular data originating in operational systems–internal, external, or hybrid–has been the primary objective of conventional Business Intelligence-style analytics. But it has been joined by other types and sources of data as proper subjects for analysis. NoSQL data is becoming increasingly visible and significant. And, as always, local, small-scale data forms vast, rich stores of information that permeate every organization. On top of these, in a return to the past, hierarchical data sources are again becoming legitimate targets for analysis, along with their network/graph cousins.
Tools, technologies, and techniques for analyzing data, in all its forms, are available to help people understand the information captured in their data. Leveraging these well leads to success.