Finding Value in All Data Types
March 29, 2021, 11:00:00 PM
One of the key factors in determining whether companies will thrive or fail in the next five years is how well they use the data they have available. This is a problem for the many companies that may not even know what data they have, let alone how to use it or what insights it may contain.
Business processes often involve creating or capturing data in a way that is siloed and difficult to access, analyze or act on outside of the process for which it was created. Even today, many business processes rely on physical record-keeping – note-taking, filling out paper forms, or ticking checkboxes on hard-copy documents that are then filed away and forgotten.
Even if all of a business’s procedural documents and record-keeping are digital, the information is of little value unless careful thought is given to the data structure, format, and storage media that will be used. Without that planning, the potential for the data to be used across the enterprise to unlock value and drive efficiency is severely limited.
This data, which we either don't know we have or for whatever reason are unable to put to use, is known as "dark data." Just like dark matter, which physicists believe may hold secrets that will help us understand the universe (if only we knew where to look for them), dark data is a potential treasure trove of knowledge and understanding.
One business that understands this and has taken up the challenge of helping other businesses to tackle it is Iron Mountain. This 70-year-old information management company originally offered secure storage services, in vaults buried within disused mine shafts (hence the name). The concern back in those days was that some documents had to be safe even in the face of a cataclysmic event such as a nuclear war. Today most organizations are more focused on making data accessible and usable than they are on locking it away – however, Iron Mountain still takes delivery of millions of cubic feet of documents for safe storage every year.
To bring these aspects of its work together, Iron Mountain has developed a service called InSight that uses artificial intelligence to help its customers find and leverage their most valuable data, however deep and dark it may be. I spoke to Raymond Aschenbach, head of global digital sales and operations, about this transition and the potential it creates for the company's customers today.
“There’s a lot of value that lies in dark data that companies are trying to create opportunities from,” Aschenbach tells me.
He likes to use the oil and gas industry as an example of where companies may have very valuable data assets that are hard to gain insights from, due to changing standards of recording and storage over the years.
When exploring and drilling new wells, companies produce huge numbers of charts and record data on everything from seismic activity to meteorological conditions. This could be very helpful for future prospecting and evaluation of drilling opportunities, but a lack of consistent recording and archiving standards means much of what has been learned in the past is "dark."
Another prime example is the insurance industry. Over the years, methods of archiving contracts and claims data have constantly evolved, even down to the language and terminology used. Full access to the breadth of knowledge established over decades of business would be hugely valuable for providing coverage and handling claims more efficiently.
Aschenbach says, "A lot of large companies today have document-centric work processes and there's a lot of data sitting in those documents – and it can be illuminated to create value … you can use artificial intelligence tools that are available today, and you can access that information."
Iron Mountain is already putting these principles to work, he told me. "In the past, an individual in a car accident would go to an auto body shop, they would get an estimate, it would be sent off to an insurance claim processor, who might come and look at the accident themselves – nowadays using both historical data and current tech, just a photo can be snapped, that photo can be uploaded to the cloud, you can access historic information of similar accidents, and the cost, to quickly come up with an estimate, within seconds."
A great deal of this dark historical information comes in the form of pictures, videos, and even voice recordings. In the past, this would have been very hard to analyze because the data is unstructured – without manually reviewing, say, every photograph in an insurer’s database of accident pictures, there would have been no way to include that information in any computer analysis.
But now, AI techniques like machine learning, computer vision, and natural language processing (NLP) mean that this type of data can be analyzed and broken down into its constituent parts. Accident images can be grouped by geographical area, make and model of car, time of day, or severity of impact, with no human intervention. Likewise, a database consisting of years' worth of customer service phone call recordings can be analyzed, and each call classified according to whether the caller is happy or angry, the subject they want to discuss, and whether or not there was a successful resolution.
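To make the grouping step concrete, here is a minimal Python sketch. It assumes a computer vision model has already turned each accident photo into a few structured attributes. The record fields, attribute values, and the group_by helper are all hypothetical and chosen for illustration; this is not a description of how InSight or any particular product works.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AccidentRecord:
    """Attributes assumed to have been extracted from one photo (hypothetical fields)."""
    image_id: str
    region: str
    make_model: str
    severity: str


def group_by(records, attribute):
    """Group image IDs by the value of one extracted attribute."""
    groups = defaultdict(list)
    for record in records:
        groups[getattr(record, attribute)].append(record.image_id)
    return dict(groups)


# Illustrative data only; a real pipeline would fill these in from model output.
records = [
    AccidentRecord("img-001", "north", "sedan-a", "minor"),
    AccidentRecord("img-002", "south", "truck-b", "severe"),
    AccidentRecord("img-003", "north", "sedan-a", "minor"),
]

by_severity = group_by(records, "severity")
by_region = group_by(records, "region")
```

Once unstructured images are reduced to records like these, the same helper can group them by any extracted attribute, and ordinary analysis tools can take over from there.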
As might be expected, these types of services are being eagerly adopted in banking and finance too. When assessing portfolios of loans for sale or purchase, the mortgage industry leans heavily on physical documents, many of which were created decades ago. Here, AI methods of data extraction from hard-copy documents have reduced a process that formerly took months to weeks, or sometimes even days, Aschenbach says.
Of course, it's not always plain sailing, and there are two specific challenges that need to be addressed by any company looking to capitalize on its hidden data assets.
One is sprawl – the sheer volume and variety of data and the many locations and systems that have been used to gather and store it. This is where AI solutions like those discussed above come into their own, but a strategic consideration of what data is really required, and what we intend to do with it, is still an essential step of the process.
The other comes from the technology itself and the fact that so much of the information is held in legacy systems where it can't be accessed by those solutions. Overcoming this involves "modernizing the environment," says Aschenbach:
“Most content sits in legacy systems … there’s not a lot of functionality to bring the value that modernized cloud environments bring … that’s the current challenge for many organizations as they look to plan next steps in creating more value and opportunity, placing cloud adoption and platform modernization as a priority.”