Automatic feature extraction or image classification using artificial intelligence (AI) algorithms has been successfully applied in several domains, for example, recognizing landmarks, dogs, cars and other objects in consumer photos and detecting building footprints in high-resolution satellite imagery. The algorithms are often available in open source implementations and have been democratized to the point where anyone with a modicum of programming skill can use them. The challenge is creating and maintaining (curating) the training datasets required to turn the basic algorithms into models that can automatically recognize and classify different types of objects. At the December Open Geospatial Consortium (OGC) Energy Summit at EPRI in Charlotte, Andrew Phillips of EPRI described EPRI's role in the new world of big data. That role includes curating open training datasets for the electric power industry, which the AI industry (universities, startups and established software players) can use to create applications for electric power utilities.
Deep learning is increasingly finding practical applications using different types of georeferenced imagery. For example, convolutional neural networks (CNNs) have been successfully applied to real-world problems including detecting blocked canals, distinguishing between types of roofs, identifying building footprints and recognizing transportation networks.
The original application of deep learning to photo imagery was by Geoffrey Hinton at Google, who used it to distinguish dogs and other objects in photos that people uploaded. The basic process in creating these models is to find or create some training data, set aside some data to test the model, and then adjust the parameters to optimize the model's performance on the test data. Google is pretty good at recognizing dogs, cars and other objects that are commonly found in consumer photos. But if you present Google with photos of utility equipment, such as a decayed wooden pole with an insulator hanging precariously off it or a cracked fuse, Google will not recognize these objects correctly. The problem is not Google's algorithm but a lack of training data: photos whose content has been "labeled" by electric power experts.
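The train, test and tune loop described above can be sketched in a few lines. This is a minimal illustration only, not how Google or any vendor builds models: the "model" is a single threshold on a made-up damage score, and the data, the function names (`split`, `accuracy`, `tune_threshold`) and the candidate values are all hypothetical. The sketch tunes on the training portion and keeps the test portion for the final check, which is one common way to arrange the process.

```python
import random

def split(samples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the labeled samples and return (train, test)."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(threshold, samples):
    """Fraction of samples where 'score >= threshold' agrees with the label."""
    correct = sum((score >= threshold) == label for score, label in samples)
    return correct / len(samples)

def tune_threshold(train, candidates):
    """Return the candidate threshold with the highest training accuracy."""
    return max(candidates, key=lambda t: accuracy(t, train))

# Hypothetical labeled data: (damage score, expert label "is damaged").
samples = [(i / 100, i >= 60) for i in range(100)]

train, test = split(samples)
best = tune_threshold(train, [c / 10 for c in range(11)])
print("chosen threshold:", best)
print("accuracy on held-out test data:", accuracy(best, test))
```

A real deep learning model has millions of parameters rather than one, but the division of labor is the same: labeled examples drive the fitting, and examples the model has never seen give an honest measure of how well it works.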
An essential piece that is required to open up the data and technology of deep learning to a broader audience is open labeled data, what has been called an "open labeled data commons". There are analogies with the OpenStreetMap initiative. Labeled data needs something similar: a repository of open data, tools to capture and manage the data, and a community to capture and curate it. For example, for automatically classifying land use in satellite imagery, the System for Terrestrial Ecosystem Parameterization (STEP) has 2000 manually labeled sites covering 17 different land cover types that can be used as training data for deep learning applications to land use.
EPRI has already been active in collecting large volumes of training data from members and curating training datasets that can be used for common industry use cases. For example, EPRI has pulled together dissolved gas data for 45,000 transformers from 27 utilities, including a few thousand failures. The data has been provided to AI vendors to train their models. Some of the data has been held back to test the models and see how well the vendors have done.
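When failures are a small share of the records, as in the transformer dataset above, a purely random holdout can end up with too few failures to test on. One common remedy is a stratified split, which samples each class separately. The sketch below is an assumption about how such a holdout could be done, not a description of EPRI's procedure. The records, the 20% fraction and the function name `stratified_holdout` are hypothetical.

```python
import random
from collections import defaultdict

def stratified_holdout(records, label_of, holdout_fraction=0.2, seed=0):
    """Split records into (released, held_back), sampling each label group separately."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for rec in records:
        groups[label_of(rec)].append(rec)

    released, held_back = [], []
    for label in sorted(groups):
        group = groups[label]
        rng.shuffle(group)
        n_hold = round(len(group) * holdout_fraction)
        held_back.extend(group[:n_hold])   # kept by the curator for testing
        released.extend(group[n_hold:])    # given to vendors for training
    return released, held_back

# Hypothetical records: (transformer id, failed?), with 10% failures.
records = [(i, i % 10 == 0) for i in range(1000)]

released, held_back = stratified_holdout(records, label_of=lambda r: r[1])
print(len(released), "records released,", len(held_back), "held back")
print("failures held back:", sum(failed for _, failed in held_back))
```

Because each class is split on its own, the held-back set keeps the same proportion of failures as the full dataset.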
In another experiment, EPRI asked 12 experts to review inspection imagery to identify and rank problems with common electric power equipment. Thousands of labeled images were provided to nine vendors, including IBM, Intel, Harris and others, to train their deep learning models. EPRI held back 500 of these images for testing and has just completed running them through an evaluation process to see how well the models do in identifying problems. For example, it was found that the best models could identify cracked and damaged insulators with 88% confidence (using the Matthews correlation coefficient), but cavities in wood poles such as woodpecker holes could only be identified at the 50% confidence level. The results have revealed that more images are needed under different lighting conditions. EPRI is going back to its members to get more images. Another area of interest to EPRI is monitoring transmission lines, including vegetation encroachment, using imagery from beyond visual line of sight (BVLOS) UAV flights.
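For readers unfamiliar with the Matthews correlation coefficient (MCC), it summarizes a binary classifier's confusion matrix in a single number between -1 and +1, where +1 is perfect agreement, 0 is no better than chance, and -1 is total disagreement. It stays informative when one class is rare, which is likely the case for damaged equipment. The 88% figure above presumably corresponds to a coefficient near 0.88, though the source does not spell this out. The sketch below computes the standard formula; the counts are invented for illustration and are not EPRI's results.

```python
import math

def matthews_corrcoef(tp, tn, fp, fn):
    """MCC of a binary classifier from its confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Conventional value when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom

# Hypothetical outcome on 500 held-back images, 100 of which show damage:
# 90 damaged found, 10 damaged missed, 10 false alarms, 390 correctly cleared.
score = matthews_corrcoef(tp=90, tn=390, fp=10, fn=10)
print(score)  # 0.875
```

Note that plain accuracy for the same hypothetical counts would be 96%, which shows why a measure that accounts for class imbalance gives a more sober picture.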
EPRI's strategy is to be the bridge to the AI community by making the AI community aware of use cases that are important for the electric power industry, helping the AI community with the physics of electric power and supporting structures, helping its members understand how AI can be used in the electric power industry, and collecting and curating open training datasets to be used by the AI community.