Deep learning is increasingly finding practical applications for geospatial data, particularly satellite imagery. At HxGN Live in Las Vegas, Wim Bozelie, Technology Director at Imagem, described several practical cases in which machine learning, specifically convolutional neural networks (CNNs), was successfully applied to real-world problems, including detecting blocked canals and differentiating types of roofs.
An early, widely publicized application of deep learning to photo imagery was by Geoffrey Hinton at Google, who used it to distinguish dogs and other objects in photos that people uploaded. The basic process for creating these models is to find or create training data and test data, and then tune the model's parameters to optimize its performance on the test data. The long-running ImageNet Large Scale Visual Recognition Challenge (ILSVRC) provides training and test data for evaluating object detection and image classification algorithms at large scale. A motivation for this public challenge is to allow researchers to compare progress in detection across a wider variety of objects by providing publicly available, hand-labeled data that would otherwise be expensive to produce. The validation and test data consist of 150,000 photographs, collected from Flickr and other imagery sources, which have been hand labeled for the presence or absence of 1,000 object categories. The object categories are very detailed for the types of objects found on Facebook, Flickr, and other social media. Some examples:
dial telephone, cell phone, airliner, warplane, airship, balloon, space shuttle, aircraft carrier, Tibetan mastiff, Doberman pinscher, warthog, Arabian camel, dromedary, golden retriever, border collie, boxer, English springer, malamute, whippet, Weimaraner, husky, Brittany spaniel, lion, beaver, Irish setter...
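To make the find-data/test/tune loop concrete, here is a minimal sketch using scikit-learn. The feature and label files and the random forest model are placeholders for illustration only; the same loop applies whatever the model family, including neural networks:

```python
# A minimal sketch of the train / test / tune workflow described above.
# "features.npy" and "labels.npy" are hypothetical placeholder files.
import numpy as np
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.ensemble import RandomForestClassifier

X = np.load("features.npy")   # one row of features per image (placeholder data)
y = np.load("labels.npy")     # one class label per image

# Hold out a test set so the tuned model is evaluated on data it never saw.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# "Tune the parameters": grid-search a few candidate settings, scoring each
# by cross-validation on the training data only.
search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid={"n_estimators": [100, 300], "max_depth": [None, 10, 20]},
    cv=5,
)
search.fit(X_train, y_train)

print("best parameters:", search.best_params_)
print("held-out accuracy:", search.score(X_test, y_test))
```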
This training data is of little use for geospatial applications, where satellite imagery is typically used for land use and agricultural identification. Wim talked specifically about a type of deep learning called the convolutional neural network (CNN). For example, remotely sensed satellite imagery can be used to create a model that differentiates corn from potatoes, using factors computed from the imagery such as the normalized difference vegetation index (NDVI) (min, max, mean), texture (min, max, mean), vegetation height, and geometric factors such as orientation (a minimal NDVI calculation is sketched after this paragraph). I blogged about how open source code and publicly available training data have been applied to track deforestation and reforestation in Mato Grosso, a state in the Amazon region of central Brazil. Wim reported on a study in the Netherlands that used a CNN to identify blocked waterways from overflight imagery. An approach that did not use machine learning was tried first, but it produced many false positives, missed some real instances of blocked waterways, and achieved a success rate of only about 50%. When machine learning was applied with reliable training data, the success rate rose to 97%.
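Here is a minimal sketch of the NDVI statistics mentioned above. The band arrays are synthetic; in practice they would be read from the satellite scene (for example with rasterio) and the statistics computed per field:

```python
# A minimal sketch of computing (min, max, mean) NDVI for a field patch,
# assuming numpy arrays for the red and near-infrared bands. Band handling
# and masking are deliberately simplified.
import numpy as np

def ndvi_stats(red: np.ndarray, nir: np.ndarray):
    """Return (min, max, mean) NDVI, matching the feature list above."""
    red = red.astype(np.float64)
    nir = nir.astype(np.float64)
    # NDVI = (NIR - Red) / (NIR + Red); guard against division by zero.
    denom = nir + red
    ndvi = np.where(denom == 0, 0.0,
                    (nir - red) / np.where(denom == 0, 1.0, denom))
    return ndvi.min(), ndvi.max(), ndvi.mean()

# Example with a synthetic 3x3 "field" patch:
red = np.array([[50, 60, 55], [52, 58, 61], [49, 57, 60]])
nir = np.array([[200, 210, 190], [205, 198, 220], [195, 207, 215]])
print(ndvi_stats(red, nir))  # healthy vegetation gives NDVI close to 1
```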
An important application of satellite imagery is identifying built structures. I have blogged about a competition to use satellite imagery to identify buildings and transportation networks. The winner applied U-Net, a deep neural network model originally developed for medical image segmentation; a simplified sketch of the architecture follows this paragraph. Reliable training data was provided by the competition organizers. The winning solution used OpenStreetMap layers and high-resolution WorldView multispectral layers as inputs to the deep neural network. Using the same data, NVIDIA demonstrated the ability to automate detection of many road networks using deep learning algorithms and multispectral high-resolution imagery. Wim described another application of CNNs to imagery in the Netherlands, differentiating residential roofs with dormer windows from those without, and discussed how different aspects of the model, such as the convolution layers, can be manipulated to improve results.
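For readers unfamiliar with U-Net, this toy PyTorch sketch shows the core idea: a convolutional encoder, a decoder, and skip connections that carry fine spatial detail across. It is my own illustration, far shallower than the competition-winning network:

```python
# A heavily simplified U-Net-style segmentation model, for illustration only.
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )

class TinyUNet(nn.Module):
    def __init__(self, in_channels=8, n_classes=2):  # e.g. 8 multispectral bands
        super().__init__()
        self.enc1 = conv_block(in_channels, 32)
        self.enc2 = conv_block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(64, 128)
        self.up2 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec2 = conv_block(128, 64)   # 128 = 64 upsampled + 64 skip
        self.up1 = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = conv_block(64, 32)    # 64 = 32 upsampled + 32 skip
        self.head = nn.Conv2d(32, n_classes, 1)  # per-pixel class scores

    def forward(self, x):
        e1 = self.enc1(x)                   # full resolution
        e2 = self.enc2(self.pool(e1))       # 1/2 resolution
        b = self.bottleneck(self.pool(e2))  # 1/4 resolution
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))   # skip connection
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))  # skip connection
        return self.head(d1)                # (batch, n_classes, H, W)

# A batch of 8-band 256x256 tiles -> per-pixel building/background scores.
model = TinyUNet()
out = model(torch.randn(1, 8, 256, 256))
print(out.shape)  # torch.Size([1, 2, 256, 256])
```

The skip connections are what distinguish U-Net from a plain encoder-decoder: they let the decoder recover sharp building footprints and road edges that the downsampling path would otherwise blur away.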
The essential key to effective application of deep learning is good training and test data: ground-truthed data that involves, for example, someone on the ground identifying whether the fields seen in the imagery are corn, potatoes, or something else. There are publicly available training datasets that can be used to train CNN models on satellite imagery. For example, I blogged about the System for Terrestrial Ecosystem Parameterization (STEP) dataset, which has 2,000 manually labeled sites covering 17 different land cover types scattered across all continents. Wim described EuroSAT, a large publicly available training dataset for Sentinel-2 imagery. It contains 30,000 polygons of land use training data for ten classes: annual crop, forest, herbaceous vegetation, highway, industrial, pasture, permanent crop, residential, river, and sea or lake. In an experiment, Wim took 500 training polygons from this dataset for three types of land use (built-up, pastures and vegetation, and water) to build a CNN model, which he applied to 25 test areas. The model classified 23 of the 25 correctly. A classifier along these lines is sketched below.
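As an illustration of what a small land-use classifier might look like, here is a hedged PyTorch sketch for EuroSAT-style 64x64 RGB patches. Recent torchvision releases ship a EuroSAT loader (any folder-per-class dataset via ImageFolder works the same way); the network and training settings are my own simplifications, not the model from the talk:

```python
# A minimal CNN land-use classifier trained on EuroSAT patches, for
# illustration only.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

transform = transforms.Compose([transforms.ToTensor()])
dataset = datasets.EuroSAT(root="data", transform=transform, download=True)
loader = DataLoader(dataset, batch_size=64, shuffle=True)

model = nn.Sequential(                 # two conv blocks, then a linear head
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 64 -> 32
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 32 -> 16
    nn.Flatten(),
    nn.Linear(64 * 16 * 16, 10),       # ten land-use classes
)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(3):                 # a few epochs suffice to see learning
    for images, labels in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch}: loss {loss.item():.3f}")
```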
AI has often suffered from inflated expectations, complex code, heavy processing requirements, and a shortage of practical applications. A growing body of work in the geospatial domain shows that, thanks to the immense processing power available in the cloud, machine learning can now generate practical and useful results fairly easily. As Chris Holmes pointed out in his talk at FOSS4G NA, the challenge now is to develop reliable, publicly available training and test datasets so that deep learning models can be created for a broader range of applications using the huge volumes of geospatial data now available.