Just about every industry, from construction to utilities to roofing to insurance, is being impacted, and often transformed, by the availability of high-resolution Earth observation (EO) imagery from satellites, planes and helicopters, and UAVs. Initially 2D, then 3D, and in 2014 4D (2D/3D plus temporal), imagery is now available from many sources. Imagery from international satellites, low-cost satellite constellations, and UAVs is poised to dramatically reduce the cost of EO data.
An exponentially increasing volume of EO imagery is being captured every day from a variety of devices. The Committee on Earth Observation Satellites (CEOS) reports 286 Earth observation devices in orbit. The next DigitalGlobe satellite, WorldView-3, scheduled for some time in 2014, will be capable of 31 cm resolution and can store two terabits of data between downloads. Two start-up satellite companies have already begun putting satellite constellations in space that promise to provide high-frequency revisits to every point on the Earth's surface at low cost. Planet Labs and Skybox Imaging each plan to launch constellations of 24+ low-cost satellites that will capture high-resolution imagery of every spot on earth many times per day. They promise to deliver the first-ever HD video of any spot on earth, allowing users to track changes, from traffic jams to deforestation, in near real time.
A few years ago, I remember, the total amount of imagery downloaded daily from all EO satellites was estimated at about a terabyte. Now a single satellite can be responsible for a terabyte daily. Add advances in aerial photogrammetry and the rapid development of UAV-based photogrammetry, and we are looking at huge volumes of imagery every day, probably approaching a petabyte.
All of this data requires processing before it can be made available to customers and end users. Currently this processing is time consuming, and a significant proportion of it still involves manual and semi-automated steps. The result is a delay between the time the data is captured and when it is made available to customers that can range from days to weeks or even months. Customers are becoming impatient with the time it takes to process data and are increasingly pushing for near real-time availability.
At the PCI Geomatics Reseller Meeting in Ottawa, Wolfgang Lück, whose company Forest Sense cc customizes image processing workflows and provides remote sensing training, gave an overview of the sensor web and of how automating image processing is enabling near real-time imagery availability.
Wolfgang says that most of the sensor web, including different types of satellites, ground-based sensors, mobile devices, and the satellite monitoring systems called sentinels, is in place:
- EDRS Geostationary Relay and supercomputer
- Wide area systematic Optic/SAR monitoring
- VHR SAR
- Optical VHR/hyperspectral
- Field sensors
- Mobile devices
- DRS (Earth receiving station)
According to Wolfgang, the communications protocols and the standards (for example, the OGC Sensor Observation Service) are in place for these systems to talk to each other. Most of the satellite systems are in space, with just the sentinel systems still needing to go up. These are wide area systematic Optic/SAR monitoring systems being launched into space by ESA. On the ground there are receiving stations (DRS), field sensors, and mobile devices carried by people. The mobile devices capture as well as receive information.
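As an illustration of how these pieces talk to each other, here is a minimal sketch of querying an OGC Sensor Observation Service over its standard KVP (key-value pair) HTTP binding. The endpoint, offering, and observed property below are hypothetical; a real deployment publishes its own in a GetCapabilities response.

```python
import requests

SOS_ENDPOINT = "https://example.org/sos"    # hypothetical service URL

params = {
    "service": "SOS",
    "version": "2.0.0",
    "request": "GetObservation",
    "offering": "soil_moisture_network",    # hypothetical offering id
    "observedProperty": "SoilMoisture",     # hypothetical property
    "temporalFilter": "om:phenomenonTime,2014-05-01T00:00:00Z/2014-05-02T00:00:00Z",
}

response = requests.get(SOS_ENDPOINT, params=params, timeout=30)
response.raise_for_status()
print(response.text[:500])                  # O&M XML describing the observations
```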
Imagery requires complex processing to make it usable. With the huge volumes of data that the sensor web is capable of capturing, this processing has to be automated if the data is to be available in near real-time at low cost. There are many actors that need to be accounted for, but the most important are the end-user customers (machines or human beings), because customers determine the products that are to be delivered.
Typical image processing workflow
There are different workflows depending on the sensors that are used and the final product requested by the customer. To give a feel for just how complex this processing is and the challenges involved in automating these workflows, Wolfgang went through a typical workflow, beginning with raw satellite imagery and following through to an orthorectified image suitable for classification, for example to identify different types of vegetation. Even for this fairly basic image processing workflow, there are many steps:
- Ingestion of data from pick-up point
- Binary data to supported format conversion
- Relative radiometric correction and artifact removal
- Band alignment
- DN to ToA reflectance conversion
- Haze, cloud and water classification
- Auto-ground control point collection
- Orthorectification
- DSM generation
- Topographic correction
- Spectral preclassification
- Level 4 product generation
- Image compositing / mosaicking
- Delivery of data to pick-up point
In the past many of these steps were manual or semi-automated. The important breakthrough that Wolfgang was able to demonstrate is that he has automated the entire process, which means that end-user products, whether orthorectified images, digital surface models, or classified vegetation maps, are available much faster and at lower cost.
From Wolfgang's perspective, the key to enabling this is an image processing platform that provides a modular architecture, robust scripting, "big data" management, support for a broad range of sensor models (cameras), support for Linux as well as Windows, comprehensive image processing functionality, and the ability to incorporate custom leading-edge algorithms, all integrated with an accessible user interface and API. In addition, because this level of automation pushes the envelope, responsive support from the vendor is also essential.
In the example, Wolfgang used a scripting language, Python or EASI, to implement a workflow composed primarily of PCI Geomatica modules, along with custom components implementing leading-edge algorithms from current research.
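To make the shape of such a pipeline concrete, here is a minimal, runnable Python sketch of chaining and logging the steps for unattended operation. The step stubs and the `process_scene` function are placeholders of my own, not the PCI Geomatica API or Wolfgang's actual scripts.

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
log = logging.getLogger("eo-pipeline")

def make_step(name):
    """Return a stub step; a real pipeline would call a Geomatica module
    or a custom algorithm here instead of just recording the step."""
    def step(scene):
        log.info("running %-45s on %s", name, scene["id"])
        scene["history"].append(name)
        return scene
    return step

PIPELINE = [make_step(n) for n in (
    "binary-to-supported-format conversion",
    "radiometric correction / artifact removal",
    "band alignment",
    "DN to ToA reflectance conversion",
    "haze, cloud and water classification",
    "auto GCP collection (land only)",
    "orthorectification",
    "DSM generation / topographic correction",
    "spectral preclassification",
    "level 4 product generation / mosaicking",
)]

def process_scene(scene_id):
    scene = {"id": scene_id, "history": []}
    for step in PIPELINE:
        scene = step(scene)   # any exception halts the scene: fail fast and log
    return scene

if __name__ == "__main__":
    process_scene("scene_2014-05-01_ottawa")
```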
The first step is format conversion. The data often comes in some proprietary format which the processing engine may not support. A converter is required to bring it into a format that the processing engine supports.
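The article doesn't name the converter used; as one common illustration, GDAL reads hundreds of raster formats and can rewrite them into something the rest of a pipeline understands. File names below are illustrative.

```python
from osgeo import gdal

gdal.UseExceptions()  # raise Python exceptions instead of silently returning None

# Rewrite whatever GDAL can read as a tiled, compressed GeoTIFF.
gdal.Translate(
    "scene.tif",        # output, illustrative name
    "raw_scene.dat",    # proprietary input, illustrative name
    creationOptions=["TILED=YES", "COMPRESS=DEFLATE"],
)
```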
The next step is radiometric correction and artifact removal. Forest Sense supports a wide range of systems put together by emerging space nations, such as African or Middle Eastern countries. In the example, you can see two lines that are artifacts. New gains and biases have to be calculated for those lines and applied to the image. Other examples are random noise detection and dropped line removal.
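As a minimal sketch of the gain-and-bias repair just described, one can rescale a striped line so its statistics match its neighbours; production destriping is more careful, but the idea is the same.

```python
import numpy as np

def repair_line(image, row):
    """Rescale one striped row so its mean and standard deviation match
    those of the rows immediately above and below it."""
    bad = image[row].astype(np.float64)
    ref = np.vstack([image[row - 1], image[row + 1]]).astype(np.float64)
    gain = ref.std() / max(bad.std(), 1e-6)   # new gain for the bad line
    bias = ref.mean() - gain * bad.mean()     # new bias
    image[row] = gain * bad + bias
    return image

img = np.random.randint(0, 255, (100, 100)).astype(np.float64)
img[50] *= 0.5                                # simulate a low-gain stripe
img = repair_line(img, 50)
```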
Another problem is systematic noise. A fast Fourier transform (FFT) reveals a characteristic pattern in the high-frequency domain that is the signature of systematic noise. This can be detected automatically: do an FFT, create a mask to filter out the offending frequencies, and then do an inverse FFT to reconstruct the corrected image.
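A sketch of that detect, mask, and invert sequence, using a simple magnitude-outlier test in place of whatever detector is used in practice:

```python
import numpy as np

def remove_periodic_noise(band, threshold=10.0):
    """Mask frequency components whose magnitude is far above the median,
    then invert the FFT to reconstruct a corrected image."""
    spectrum = np.fft.fftshift(np.fft.fft2(band))
    magnitude = np.abs(spectrum)
    mask = magnitude < threshold * np.median(magnitude)   # drop outlier spikes
    r, c = band.shape
    mask[r // 2 - 5 : r // 2 + 5, c // 2 - 5 : c // 2 + 5] = True  # keep low freqs
    return np.real(np.fft.ifft2(np.fft.ifftshift(spectrum * mask)))

rows = np.arange(256)[:, None]
noisy = np.random.rand(256, 256) * 50 + 10 * np.sin(rows / 2.0)  # periodic striping
clean = remove_periodic_noise(noisy)
```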
The next correction is the flat-field correction, which ensures that the sensors are calibrated correctly relative to one another.
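For illustration, a minimal flat-field correction, assuming a pushbroom-style geometry in which each image column corresponds to one detector with a known relative response (the response vector here is made up):

```python
import numpy as np

def flat_field(band, response):
    """Divide each column (one detector) by its relative response, then
    rescale so the overall brightness is preserved."""
    return band / response[np.newaxis, :] * response.mean()

band = np.random.rand(100, 8) * 100          # 8 detectors across-track
response = np.array([1.00, 0.97, 1.03, 1.01, 0.95, 1.02, 0.99, 1.03])
corrected = flat_field(band, response)
```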
The next step is band alignment. The different bands need to be aligned because they are acquired at slightly different angles by different sensors.
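The article doesn't say how the alignment offsets are found; one standard technique is phase correlation, sketched below for whole-pixel shifts (production systems solve for sub-pixel, spatially varying offsets):

```python
import numpy as np

def band_shift(ref, moving):
    """Estimate the integer (row, col) displacement of `moving` relative
    to `ref` from the peak of the normalized cross-power spectrum."""
    cross = np.fft.fft2(moving) * np.conj(np.fft.fft2(ref))
    cross /= np.abs(cross) + 1e-12                       # keep phase only
    corr = np.real(np.fft.ifft2(cross))
    peak = np.array(np.unravel_index(np.argmax(corr), corr.shape))
    shape = np.array(ref.shape)
    peak[peak > shape // 2] -= shape[peak > shape // 2]  # wrap to signed offsets
    return tuple(int(p) for p in peak)

red = np.random.rand(128, 128)
nir = np.roll(red, (3, -2), axis=(0, 1))                 # simulate misalignment
print(band_shift(red, nir))                              # -> (3, -2)
```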
The next step is conversion to top-of-atmosphere (ToA) reflectance. What the sensor actually measures is relative voltage, which is converted to relative radiance and then normalized to ToA reflectance. This allows images acquired at different times or from different sensors to be compared and used for quantitative analysis. The effect of the atmosphere is still there; full atmospheric correction is possible, but it is a bit of a black art, and applied incorrectly it can introduce artifacts. For this reason, many people in the field prefer to use ToA reflectance data because they can trust it.
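The conversion itself is standardized; here is a sketch using the usual formulas, where the gain, bias, Earth-Sun distance, and exo-atmospheric solar irradiance (ESUN) would come from the scene metadata (the numbers below are placeholders):

```python
import numpy as np

def dn_to_toa(dn, gain, bias, d_au, esun, sun_elev_deg):
    """DN -> at-sensor radiance -> top-of-atmosphere reflectance:
       L   = gain * DN + bias
       rho = pi * L * d^2 / (ESUN * cos(solar zenith))"""
    radiance = gain * dn + bias
    theta_s = np.deg2rad(90.0 - sun_elev_deg)    # zenith = 90 deg - elevation
    return np.pi * radiance * d_au**2 / (esun * np.cos(theta_s))

dn = np.random.randint(0, 1024, (100, 100))
rho = dn_to_toa(dn, gain=0.04, bias=1.2, d_au=1.01, esun=1550.0, sun_elev_deg=42.0)
```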
At this point the image can be classified.
The first step in classifying the image is haze, water and cloud classification. This requires both the blue and red bands. The two bands are very highly correlated, but the blue band is more strongly affected by haze and water vapour. What is called a clear-sky vector can be applied to the image to calculate the haze vector and remove the haze. The right side of the slide shows the haze that was removed from the image.
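The talk doesn't give the exact formulation, but one published technique along these lines is the haze optimized transform (HOT): fit the "clear sky line" through blue-versus-red reflectance over known clear pixels, then score each pixel by its distance from that line. A minimal sketch:

```python
import numpy as np

def hot_index(blue, red, clear_mask):
    """Signed distance of each pixel from the clear-sky line fitted in
    blue/red space; hazy pixels score high."""
    slope, intercept = np.polyfit(blue[clear_mask], red[clear_mask], 1)
    theta = np.arctan(slope)
    return blue * np.sin(theta) - (red - intercept) * np.cos(theta)

blue = np.random.rand(100, 100) * 0.2
red = 0.8 * blue + 0.02 + np.random.normal(0.0, 0.005, blue.shape)
hot = hot_index(blue, red, clear_mask=np.ones_like(blue, dtype=bool))
```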
The next step is collection of ground control points (GCPs) so that accurate geolocations can be calculated. Unlike many systems, which collect GCPs even over water or cloud and thereby introduce errors, Wolfgang collects ground control points only over land.
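A sketch of that restriction, assuming the haze/cloud/water mask from the earlier classification step is available as a boolean array (the function and names are illustrative):

```python
import numpy as np

def keep_land_gcps(candidates, mask, window=16):
    """Keep candidate GCPs (row, col) whose surrounding window contains no
    pixel flagged in `mask` (True = haze, cloud or water)."""
    kept = []
    for row, col in candidates:
        patch = mask[max(row - window, 0) : row + window,
                     max(col - window, 0) : col + window]
        if patch.size and not patch.any():
            kept.append((row, col))
    return kept

mask = np.zeros((200, 200), dtype=bool)
mask[:, :100] = True                                   # left half is water
print(keep_land_gcps([(50, 50), (50, 150)], mask))     # -> [(50, 150)]
```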
GCPs enable automated orthorectification as well as the automatic extraction of digital surface models (DSMs) from stereo pairs.
The next step is preclassification. Spectral preclassification requires topographic normalization using a digital terrain model (DTM). The image has to be preclassified into different surface types that have different BRDF (bidirectional reflectance distribution function) properties; light reflects in different directions at different intensities on different surfaces at different wavelengths. This has to be done using bands that are not affected by topography, which makes this step quite complicated.
The slide shows a Landsat image of hilly terrain with a lot of shade and exposed slopes, which needs to be normalized topographically. Traditionally people have used a range of vegetation indices, such as the normalized difference vegetation index (NDVI). Comparison with the DTM shows that the NDVI is affected by topography: although it is a ratio between bands, shady and sunny slopes are still visible. The reason is that there is a lot of atmospheric scattering in the blue band and at shorter wavelengths and much less at longer wavelengths, so the red band is still affected by scattering while the infrared band is not, and the ratio between them is not a pure ratio.
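For concreteness, NDVI is the band ratio being discussed here; a minimal implementation:

```python
import numpy as np

def ndvi(nir, red, eps=1e-6):
    """Normalized difference vegetation index, (NIR - red) / (NIR + red)."""
    return (nir - red) / (nir + red + eps)

nir, red = np.random.rand(50, 50), np.random.rand(50, 50) * 0.5
print(ndvi(nir, red).mean())
```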
The TVI index was developed to be resistant to topography. Of course, it doesn't work in areas of full shade, so a DSM is used to calculate which areas are in complete shade so that they can be masked out. At this point all of these techniques can be applied to generate an automatic spectral preclassification of the image, which can then be used for topographic normalization. The skylight-adaptive topographic normalization is based on techniques that Wolfgang developed using all of these preprocessing steps. In the corrected image it is still possible to see some shady areas, but most of the shadow has been removed.
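Wolfgang's skylight-adaptive normalization is his own technique, but the shade-masking step can be sketched from first principles: derive slope and aspect from the DSM, compute the local solar incidence angle, and mask pixels that face away from the sun. Cast shadows thrown by neighbouring terrain would need an additional ray-traced test.

```python
import numpy as np

def shade_mask(dsm, pixel_size, sun_az_deg, sun_elev_deg, min_cos=0.05):
    """True where the local solar incidence angle says the surface
    receives (almost) no direct sun."""
    gy, gx = np.gradient(dsm, pixel_size)      # elevation gradient per axis
    slope = np.arctan(np.hypot(gx, gy))
    aspect = np.arctan2(-gx, gy)               # aspect conventions vary by package
    zen = np.deg2rad(90.0 - sun_elev_deg)
    az = np.deg2rad(sun_az_deg)
    cos_i = (np.cos(zen) * np.cos(slope)
             + np.sin(zen) * np.sin(slope) * np.cos(az - aspect))
    return cos_i < min_cos

dsm = np.random.rand(100, 100) * 50.0          # toy elevation surface
shaded = shade_mask(dsm, pixel_size=30.0, sun_az_deg=135.0, sun_elev_deg=30.0)
```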
The next step is image compositing and mosaicking and the generation of Level 4 (geoinformation) products. These products might include vegetation indices, biophysical parameters, masks, pixel-based classifications or object-based classifications.
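The article doesn't name the mosaicking tool; as one illustration, GDAL's Warp utility can blend several orthorectified scenes into a single mosaic (file names and the target projection below are illustrative):

```python
from osgeo import gdal

gdal.UseExceptions()

# Blend two orthorectified scenes into one mosaic.
gdal.Warp(
    "mosaic.tif",
    ["scene_a_ortho.tif", "scene_b_ortho.tif"],
    dstSRS="EPSG:32633",     # target projection, illustrative
    resampleAlg="cubic",
)
```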
The final step is to package the data and deliver it to the customer, typically by transmitting the imagery to a pickup point where it is catalogued.
At this point the product is available for the customer to collect from the pickup point.
Wolfgang demonstrated that downloading raw data and transforming it into something usable by a farmer, construction contractor, or insurance adjustor requires many steps and a lot of processing. This example of a typical image processing workflow shows what the nearly one petabyte of raw imagery captured daily must go through to become something usable by end-user customers (people or machines). Getting this to customers in near real-time is incredibly computationally intensive and involves rapidly processing massive volumes of data. This is why algorithmic performance is critical, and why multiple processors with hyperthreading and, most recently, GPUs (graphics processing units) are being harnessed to improve throughput.
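As a sketch of the throughput side: scenes are independent of one another, so the simplest parallelization is one scene per core, as below; GPU offload of individual steps (the FFTs and resampling kernels, for example) follows the same pattern at a finer grain. The `run_workflow` stub is hypothetical.

```python
from concurrent.futures import ProcessPoolExecutor

def run_workflow(scene_id):
    """Stand-in for the full per-scene workflow described in this article."""
    return f"{scene_id}: done"

if __name__ == "__main__":
    scenes = [f"scene_{i:03d}" for i in range(32)]
    # One worker process per core by default; scenes fan out across them.
    with ProcessPoolExecutor() as pool:
        for result in pool.map(run_workflow, scenes):
            print(result)
```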