
By Jackson Cothren and Bruce Gorham, University of Arkansas Center for Advanced Spatial Technologies (www.cast.uark.edu), Fayetteville, Ark.


Traditional image-content analysis methods in machine vision and photogrammetry used gray-scale and shape characteristics to extract roads, buildings and similar features, while remote sensing focused on spectral signatures to classify pixels in smaller-scale images. Now feature-extraction techniques, implemented in software such as Definiens Imaging’s eCognition package (www.definiens-imaging.com), combine these separate approaches into a more robust feature-extraction capability.
 


In particular, eCognition merges pixels into homogeneous regions, or objects, within an image based on “color” as well as size and shape. The resulting segments have dozens of “signatures” that can be used to extract buildings, roads, agricultural fields, tree stands and other features. To learn more about eCognition’s object-oriented analysis capabilities, researchers at the University of Arkansas Center for Advanced Spatial Technologies (CAST) recently used the software to extract impervious surfaces from DigitalGlobe QuickBird pan-sharpened satellite imagery of Fayetteville, Ark.


Beyond Pixels
The computer vision research community has known for years that meaningful information can’t be represented by individual image pixels. In fact, as early as the mid-1970s computer vision researchers were developing methods to intelligently group pixels into objects that had meaning in the real world (Rosenfeld and Kak, 1982). Image segmentation, as this grouping approach came to be known, is the first step in a process that automatically extracts and identifies features in an imaged scene. The process has been applied in many machine vision applications, ranging from inspecting and measuring parts on a conveyor belt to recognizing text and, more recently, human faces. The digital photogrammetry research community has been actively involved in applying image segmentation as a first step in automatic feature matching across high-resolution stereo images (Schenk, 2001). With products like eCognition, such technology also can be applied to remote sensing classification problems.

 

 
 

Remote sensing has concentrated on classifying individual pixels based on their reflectivity in several spectral bands represented by a like number of pixel values. Statistical and heuristic methods are used and at least partially implemented in virtually all image processing software. However, with high-resolution imagery from today’s digital airborne and satellite-based systems, these traditional techniques are limited because of the relatively small spectral range of the sensors.


This is where eCognition’s segmentation approach excels. Instead of working with pixels as the most basic element of the image, eCognition first segments the image into spectrally homogeneous objects. The user has some control over object size and shape through various weighting parameters, which make the resulting objects more compact and smooth at the expense of spectral homogeneity. The resulting objects become the most basic element of the image, and each has its own “signature.” A partial list would include the mean value and standard deviation of its constituent pixels, size, perimeter, primary orientation, compactness and texture (the degree to which a pattern is present in each band). All of these measures are available to determine what feature the image object represents.
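As a rough illustration of what such a signature might contain, the sketch below computes a few of the listed measures for each object in a tiny hand-made segmentation. It is not eCognition code: the function name, the sample grids and the compactness formula (perimeter divided by the perimeter of a square of equal area) are assumptions made for the example, and eCognition’s own definitions may differ.

```python
from statistics import mean, pstdev


def object_signatures(labels, band):
    """Compute a small per-object signature from a label grid and one band.

    labels: 2-D list of integer object ids (a segmentation result).
    band:   2-D list of pixel values with the same shape.
    """
    pixels = {}      # object id -> list of pixel values
    perimeter = {}   # object id -> number of exposed pixel edges
    rows, cols = len(labels), len(labels[0])
    for r in range(rows):
        for c in range(cols):
            obj = labels[r][c]
            pixels.setdefault(obj, []).append(band[r][c])
            # An edge counts toward the perimeter when the 4-neighbor
            # lies outside the image or belongs to a different object.
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                outside = not (0 <= nr < rows and 0 <= nc < cols)
                if outside or labels[nr][nc] != obj:
                    perimeter[obj] = perimeter.get(obj, 0) + 1
    signatures = {}
    for obj, values in pixels.items():
        area = len(values)
        signatures[obj] = {
            "mean": mean(values),
            "std": pstdev(values),
            "area": area,
            "perimeter": perimeter[obj],
            # Assumed definition: 1.0 for a square, larger for
            # elongated or ragged shapes.
            "compactness": perimeter[obj] / (4 * area ** 0.5),
        }
    return signatures


# Hypothetical 3 x 4 scene: two square objects and one strip.
labels = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 3, 3],
]
band = [
    [10, 12, 50, 52],
    [11, 13, 51, 53],
    [30, 30, 30, 30],
]
sigs = object_signatures(labels, band)
```

In this toy scene the strip (object 3) has the same area as the two squares but a longer perimeter, so its compactness value is higher. This is the kind of shape difference that a per-pixel spectral signature cannot express.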


For example, Figure 1 shows a segmented multispectral QuickBird image of a residential area. Notice the single object highlighted in red. It corresponds to what an interpreter would consider a feature (a cul-de-sac), and some components of its signature are shown on the right. Based on these values, it would be fairly easy to automatically identify most of the other “cul-de-sac objects” in the scene.


In fact, these objects and their signatures may be used in a more-or-less conventional supervised or unsupervised classification. The extended signature enhances the power of both. But there’s more to it than an extended signature. Traditional classifications take into account only one feature (whether it’s a pixel or an image object) at a time. There’s also information contained in an object’s relationship to nearby objects. For example, one might reason that a spectrally dark object to the northwest of a bright, rectangular object identified as a building might be classified as shadow. The eCognition software provides an interface for defining these kinds of rules to aid in object classification—and, equivalently, feature extraction.
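The shadow example above can be sketched as a simple neighborhood rule. This is only an illustration of the reasoning, not eCognition’s rule interface. The `ImageObject` class, the brightness cutoff and the distance limit are all hypothetical, and image rows are assumed to increase toward the south and columns toward the east.

```python
from dataclasses import dataclass


@dataclass
class ImageObject:
    name: str
    row: float          # centroid row (increases toward the south)
    col: float          # centroid column (increases toward the east)
    brightness: float   # mean pixel value of the object
    label: str = "unclassified"


def label_shadows(objects, dark_below=40.0, max_offset=15.0):
    """Label dark objects lying just northwest of a building as shadow.

    dark_below and max_offset are made-up thresholds for the example.
    """
    buildings = [o for o in objects if o.label == "building"]
    for obj in objects:
        if obj.label != "unclassified" or obj.brightness >= dark_below:
            continue
        for b in buildings:
            d_row = b.row - obj.row   # positive: object is north of building
            d_col = b.col - obj.col   # positive: object is west of building
            if 0 < d_row <= max_offset and 0 < d_col <= max_offset:
                obj.label = "shadow"
                break
    return objects


# Hypothetical scene: one building and three unclassified neighbors.
scene = [
    ImageObject("roof", row=50, col=50, brightness=180, label="building"),
    ImageObject("dark_nw", row=42, col=41, brightness=25),
    ImageObject("dark_se", row=58, col=60, brightness=25),
    ImageObject("bright_nw", row=44, col=43, brightness=150),
]
label_shadows(scene)
result = {o.name: o.label for o in scene}
```

Only the dark object northwest of the roof is relabeled. The equally dark object to the southeast and the bright object to the northwest are left unclassified, which is the point of the rule: the decision depends on a neighbor’s class and position, not on the object’s own signature alone.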

 

 
 

Impervious Surface Extraction
CAST researchers have had several opportunities to apply this object-oriented approach to feature extraction. One opportunity came in the spring and summer of 2003, when CAST staff worked with the city of Fayetteville to generate an ortho-image update of the city’s fast-growing utility service area. Six QuickBird Basic scenes were ortho-rectified and pan-sharpened to produce 1:4,800-scale, 2-foot ground sample distance (GSD) images that could be displayed as either 11-bit true-color or color-infrared composites. In addition, the city’s engineers were interested in extracting impervious surfaces from the ortho-images. Because of limited funding and the large area involved, the researchers ruled out manual extraction and chose to investigate eCognition’s capabilities.


The researchers determined it’s possible to distinguish impervious and permeable surfaces in high-resolution imagery using pixel-based classification methods with about an 80 percent success rate. In fact, the supervised classification of the QuickBird ortho-images using all four bands yielded a 78 percent accuracy rate based on a large number of independent ground truth points. Most of the failures were caused by confusion between asphalt and shadow and between bare earth and concrete.
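For readers unfamiliar with how such an accuracy figure is obtained, the sketch below compares predicted classes with reference classes at a handful of check points and tallies the confusion between them. The five sample points are invented for illustration and have nothing to do with the Fayetteville ground truth data.

```python
from collections import Counter


def accuracy_report(reference, predicted):
    """Return overall accuracy and a (reference, predicted) confusion tally."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    confusion = Counter(zip(reference, predicted))
    correct = sum(n for (ref, pred), n in confusion.items() if ref == pred)
    return correct / len(reference), confusion


# Invented check points: class observed on the ground vs. class assigned.
reference = ["impervious", "impervious", "permeable", "permeable", "impervious"]
predicted = ["impervious", "permeable", "permeable", "permeable", "impervious"]

overall, confusion = accuracy_report(reference, predicted)
```

The off-diagonal entries of the tally, such as `("impervious", "permeable")`, show which classes are being confused with each other, which is how failure patterns like those described above would surface.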


To address these concerns, the researchers segmented the image—a portion of the result is shown in Figures 1 and 2—and from the resulting objects developed a training set classified as impervious or permeable. A supervised classification of the objects based only on this training set probably would result in only slightly better accuracy. So, in addition to the class membership rules generated by the training set, the researchers identified two additional rules to help address the difficulty of separating shadow from asphalt and classify the confused objects.

 

 
 
Obviously, most shadowed portions of the scene can’t be reliably identified as either impervious or permeable (although it might be possible to suggest a class based on a surrounding object) and should therefore be classified as shadow. To help recognize shadows, the researchers developed one rule, or “membership function,” in eCognition stating that an object is more likely to be asphalt paving or shingles if it’s near other impervious objects. This reflects the fact that asphalt is more likely to be present in built-up areas.


Another rule considers the size of the object relative to its permeable neighbors—the larger its relative size, the less likely it is to be asphalt. This reflects the likelihood that areas in shadow caused by tree stands are much smaller than the area of the stand. In Figure 3, notice that the “membership function” is a curve. This illustrates another eCognition tool: fuzzy classification. An object may be assigned some membership probability in several classes using different models.


As shown in Figure 4, this fuzzy classification also was used to help distinguish permeable bare soil from impervious surface based on the amount of an object’s infrared reflectance. As the pixel value in the near-infrared band increased, so did the probability of an object being bare soil.

 

 
These two rules increased the classification accuracy from 79 percent to nearly 90 percent, primarily due to the ability to more accurately classify shadow objects. Bare soil confusion was still present, but the fuzzy classification at least quantified the confusion and allowed the researchers to vary the threshold to identify problem areas.
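A fuzzy membership function of this kind can be sketched in a few lines. The version below assumes a simple linear ramp on an object’s mean near-infrared value. The actual curve shapes in Figures 3 and 4 are not reproduced here, and the `low`, `high` and `threshold` values are placeholders rather than the ones the researchers used.

```python
def ramp_membership(value, low, high):
    """Fuzzy membership rising linearly from 0 at `low` to 1 at `high`."""
    if value <= low:
        return 0.0
    if value >= high:
        return 1.0
    return (value - low) / (high - low)


def classify_bare_soil(nir_mean, low=300.0, high=700.0, threshold=0.5):
    """Return (membership, label) for an object's mean near-infrared value.

    low, high and threshold are hypothetical values for this example.
    """
    membership = ramp_membership(nir_mean, low, high)
    label = "bare soil" if membership >= threshold else "impervious"
    return membership, label


strong = classify_bare_soil(600.0)                   # well up the ramp
weak = classify_bare_soil(400.0)                     # low membership
relaxed = classify_bare_soil(400.0, threshold=0.2)   # same object, lower cutoff
```

Because each object keeps its membership value rather than only a hard label, lowering or raising `threshold` changes which borderline objects are called bare soil. Objects whose label flips under small threshold changes are natural candidates for the “problem areas” mentioned above.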


These aren’t the only possible rules, nor are they the best set of rules. However, they do illustrate the power eCognition users have to identify relationships between classes and use them to their advantage.

 
 
   
 
 

Future Directions
CAST researchers are still working with eCognition to better understand its capabilities and applications. One obvious direction is to incorporate increasingly available Light Detection and Ranging (LiDAR) elevation data into the segmentation process. Researchers have shown how elevation data may be integrated into the segmentation process to more effectively extract buildings in aerial or high-resolution satellite images.


Another related application is to incorporate terrestrial LiDAR point-cloud information, along with color digital images, into the segmentation process. Figure 5 shows a portion of a digital image of a building that has been co-registered with a 3-D point cloud collected with an Optech ILRIS-3D laser scanner and segmented with eCognition. Notice how individual bricks and concrete features are easily extracted and identified. This kind of data analysis offers capabilities that neither photogrammetry nor point-cloud analysis alone could provide.

 
   
References
Rosenfeld, A. and Kak, A. C. 1982. Digital Picture Processing, Volume 2. Academic Press Inc.

Schenk, T. 2001. Digital Photogrammetry, Volume 1. TerraScience.

 

 

 

Copyright ©2003-2007 Earthwide Communications LLC - Powered by eNetwork Marketing