Example-Based Classification

Feature Extraction with Example Based Classification Tutorial

Example-based, or supervised, classification is the process of using training data to assign objects of unknown identity to one or more known features. In general, the more features and training samples you select, the better the results from supervised classification.

You must have an ENVI Feature Extraction license in order to use this tool.

From the Toolbox, select Feature Extraction > Example Based > Feature Extraction Workflow. The Data Selection panel appears.

Select Input Files for Feature Extraction

For best results with Feature Extraction, consider preprocessing your imagery to reduce noisy or redundant data, to correct for atmospheric effects, or to suppress vegetation. See Preprocess Imagery for available options.

For hyperspectral imagery, we strongly recommend that you run a principal components analysis or independent components analysis on the dataset before using it in Feature Extraction. Segmentation and merging work best on datasets with only a few bands.

Also consider reducing the spatial resolution of your input image to speed up processing and to remove small, unwanted features. For example, you can down-sample a 10,000 by 10,000 pixel image by a factor of 10 to yield a 1,000 by 1,000 pixel image.
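
The down-sampling arithmetic in the example above can be sketched outside ENVI with NumPy. This is an illustration only: the function name downsample_by_block_mean is hypothetical, and block averaging is just one possible resampling choice, not necessarily the method that ENVI's own resize tools apply.

```python
import numpy as np


def downsample_by_block_mean(image, factor):
    """Reduce a 2D image by averaging non-overlapping factor x factor blocks."""
    rows, cols = image.shape
    # Trim edge rows/columns that do not fill a complete block.
    rows_trimmed = (rows // factor) * factor
    cols_trimmed = (cols // factor) * factor
    trimmed = image[:rows_trimmed, :cols_trimmed]
    # Give each block its own pair of axes, then average over those axes.
    blocks = trimmed.reshape(rows_trimmed // factor, factor,
                             cols_trimmed // factor, factor)
    return blocks.mean(axis=(1, 3))


# Small stand-in for the 10,000 x 10,000 example: 100 x 100 reduced by 10.
demo = np.arange(100 * 100, dtype=np.float64).reshape(100, 100)
small = downsample_by_block_mean(demo, 10)
print(small.shape)
```

A factor of 10 cuts the pixel count by a factor of 100, which is where most of the processing-time savings come from.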

  1. Click Browse and select a panchromatic or multispectral image for input using the Data Selection dialog. Feature Extraction accepts any image format listed in Supported Data Types.
  2. To apply a mask, select the Input Mask tab in the File Selection panel. In addition to the mask, any pixel values specified in the Data Ignore Value field of the associated header (for ENVI-format files) will be treated as mask values and will not be processed by Feature Extraction.
  3. You can import ancillary data to help extract features of interest. An example is combining a LiDAR digital surface model (DSM) with a multispectral image to identify rooftops in a residential area, then building a rule using height data from the DSM to more accurately extract the rooftops. (The height data would be in the Spectral Mean attribute for the DSM band.) Multiple datasets often provide more accurate results. The following rules apply:
    • You can only use raster data for ancillary data. Vector data must be converted to raster format prior to import.
    • The ancillary image file must be georeferenced to a standard or Rational Polynomial Coefficient (RPC) spatial reference. If the ancillary data is not in the same map projection as the input image, ENVI will reproject the ancillary data to match the base projection. Images with a pseudo spatial reference cannot be used as ancillary data.
    • The ancillary data and input image must have some geographic overlap.
    • If you spatially subset the input image, the ancillary data will be reprojected to match that spatial extent.
  4. Select the Ancillary Data tab and click Add Data. Select one or more ancillary files for input. You can select spectral subsets from each ancillary data file. ENVI will create new bands for each ancillary file that you import; you can then use these bands for rule-based classification. The ancillary bands are identified by the name of the ancillary file and the respective band number of that file.

  5. Select the Custom Bands tab and enable the following options if desired. The input image must be georeferenced to a standard map projection for these options to be available.
    • Normalized Difference: Select two bands for computing a normalized band ratio as follows:

      [(b2 - b1) / (b2 + b1 + eps)]

      where "eps" is a very small number that avoids division by zero.

      If b2 is near-infrared and b1 is red, then Normalized Difference will be a measure of normalized difference vegetation index (NDVI).

      For example, if you have a QuickBird image with four bands where Band 3 is red and Band 4 is near-infrared and you want to compute NDVI, select Band 3 from the Band 1 drop-down list. Select Band 4 from the Band 2 drop-down list.

      ENVI will create a "Normalized Difference" band that you can use for segmentation or classification.

    • Color Space: Select the Red, Green, and Blue band names from the image. ENVI will perform an RGB to HSI color space transformation and will create new bands for Hue, Saturation, and Intensity that you can use for segmentation or rule-based classification.

      Hue: Often used as a color filter, measured in degrees from 0 to 360. A value of 0 is red, 120 is green, and 240 is blue.

      Saturation: Often used as a color filter, measured in floating-point values that range from 0 to 1.0.

      Intensity: Often provides a better measure of brightness than the Spectral_Mean spectral attributes. Intensity is measured in floating-point values that range from 0 to 1.0.

    Note: Do not perform segmentation with a combination of custom bands (normalized difference or HSI color space) and visible/NIR bands. You can perform segmentation on the normalized difference or color space bands by themselves.

  6. Click Next.
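
The Normalized Difference formula from the Custom Bands step can be reproduced in a few lines of NumPy. This is a minimal sketch, not ENVI code: the function name, the sample reflectance values, and the specific eps value are assumptions (the text only says eps is "a very small number").

```python
import numpy as np

EPS = 1e-10  # assumed value; the text only specifies "a very small number"


def normalized_difference(b1, b2, eps=EPS):
    """Compute (b2 - b1) / (b2 + b1 + eps) element-wise."""
    b1 = np.asarray(b1, dtype=np.float64)
    b2 = np.asarray(b2, dtype=np.float64)
    return (b2 - b1) / (b2 + b1 + eps)


# Hypothetical 2 x 2 reflectance values for a four-band image in which
# Band 3 is red and Band 4 is near-infrared, as in the QuickBird example.
red = np.array([[0.10, 0.20], [0.30, 0.00]])
nir = np.array([[0.50, 0.20], [0.10, 0.00]])

# b1 = red and b2 = near-infrared gives NDVI.
ndvi = normalized_difference(b1=red, b2=nir)
print(np.round(ndvi, 3))
```

The last pixel has zero in both bands; the eps term keeps the result finite (0) instead of producing a division-by-zero error.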

ENVI will create a single dataset from the combined bands of the input image, ancillary data, normalized difference, hue, saturation, and intensity (if selected). This single dataset will be used throughout the rest of the Feature Extraction workflow.
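
To make the hue, saturation, and intensity bands more concrete, the sketch below applies a common textbook RGB to HSI formulation. It is an assumption that this matches ENVI's internal transform; it is shown only because it reproduces the ranges described in the Color Space option (hue of 0 for red, 120 for green, 240 for blue; saturation and intensity from 0 to 1.0). The function name rgb_to_hsi is hypothetical, and inputs are assumed to be scaled to the range 0 to 1.

```python
import numpy as np


def rgb_to_hsi(red, green, blue):
    """Convert RGB arrays scaled to [0, 1] into hue (degrees), saturation, intensity."""
    r = np.asarray(red, dtype=np.float64)
    g = np.asarray(green, dtype=np.float64)
    b = np.asarray(blue, dtype=np.float64)

    intensity = (r + g + b) / 3.0

    # Saturation is 1 - min(R, G, B) / I; it is defined as 0 for black pixels.
    minimum = np.minimum(np.minimum(r, g), b)
    scaled_min = np.zeros_like(intensity)
    np.divide(minimum, intensity, out=scaled_min, where=intensity > 0)
    saturation = np.where(intensity > 0, 1.0 - scaled_min, 0.0)

    # Hue is the angle measured from the red axis.
    numerator = 0.5 * ((r - g) + (r - b))
    denominator = np.sqrt(np.maximum((r - g) ** 2 + (r - b) * (g - b), 0.0))
    ratio = np.ones_like(numerator)  # gray pixels get hue 0 by convention
    np.divide(numerator, denominator, out=ratio, where=denominator > 0)
    hue = np.degrees(np.arccos(np.clip(ratio, -1.0, 1.0)))
    hue = np.where(b > g, 360.0 - hue, hue)
    return hue, saturation, intensity


# Four test pixels: pure red, pure green, pure blue, and mid-gray.
r_demo = np.array([1.0, 0.0, 0.0, 0.5])
g_demo = np.array([0.0, 1.0, 0.0, 0.5])
b_demo = np.array([0.0, 0.0, 1.0, 0.5])
hue, saturation, intensity = rgb_to_hsi(r_demo, g_demo, b_demo)
print(np.round(hue, 1), np.round(saturation, 2), np.round(intensity, 2))
```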

When file selection is complete, the file opens in a new workflow view. If the selected file is displayed in an active view before you start the workflow, the display bands and image location are retained, as well as any brightness, contrast, stretch, and sharpen settings. The image location is not retained for pixel-based images or those with pseudo or arbitrary projections.

Segment Images

Segmentation is the process of partitioning an image into objects by grouping neighboring pixels with common values. The objects in the image ideally correspond to real-world features. Effective segmentation ensures that classification results are more accurate.

  1. Enable the Preview option in the Object Creation panel. A Preview Window appears with segments outlined in green.
  2. Under Segment Settings, select an Algorithm from the drop-down list provided. The following options are available:
    • Edge: Best for detecting edges of features where objects of interest have sharp edges. Set an appropriate Scale Level and Merge Level (see steps below) to effectively delineate features.
    • Intensity: Best for segmenting images with subtle gradients such as digital elevation models (DEMs) or images of electromagnetic fields. When selecting this method, don't perform any merging; set the Merge Level to 0. Merging is used primarily to combine segments with similar spectral information. Elevation and other related attributes are not appropriate for merging.
    See Watershed Algorithm Background for more detailed descriptions of each option.

  3. Adjust the Scale Level slider as needed to effectively delineate the boundaries of features as much as possible without over-segmenting the features. Increasing the slider results in fewer segments; decreasing the slider results in more segments. You should also ensure that features of interest are not grouped into segments represented by other features. See Watershed Algorithm Background for a more detailed discussion of how the Scale Level is used with respect to gradient and intensity images.
  4. Click the Select Segment Bands button to choose specific bands for applying the segmentation settings. The settings will apply to a grayscale image derived from the average of all selected bands. All available bands are selected by default.

    Tip: For best segmentation results, select a combination of bands that have similar spectral ranges such as R, G, B, and NIR bands. Do not perform segmentation with a combination of custom bands (normalized difference or HSI color space) and visible/NIR bands. You can perform segmentation on the normalized difference or color space bands by themselves.

  5. Merging combines adjacent segments with similar spectral attributes. Under Merge Settings, select an Algorithm from the drop-down list provided. The following options are available:
    • Full Lambda Schedule (default): Merges small segments within larger, textured areas such as trees or clouds, where over-segmentation may be a problem.
    • Fast Lambda: Merges adjacent segments with similar colors and border sizes.

    See Merge Algorithms Background for more detailed descriptions of each option.

  6. Adjust the Merge Level slider as needed to combine segments with similar colors (Fast Lambda) or to merge over-segmented areas (Full Lambda Schedule). Increasing the slider results in more merging; no merging will occur if you leave the slider value at 0. For example, if a red building consists of three segments, selecting Fast Lambda and increasing the Merge Level should combine them into one segment. To delineate treetops or other highly textured features, select Full Lambda Schedule and increase the Merge Level value.
  7. Click the Select Merge Bands button to choose specific bands for applying the merge settings. Merging will be based on the differences between region colors across all selected bands. All available bands are selected by default.
  8. Select a Texture Kernel Size value, which is the size (in pixels) of a moving box centered over each pixel in the image. Texture attributes are computed for each kernel. Enter an odd number from 3 to 19; the default value is 3. Select a higher kernel size if you are segmenting large areas with little texture variance such as fields. Select a lower kernel size if you are segmenting smaller areas with higher variance such as urban neighborhoods.
  9. Click Next. ENVI loads a segmentation image into the display.
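
Two details from the steps above lend themselves to a short illustration: the grayscale image that segmentation works on (the average of the selected bands) and the Texture Kernel Size constraints (an odd number from 3 to 19). The sketch below assumes the image is held as a NumPy array with shape (bands, rows, columns); both function names are hypothetical and are not part of any ENVI API.

```python
import numpy as np


def average_selected_bands(cube, band_indices=None):
    """Average the chosen bands of a (bands, rows, cols) array into one image.

    With band_indices=None, all bands are used, mirroring the stated default.
    """
    cube = np.asarray(cube, dtype=np.float64)
    if band_indices is None:
        band_indices = range(cube.shape[0])
    return cube[list(band_indices)].mean(axis=0)


def validate_texture_kernel_size(size):
    """Check the constraints stated in the text: an odd integer from 3 to 19."""
    if not isinstance(size, int) or size < 3 or size > 19 or size % 2 == 0:
        raise ValueError("Texture Kernel Size must be an odd integer from 3 to 19")
    return size


# Hypothetical four-band, 2 x 2 image with constant band values 1, 2, 3, and 10.
cube = np.stack([np.full((2, 2), value) for value in (1.0, 2.0, 3.0, 10.0)])

gray_all = average_selected_bands(cube)             # all four bands
gray_rgb = average_selected_bands(cube, [0, 1, 2])  # first three bands only
print(gray_all[0, 0], gray_rgb[0, 0])

kernel = validate_texture_kernel_size(3)  # the default value
```

Note how the fourth band, which has a very different value range, shifts the averaged image. That is one reason the Tip recommends selecting bands with similar spectral ranges.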

Select Training Data

The Example-Based Classification panel contains a folder called All Classes that will contain all of the feature types (classes) that you define. A new, undefined class is available for you to start defining training data.

  1. Select the new class, and edit its name and color within the Class Properties table.
  2. As you move around the segmentation image, the regions underneath your cursor are highlighted in cyan. Click on a highlighted region to assign it to that class. The color of the region changes to the feature color, and the class name updates to show the number of training regions you added.
    Tip: Suppose you created a new class called "Vegetation." Move your cursor around the image and highlight a region that you know represents vegetation. Click on the region to select it. To view the original image instead of the segmentation image, set the Transparency slider in the main toolbar to 100% transparency.

    The color variations of the segments in the segmentation image may be so small that you cannot discern them. Enable the Show Boundaries option to outline the segments so they are easier to visualize.

    Continue selecting training regions that best represent your class. Try to select regions with different textures, shades, and sizes.

    Another way to select training regions is to draw a box around a group of adjacent segments; however, selecting a large number of segments could result in slower processing.

Save Training Data

To save your training data for all classes to a point shapefile, click the Save Example File button. ENVI saves the pixel coordinates of the points where you clicked when selecting training regions, along with map coordinates if the image is georeferenced.

A training data shapefile only contains the locations of training regions; it does not include other parameters you selected for example-based classification such as the classification method, selected attributes, etc.

You can later restore the training data file by clicking the Restore Example File button. The restored data will overwrite any training data you have defined in the current session. You cannot restore a pixel-based training data file for use with a georeferenced image. After restoring the file, you can click Back to readjust your segmentation settings if needed, but the training regions will change based on the new segmentation result.

Import Ground Truth Data

Ground truth data defines areas of an image with known feature types; it therefore represents a true classification for specific areas of the image. You can import ground truth data from point or polygon shapefiles when performing supervised classification.

Tip: An example of ground truth data is a land-cover classification shapefile or geological map of your area of interest, which may be available from websites of local government agencies.

  1. Under the Examples Selection tab, click the Import Ground Truth button. This button is enabled for georeferenced input images only.
  2. Importing a ground truth file will overwrite any existing training samples that you have collected. When prompted to disregard the currently loaded example file, click Yes. The Data Selection dialog appears.
  3. Click Open File and select a 2D or 3D point or polygon shapefile with ground truth data. Click OK. The Select Attribute dialog appears.
  4. From the Select Attribute drop-down list, select the shapefile attribute to group vector records into training classes. The default is CLASS_ID. You should choose unambiguous attribute fields for grouping records into training classes.
  5. Regions in the segmentation image that overlap any vector records in the shapefile will become training samples for the corresponding class. Vector records that do not overlap the input image will be ignored. If you select Use Centroid, a region will become a training region only if the centroid of the polygon (instead of any part of the polygon) falls within the region. A centroid is the geometric center of a polygon.
  6. Click Import. A new class will be created for each unique CLASS_ID or other attribute value that you choose.
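
The Use Centroid rule in the steps above can be illustrated with a small sketch. It assumes polygon vertices are already expressed in pixel coordinates as (x, y) = (column, row) and that the segmentation image is a 2D array of integer segment IDs; ENVI's actual handling of map coordinates and overlap testing is not shown. The function names are hypothetical.

```python
import numpy as np


def polygon_centroid(vertices):
    """Return the (x, y) centroid of a simple polygon using the shoelace formula."""
    pts = np.asarray(vertices, dtype=np.float64)
    x, y = pts[:, 0], pts[:, 1]
    x_next, y_next = np.roll(x, -1), np.roll(y, -1)
    cross = x * y_next - x_next * y
    area = cross.sum() / 2.0
    if area == 0:
        raise ValueError("degenerate polygon with zero area")
    cx = ((x + x_next) * cross).sum() / (6.0 * area)
    cy = ((y + y_next) * cross).sum() / (6.0 * area)
    return cx, cy


def segment_at_centroid(segment_labels, vertices):
    """Return the segment ID under the polygon centroid, or None if outside the image."""
    cx, cy = polygon_centroid(vertices)
    row, col = int(np.floor(cy)), int(np.floor(cx))
    rows, cols = segment_labels.shape
    if 0 <= row < rows and 0 <= col < cols:
        return int(segment_labels[row, col])
    return None  # records that do not fall on the image are ignored


# Hypothetical 4 x 4 segmentation: segment 1 on the left, segment 2 on the right.
labels = np.array([[1, 1, 2, 2]] * 4)

# This polygon overlaps both segments, but its centroid (1.5, 1.0) is in segment 1.
straddling = [(0.5, 0.5), (2.5, 0.5), (2.5, 1.5), (0.5, 1.5)]
outside = [(10, 10), (11, 10), (11, 11), (10, 11)]

hit = segment_at_centroid(labels, straddling)
miss = segment_at_centroid(labels, outside)
print(hit, miss)
```

Without Use Centroid, the first polygon would mark both segments as training regions because it overlaps each of them; with the centroid rule, only segment 1 qualifies.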

Define Class Colors

You can specify class colors by adding a CLASS_CLRS attribute to the ground truth shapefile. Use this attribute field to specify class colors using a string of RGB values (for example, enter '255,0,0' for Red). If the shapefile contains CLASS_NAME and CLASS_CLRS attributes and you select the CLASS_ID attribute from the Select Attribute drop-down list, the proper class names and colors will be restored. If the shapefile does not contain CLASS_NAME and CLASS_CLRS attributes, ENVI will assign unique names and colors to each class. You can rename them in the Example-Based Classification panel if desired.
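
If you prepare or check a ground truth shapefile with your own scripts, a CLASS_CLRS string can be validated before import. The following sketch is not part of ENVI; the function name is hypothetical, and the 0 to 255 range check is an assumption based on the '255,0,0' example.

```python
def parse_class_color(value):
    """Parse an 'R,G,B' string such as '255,0,0' into a tuple of three integers."""
    parts = [part.strip() for part in value.split(",")]
    if len(parts) != 3:
        raise ValueError(f"expected three comma-separated values, got {value!r}")
    rgb = tuple(int(part) for part in parts)
    # Assumed range: 8-bit color channels.
    if any(channel < 0 or channel > 255 for channel in rgb):
        raise ValueError(f"RGB values must be in the range 0-255, got {value!r}")
    return rgb


red = parse_class_color("255,0,0")
print(red)
```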

Define Multiple Classes from One Shapefile

You can use a single 2D or 3D polygon shapefile to define multiple classes. An example is using the Classification workflow to draw polygons in an image that represent different classes, then saving the polygons to a shapefile.

In this case, the shapefile will contain a CLASS_ID attribute with the values of each class.

Import this shapefile as training data into the Example-Based Feature Extraction workflow using the steps above.

You do not have to use the Classification workflow to create a training data shapefile; it can come from any source as long as it has a CLASS_ID attribute defined. You can add or edit classes in the CLASS_ID field as needed. For example, suppose that you defined some regions of interest (ROIs) with several hundred polygons representing training regions, but you only want 10 classes. Export the ROIs to a shapefile, open the shapefile in ENVI, open the Attribute Viewer, and group the polygon ROIs into 10 classes by editing the CLASS_ID values. Then import the shapefile as training data in the Example-Based Classification workflow.
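
The grouping of vector records into classes by CLASS_ID can be pictured as a simple group-by operation. The sketch below assumes the attribute table has already been read into a list of dictionaries (one per polygon record); that in-memory form and the function name are hypothetical and do not represent how ENVI stores shapefile attributes.

```python
from collections import defaultdict


def group_records_by_attribute(records, attribute="CLASS_ID"):
    """Map each unique attribute value to the indices of the records that carry it."""
    groups = defaultdict(list)
    for index, record in enumerate(records):
        groups[record[attribute]].append(index)
    return dict(groups)


# Hypothetical attribute rows for five training polygons.
records = [
    {"CLASS_ID": 1, "CLASS_NAME": "Vegetation"},
    {"CLASS_ID": 2, "CLASS_NAME": "Rooftop"},
    {"CLASS_ID": 1, "CLASS_NAME": "Vegetation"},
    {"CLASS_ID": 3, "CLASS_NAME": "Water"},
    {"CLASS_ID": 2, "CLASS_NAME": "Rooftop"},
]

classes = group_records_by_attribute(records)
print(classes)
```

Each key corresponds to one training class, so five polygons with three distinct CLASS_ID values yield three classes.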

Select Attributes for Classification

This section describes how to choose attributes that will be used to classify your training samples. Click the Attributes Selection tab to see the available options. By default, all attributes will be used for classification.

Note: This section is disabled for PCA classification since all attributes are used with that method.

Reference

An interval based attribute ranking technique. Unpublished report, NV5 Geospatial Solutions, Inc. A copy of this paper is available from Technical Support.

Select a Classification Method (Advanced)

Three methods are available to perform supervised classification: K Nearest Neighbor (KNN), Principal Components Analysis (PCA), and Support Vector Machine (SVM). Click the Algorithms tab and select a method from the Algorithms drop-down list.

Enable the Allow Unclassified option to allow segments to be unclassified when the classifier cannot determine suitable classes for them. This option is enabled by default.

After the KNN, PCA, or SVM method runs, each segment is assigned the class with the highest class confidence value. Segments with class confidence values less than the percentage you set with the Threshold slider are assigned to "unclassified." The default Threshold value is 5 percent, which means segments that have less than 5 percent confidence in each class are set to "unclassified."

As you increase the Threshold slider, the classifier will allow more unclassified segments. As you decrease the value of the Threshold slider, the classifier forces more segments into classes. A value of 0 means that all segments will be classified, except with the PCA method where some unclassified regions may remain.
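
The Threshold behavior described above can be sketched as follows. The sketch assumes each segment already has one confidence value per class in the range 0 to 1; how KNN, PCA, or SVM compute those values is not covered here, and the PCA exception at a threshold of 0 is not modeled. The function name and the UNCLASSIFIED label value are hypothetical.

```python
import numpy as np

UNCLASSIFIED = -1  # hypothetical label for unclassified segments


def assign_classes(confidences, threshold_percent=5.0, allow_unclassified=True):
    """Pick the highest-confidence class per segment, applying the threshold.

    confidences: array of shape (n_segments, n_classes) with values in [0, 1].
    """
    confidences = np.asarray(confidences, dtype=np.float64)
    best_class = confidences.argmax(axis=1)
    if not allow_unclassified:
        return best_class
    best_confidence = confidences.max(axis=1)
    below = best_confidence < threshold_percent / 100.0
    return np.where(below, UNCLASSIFIED, best_class)


# Hypothetical confidence values for three segments and three classes.
confidences = np.array([
    [0.70, 0.20, 0.10],
    [0.04, 0.03, 0.01],
    [0.30, 0.45, 0.25],
])

default_result = assign_classes(confidences)                      # 5 percent
strict_result = assign_classes(confidences, threshold_percent=50)
forced_result = assign_classes(confidences, threshold_percent=0)
print(default_result, strict_result, forced_result)
```

Raising the threshold from 5 to 50 percent moves the third segment to unclassified, while a threshold of 0 forces every segment into its best class.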

Enable the Preview option to view classification results within a Preview Window. You can preview the effects of changing classification options before classifying the entire image. If the image has more than 1024 lines or samples, you cannot zoom out of the data further than 50%, because doing so would significantly increase processing time and delay the display of the previewed data. Zooming out further than 50% will result in a black Preview Window.

Export Classification Results

In this step, you will select the types of images, vectors, and statistics to export to various formats. By default, files are saved to the directory that you specify in the Output Directory preference. When the export is complete, the workflow view closes. The original data and the export data display in the Image window view. The available output options vary, depending on your workflow.

Rule-Based and Example-Based Classification

Choose the classification file types you want to save.

Export Vector Tab

Export Raster Tab

Advanced Export Tab

Auxiliary Export Tab

Segment-Only

Export Vector Tab

Export Raster Tab

Advanced Export Tab

Auxiliary Export Tab