ENVI_FX_EXAMPLEBASED_DOIT

This procedure automates the Example Based Feature Extraction workflow, using an imported training data file (.shp).

Based on the segmentation options that you specify, ENVI will compute spatial, spectral, and texture attributes and will generate the following optional images: segmentation, attribute, rule confidence, and classification.

You must have an ENVI Feature Extraction license in order to use this routine.

Syntax

ENVI_DOIT, 'ENVI_FX_EXAMPLEBASED_DOIT' [, A_FID=array] [, A_POS=array] [, /ALLOW_UNCLASSIFIED] [, ATTRIBUTE_RASTER_FILENAME=string] [, BR_BANDS=array] [, CLASSIFICATION_RASTER_FILENAME=string] [, CLASSIFICATION_THRESHOLD=floating point] [, CLASSIFY_ALGORITHM=string] [, CONFIDENCE_RASTER_IMAGE=string] [, CS_BANDS=array] [, DEGREE=integer] [, DIMS=array] [, /DO_AUTO_ATTRIBUTE] [, /DO_MERGE] [, EXAMPLE_VECTOR_FILENAME=string] [, /EXPORT_VECTOR_ATTRIBUTES], FID=file ID [, /INVERSE_MASK] [, KERNEL_BIAS=floating point] [, KERNEL_GAMMA=floating point] [, KERNEL_SIZE=long integer] [, KERNEL_TYPE=string] [, KNN_NEIGHBORS=integer] [, M_FID=file ID] [, MERGE_ALGORITHM=string] [, MERGE_BANDS=array] [, MERGE_LEVEL=floating point] [, PENALTY=floating point] [, POS=array] [, R_FID=variable] [, REPORT_FILENAME=string] [, SCALE_LEVEL=floating point] [, SEGMENT_ALGORITHM=string], SEGMENTATION_RASTER_FILENAME=string [, SEGMENT_BANDS=array] [, VECTOR_FILENAME=string]

Keywords

A_FID (optional)

Set this keyword to an array of file IDs for ancillary data files. Use this keyword in conjunction with the A_POS keyword. The number of elements of A_FID must equal the number of elements of A_POS.

A_POS (optional)

Set this keyword to an array of long integers representing band numbers to process in the ancillary data files. Use this keyword in conjunction with the A_FID keyword. Specify bands starting with zero (Band 1 = 0, Band 2 = 1, etc.)

Each element in the A_POS array corresponds to an element in the A_FID array. For example, suppose you have two ancillary files whose file IDs are 100 and 200. To process Band 1 from file ID 200, and to process Bands 2-5 from file 100, write the code as follows:

A_FID = [200, 100, 100, 100, 100]

A_POS = [0, 1, 2, 3, 4]

To process Bands 1-5 from the same file (100), write the code as follows:

A_FID = [100, 100, 100, 100, 100]

A_POS = [0, 1, 2, 3, 4]
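The paired arrays can also be built programmatically, which may help when an ancillary file contributes many bands. The following is a minimal sketch using the hypothetical file IDs 100 and 200 from the example above; in a real session the IDs would come from opening the ancillary files. The variable names are illustrative only.

```idl
; Hypothetical file IDs taken from the example above
fid_a = 100L
fid_b = 200L

; Band 1 of file 200, followed by Bands 2-5 of file 100
a_fid = [fid_b, REPLICATE(fid_a, 4)]
a_pos = [0L, LINDGEN(4) + 1]

; A_FID and A_POS must have the same number of elements
IF N_ELEMENTS(a_fid) NE N_ELEMENTS(a_pos) THEN $
  MESSAGE, 'A_FID and A_POS must have the same number of elements.'

PRINT, a_fid
PRINT, a_pos
```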

ALLOW_UNCLASSIFIED (optional)

This keyword is set by default to allow segments to be unclassified when the classifier cannot determine suitable classes for them.

ATTRIBUTE_RASTER_FILENAME (optional)

Set this keyword to a string with the name of the output attribute image, which is a multi-layer image in ENVI raster format where each layer represents the values of a specific attribute. All attributes are included.

BR_BANDS (optional)

Set this keyword to a two-element array of long integers that represent bands used to compute an optional Normalized Difference index to use in segmentation. The Normalized Difference index is computed as follows:

[(b2 - b1) / (b2 + b1 + eps)]

Where "eps" is a very small number to avoid division by zero.

Note: Set BR_BANDS using the following order: [b1, b2]. Band numbers start with 0.

If b2 is near-infrared and b1 is red, then Normalized Difference will be a measure of normalized difference vegetation index (NDVI). ENVI will create a "Normalized Difference" band that you can use for segmentation or classification.

For example, multispectral QuickBird images have the following band definitions:

Band number	Wavelength Center	Zero-based band number
1	Blue (485 nm)	0
2	Green (560 nm)	1
3	Red (660 nm)	2
4	Near-infrared (830 nm)	3

To compute NDVI using QuickBird imagery, set BR_BANDS=[2,3].
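The Normalized Difference formula can be tried on a few sample values in plain IDL. This is only a sketch of the arithmetic described above, not ENVI's internal implementation; the pixel values and the value of eps are assumptions chosen for illustration.

```idl
; Hypothetical reflectance values for three pixels
b1 = [0.10, 0.20, 0.00]   ; red band (first element of BR_BANDS)
b2 = [0.50, 0.20, 0.00]   ; near-infrared band (second element of BR_BANDS)

; Assumed small constant; the value ENVI uses is not stated here
eps = 1.0e-6

; Normalized Difference as defined above
nd = (b2 - b1) / (b2 + b1 + eps)
PRINT, nd
```

The third pixel has zero in both bands; the eps term keeps the denominator nonzero, so the result is 0 rather than a division error.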

CLASSIFICATION_RASTER_FILENAME (optional)

Set this keyword to a string with the name of the output classification image. This type of image is in ENVI raster format, and the pixel values represent different classes.

CLASSIFICATION_THRESHOLD (optional)

Set this keyword to a floating-point value between 0 and 100, indicating the confidence threshold. See Select a Classification Method (Advanced) for more information on the threshold value. The default value is 5.0, which means segments that have less than 5 percent confidence in each class are set to "unclassified."
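The threshold rule can be illustrated with a few lines of IDL. This is a sketch of the rule as described above, not the classifier's actual code; the class names and confidence values are hypothetical.

```idl
; Hypothetical per-class confidence values (percent) for one segment
class_names = ['water', 'grass', 'rooftop']
confidence  = [3.2, 4.1, 1.0]
threshold   = 5.0   ; same as the documented default

; Find the highest confidence and the class it belongs to
best = MAX(confidence, best_index)

; A segment below the threshold in every class is left unclassified
label = (best LT threshold) ? 'unclassified' : class_names[best_index]
PRINT, label
```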

CLASSIFY_ALGORITHM (optional)

Set this keyword to a string indicating the supervised classification method to use:

KNN: (default) This method classifies segments based on their proximity to neighboring training regions. See Background on K Nearest Neighbor for more information.
PCA: This method assigns segments to classes using a principal components analysis. You must define at least two classes with a minimum of two training regions each. See Background on Principal Components Analysis for more information.
SVM: This is the most rigorous of the three classification methods, so processing time will be slower. See Background on Support Vector Machine for more information. Each class must have at least two training samples; otherwise, the class will be excluded from classification.

CONFIDENCE_RASTER_IMAGE (optional)

Set this keyword to a string with the name of the output confidence image. This type of image is in ENVI raster format and shows the relative confidence of each object belonging to a class. The higher the brightness of an object, the higher the confidence that the object belongs to the class. If an object is very dark, it likely does not belong to the class. This is a multi-layer file, with each layer representing one class.

CS_BANDS (optional)

Set this keyword to a three-element array of long integers that represent the red, green, and blue bands from the input image. Bands are zero-based, so Band 1=0, Band 2=1, etc. These will be used to perform an optional RGB to HSI color space transformation that will create Hue, Saturation, and Intensity bands to use in segmentation.

DEGREE (optional)

This keyword only applies to the SVM classification method. If you set KERNEL_TYPE to POLYNOMIAL, then set DEGREE to an integer value indicating the degree of kernel polynomial. See Background on Support Vector Machine for more information.

DIMS (optional)

The “dimensions” keyword is a five-element array of long integers that defines the spatial subset (of a file or array) to use for processing. Nearly every time you specify the keyword FID, you must also specify the spatial subset of the corresponding file (even if the entire file, with no spatial subsetting, is to be processed).

DIMS[0]: A pointer to an open ROI; use only in cases where ROIs define the spatial subset. Otherwise, set to -1L.
DIMS[1]: The starting sample number. The first x pixel is 0.
DIMS[2]: The ending sample number
DIMS[3]: The starting line number. The first y pixel is 0.
DIMS[4]: The ending line number

To process an entire file (with no spatial subsetting), define DIMS as shown in the following code example. This example assumes you have already opened a file using ENVI_SELECT or ENVI_PICKFILE:

envi_file_query, fid, dims=dims
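To process a spatial subset instead, you can build the DIMS array yourself. The sketch below assumes a hypothetical 1024 x 1024 image and selects its upper-left 512 x 512 pixels; in practice the image size could be read with ENVI_FILE_QUERY, fid, NS=ns, NL=nl.

```idl
; Hypothetical image size (samples and lines)
ns = 1024L
nl = 1024L

; Clamp the subset to the image extent; "<" is IDL's minimum operator
x_end = 511L < (ns - 1)
y_end = 511L < (nl - 1)

; [ROI pointer, start sample, end sample, start line, end line]
dims = [-1L, 0L, x_end, 0L, y_end]
PRINT, dims
```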

DO_AUTO_ATTRIBUTE (optional)

Set this keyword to have ENVI determine the best attributes to use for classifying features. You must define at least two classes with at least two training samples each. The more classes and training samples you have, the slower the process.

DO_MERGE (optional)

This keyword is set by default to merge adjacent features in the polygon shapefile specified in the VECTOR_FILENAME keyword.

Perform merging if you are confident that adjacent polygons belong to the same class and you want to consolidate them into a single polygon. This results in a smaller file size. This option merges all adjacent segments at once across the entire image; you cannot select specific polygons to merge.

Tip: A good example is an angled rooftop that reveals different brightness levels from an aerial sensor, depending on the sun's angle. The segmentation step would typically create multiple regions within the rooftop, each with different spectral values. But if you build a good rule set that identifies rooftops, the classification image (and/or shapefile) will assign these regions to the same class. Since you know that they all belong to one rooftop, you can choose to merge the adjacent segments so that the entire rooftop is one polygon.

EXAMPLE_VECTOR_FILENAME (optional)

Set this keyword to a string with the filename of a training data file to import. The training data file must be a point shapefile (.shp). The shapefile contains pixel coordinates of the points where training regions were previously selected in an interactive session of ENVI, along with map coordinates if the image is georeferenced.

A training data shapefile only contains the locations of training regions; it does not include other parameters you selected for example-based classification such as the classification method, selected attributes, etc.

EXPORT_VECTOR_ATTRIBUTES (optional)

This keyword is set by default to include the spatial, spectral, and texture attributes that were computed for each region, when creating a shapefile of classification results.

FID

The file ID (FID) is a long-integer scalar with a value greater than 0. An invalid FID has a value of -1. The FID is provided as a named variable by any routine used to open or select a file. Often, the FID is the value returned by the ENVIRasterToFID function. Files are processed by referring to their FIDs. If you work directly with the file in IDL, note that the FID is not equivalent to a logical unit number (LUN).

INVERSE_MASK (optional)

Set this keyword to invert the mask specified by the M_FID keyword. Inverting the mask means that Feature Extraction will process regions with pixel values of 0 in the mask.

KERNEL_BIAS (optional)

This keyword only applies to the SVM classification method. Set this keyword to a floating-point value, indicating the bias used in the kernel function. The default value is 1.0. See Background on Support Vector Machine for more information.

KERNEL_GAMMA (optional)

This keyword only applies to the SVM classification method. Set this keyword to a floating-point value greater than 0.01, indicating the gamma parameter used in the kernel function. The default value is the inverse of the number of computed attributes. See Background on Support Vector Machine for more information.

KERNEL_SIZE (optional)

Set this keyword to a long-integer odd number representing the size of the kernel used in texture attribute calculations. The default value is 3.

KERNEL_TYPE (optional)

This keyword only applies to the SVM classification method. Set this keyword to one of the following string values, indicating the kernel type to use. See Background on Support Vector Machine for more information.

Radial Basis (default)
Polynomial
Sigmoid

KNN_NEIGHBORS (optional)

This keyword only applies to the KNN classification method. Set this keyword to an odd integer, indicating the number of neighboring training regions to consider in classification. See Background on K Nearest Neighbor for more information.

M_FID (optional)

Set this keyword to the file ID of a raster mask image. If you specify this keyword, Feature Extraction will ignore regions with pixel values of 0 in the mask.

MERGE_ALGORITHM (optional)

Set this keyword to one of the following strings, specifying the method used to perform merging:

Full Lambda Schedule (default): Merges small segments within larger, textured areas such as trees or clouds, where over-segmentation may be a problem.
Fast Lambda: Merges adjacent segments with similar colors and border sizes.

See Merge Algorithms Background for more detailed descriptions of each option.

MERGE_BANDS (optional)

Set this keyword to an array of long integers that represent band numbers to use with the merge algorithm and merge level that you specify. Merging will be based on the differences between region colors on all selected bands. By default, all bands of the input image will be used.

If you selected a spectral subset of bands using the POS keyword, specify band numbers according to their position in the POS array, not their position in the original dataset. For example, suppose that you created a spectral subset of bands 1, 2, 5, and 7 from a 7-band dataset. Within the subset, these four bands are numbered [0, 1, 2, 3]. To perform merging on bands 1, 2, and 5 from the original dataset, use the corresponding subset band numbers as follows:

MERGE_BANDS=[0, 1, 2]

If you do not select a spectral subset, then you do not have to keep track of the POS band numbering, as in this example. By default, the MERGE_BANDS keyword will use all bands from the input image.
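When the subset is large, the conversion from original band numbers to subset positions can be scripted. The sketch below assumes the subset of bands 1, 2, 5, and 7 corresponds to the zero-based band numbers [0, 1, 4, 6]; the variable names are illustrative and are not part of the routine.

```idl
; Assumed zero-based original band numbers in the spectral subset
subset_bands = [0, 1, 4, 6]

; Original bands 1, 2, and 5 (zero-based) to use for merging
wanted_bands = [0, 1, 4]

; Look up the position of each wanted band within the subset
merge_bands = LONARR(N_ELEMENTS(wanted_bands))
FOR i = 0, N_ELEMENTS(wanted_bands) - 1 DO $
  merge_bands[i] = (WHERE(subset_bands EQ wanted_bands[i]))[0]

PRINT, merge_bands
```

The result is [0, 1, 2], matching the MERGE_BANDS value shown above. The same lookup would apply to SEGMENT_BANDS.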

MERGE_LEVEL (optional)

Set this keyword to a floating-point value between 0 and 100.0, specifying the merge level used to combine segments with similar colors (Fast Lambda method) or to merge over-segmented areas (Full Lambda Schedule method). The default value is 0.

PENALTY (optional)

This keyword only applies to the SVM classification method. Set this keyword to a floating-point value greater than 0.01, indicating the penalty parameter to use. The default value is 100.0. See Background on Support Vector Machine for more information.

POS (optional)

Use this keyword to specify an array of band numbers used to perform segmentation. POS indicates the spectral subset of bands to use in processing. Specify bands starting with zero (Band 1 = 0, Band 2 = 1, etc.)

ENVI creates a single dataset from the combined bands of the input image, ancillary data, normalized difference, hue, saturation, and intensity (if selected). For best results, you should not perform segmentation with a combination of custom bands (normalized difference or HSI color space) and visible/NIR bands. You can perform segmentation on the normalized difference or color space bands by themselves. So in most cases, you will need to select a spectral subset for segmentation instead of using all bands.

Specify band numbers in the following order:

Input image: one or more bands
Ancillary image(s): one or more bands
Normalized difference: one band
Hue: one band
Saturation: one band
Intensity: one band

For example, if the input image has four bands and you want to perform segmentation on these bands only, set POS = [0, 1, 2, 3].

Or, suppose you have a 4-band image and you set the BR_BANDS and CS_BANDS keywords. The band numbers are as follows:

Input image: [0, 1, 2, 3]
Normalized difference: [4]
Hue: [5]
Saturation: [6]
Intensity: [7]

To perform segmentation on the saturation band only, set POS = [6] in this example.
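The band positions in the combined dataset follow directly from the ordering listed above, so they can be computed rather than counted by hand. This is a minimal sketch for the 4-band case with no ancillary data; the variable names are hypothetical.

```idl
n_input     = 4L   ; bands in the input image
n_ancillary = 0L   ; bands from ancillary files (none in this example)
use_nd      = 1L   ; 1 if BR_BANDS is set, otherwise 0

; Custom bands follow the input and ancillary bands, in the documented order
nd_index  = n_input + n_ancillary
hue_index = nd_index + use_nd
sat_index = hue_index + 1
int_index = hue_index + 2

; Segment on the saturation band only
pos = [sat_index]
PRINT, pos
```

With these settings the saturation band is at position 6, as in the example above.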

For hyperspectral imagery, we strongly recommend that you run a principal components analysis or independent components analysis on the dataset before using it in Feature Extraction. Segmentation and merging work best on datasets with only a few bands, and a smaller dataset makes it easier to keep track of the band numbers when setting the POS keyword.

R_FID (optional)

This keyword is a returned variable containing the file ID of the segmentation image. If processing fails for any reason, then R_FID = -1.

REPORT_FILENAME (optional)

Set this keyword to a string indicating the filename of a text report that lists the segmentation and merge settings used, the input and ancillary files used, a list of attributes that were computed for the segmentation image, and any output files that were created.

SCALE_LEVEL (optional)

Set this keyword to a floating-point value between 0 and 100.0, indicating the scale level used to delineate features of interest in segmentation. Increasing the value reduces the number of segments. The default value is 50.0 for the Edge segmentation method and 0 for the Intensity method.

SEGMENT_ALGORITHM (optional)

Set this keyword to one of the following strings, indicating the segmentation method to use:

Edge (default): Best for detecting edges of features where objects of interest have sharp edges. Set the SCALE_LEVEL and MERGE_LEVEL as needed to best delineate features of interest.
Intensity: Best for segmenting images with subtle gradients such as digital elevation models (DEMs) or images of electromagnetic fields. When selecting this method, don't perform any merging; set MERGE_LEVEL=0. Merging is used primarily to combine segments with similar spectral information. Elevation and other related attributes are not appropriate for merging.

See Watershed Algorithm Background for more detailed descriptions of each option.

SEGMENTATION_RASTER_FILENAME

Set this keyword to a string indicating the filename of the output segmentation image. This is an image in ENVI raster format that shows the regions defined by segmentation; each region is assigned the mean spectral values of all the pixels that belong to that region.

SEGMENT_BANDS (optional)

Set this keyword to an array of long integers representing the band numbers to use in the segmentation method and scale level that you specify. All bands from the input image are selected by default. The settings will apply to a grayscale image derived from the average of all selected bands. For best segmentation results, select a combination of bands that have similar spectral ranges such as red, green, blue, and near-infrared bands.

If you selected a spectral subset of bands using the POS keyword, specify band numbers according to their position in the POS array, not their position in the original dataset. For example, suppose that you created a spectral subset of bands 1, 2, 5, and 7 from a 7-band dataset. Within the subset, these four bands are numbered [0, 1, 2, 3]. To perform segmentation on bands 1, 2, and 5 from the original dataset, use the corresponding subset band numbers as follows:

SEGMENT_BANDS=[0, 1, 2]

If you do not select a spectral subset, then you do not have to keep track of the POS band numbering, as in this example. By default, the SEGMENT_BANDS keyword will use all bands from the input image.

VECTOR_FILENAME (optional)

Set this keyword to a string indicating the filename of a polygon shapefile of classification results. If you do not specify this keyword, then no shapefile will be created.

Example

This example uses files that are available from our ENVI Tutorials web page. Click the Feature Extraction link to download the .zip file to your machine, then unzip the files to the data directory of the ENVI installation:

Windows: C:\Program Files\INSTALL_DIR\ENVIxx\data

Linux: /usr/local/INSTALL_DIR/envixx/data

Mac: Applications/INSTALL_DIR/envixx/data

This example imports a training data shapefile containing four classes, creates several output files, and displays the resulting classification image.

PRO FX_EXAMPLEBASED_DOIT
  compile_opt IDL2
  ;
  ; Initialize ENVI and send all errors
  ; and warnings to the file batch.txt
  ;
  e = ENVI(/HEADLESS)
  temp_dir = e.GetPreference('TEMPORARY_DIRECTORY')
  e.LOG_FILE = temp_dir+'batch.txt'
  ;
  ; Open the input file
  ;
  file = FILEPATH('qb_colorado.dat', $
    ROOT_DIR=e.ROOT_DIR, $
    SUBDIRECTORY=['data'])
  raster = e.OpenRaster(file)
  fid = ENVIRasterToFID(raster)
  ;
  ; Verify FID is valid
  ;
  IF (fid EQ -1) THEN BEGIN
    e.Close
    RETURN
  ENDIF
  ;
  ; Build the path to the training data shapefile
  ;
  training_file = FILEPATH('qb_colorado_examplebased_training.shp', $
    ROOT_DIR=e.ROOT_DIR, $
    SUBDIRECTORY=['data'])
  ;
  ; Set output filenames
  ;
  report_filename = temp_dir+'report.txt'
  confidence_raster_filename = temp_dir+'confidence.dat'
  classification_raster_filename = temp_dir+'classes.dat'
  segmentation_raster_filename = temp_dir+'segmentation.dat'
  vector_filename = temp_dir+'classes.shp'
  ;
  ; Set the keywords: process the full spatial extent and all bands
  ;
  dims = [-1L, 0, raster.ncolumns-1, 0, raster.nrows-1]
  pos = lindgen(raster.nbands)
  ;
  ; Perform example-based classification
  ;
  envi_doit, 'envi_fx_examplebased_doit', $
    fid=fid, pos=pos, dims=dims, $
    r_fid=r_fid, merge_level=85.0, $
    scale_level=50.0, $
    br_bands=[2,3], $
    segment_bands=[3], $
    classify_algorithm='KNN', $
    knn_neighbors=3, $
    example_vector_filename=training_file, $
    segmentation_raster_filename=segmentation_raster_filename, $
    report_filename=report_filename, $
    confidence_raster_image=confidence_raster_filename, $
    classification_raster_filename=classification_raster_filename, $
    vector_filename=vector_filename
  ;
  ; Exit ENVI
  ;
  e.Close
  ;
  ; Re-open ENVI in interactive mode and display
  ; the classification image.
  ;
  e = ENVI()
  raster = e.OpenRaster(classification_raster_filename)
  view = e.GetView()
  layer = view.CreateLayer(raster)
  view.Zoom, /FULL_EXTENT
END