TrainExtraTrees Task

This task implements a meta estimator that fits several randomized decision trees (i.e., extra-trees) on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting.

For background on the algorithm used, see Extra Trees Classification.
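The averaging step mentioned above can be sketched in plain IDL. This is only an illustration: the per-tree probabilities are made-up values, and the task's internal implementation may differ.

```idl
; Hypothetical class probabilities from 3 trees (rows) for 2 classes (columns).
; IDL arrays are indexed [column, row], so this is a 2 x 3 array.
treeProbs = [[0.9, 0.1], $
             [0.4, 0.6], $
             [0.7, 0.3]]

; Average over the trees (dimension 2) to get one probability per class
meanProbs = Mean(treeProbs, DIMENSION=2)

; The predicted class is the index of the largest averaged probability
!null = Max(meanProbs, predictedClass)

Print, 'Averaged probabilities: ', meanProbs
Print, 'Predicted class index: ', predictedClass
```

Averaging across many randomized trees smooths out the errors of any single tree, which is why the ensemble tends to over-fit less than one deep tree.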

Example

; Start the application

e = ENVI()

 

; Open an input raster file

RasterFile = Filepath('qb_boulder_msi', Subdir=['data'], $
  Root_Dir=e.Root_Dir)

Raster = e.OpenRaster(RasterFile)

 

; Open an input ROI file

ROIFile = Filepath('qb_boulder_roi.xml', Subdir=['data'], $
  Root_Dir=e.Root_Dir)

ROI = e.OpenROI(ROIFile)

 

; Get the statistics task from the catalog of ENVITasks

StatsTask = ENVITask('NormalizationStatistics')

 

; Define inputs

StatsTask.INPUT_RASTERS = Raster

 

; Run the task

StatsTask.Execute

 

; Get the data prep task from the catalog of ENVITasks

DataPrepTask = ENVITask('MLTrainingDataFromROIs')

 

; Define inputs

DataPrepTask.INPUT_RASTER = Raster

DataPrepTask.INPUT_ROI = ROI

DataPrepTask.BACKGROUND_LABELS = ['Disturbed Earth', 'Water']

DataPrepTask.NORMALIZE_MIN_MAX = StatsTask.Normalization

DataPrepTask.Execute

 

; Get the training task from the catalog of ENVITasks

TrainTask = ENVITask('TrainExtraTrees')

 

; Define inputs

TrainTask.INPUT_RASTERS = DataPrepTask.OUTPUT_RASTER

TrainTask.NUM_ESTIMATORS = 100

 

; Run the task

TrainTask.Execute

 

; Output model metadata

outputModelUri = TrainTask.OUTPUT_MODEL_URI

print, 'Model URI: ' + outputModelUri

 

outputModel = TrainTask.OUTPUT_MODEL

print, outputModel.Attributes, /IMPLIED

Syntax

Result = ENVITask('TrainExtraTrees')

Input properties (Set, Get): BALANCE_CLASSES, CUSTOM_MAX_FEATURES, INPUT_RASTERS, MAX_DEPTH, MAX_FEATURES, MODEL_NAME, MODEL_DESCRIPTION, NUM_ESTIMATORS, OUTPUT_MODEL_URI

Output properties (Get only): OUTPUT_MODEL

Properties marked as "Set" are those that you can set to specific values. You can also retrieve their current values at any time. Properties marked as "Get" are those whose values you can retrieve but not set.

Methods

This task inherits its methods, such as Execute, from ENVITask. See the ENVITask topic in ENVI Help.

Properties

This task inherits the following properties from ENVITask:

COMMUTE_ON_DOWNSAMPLE

COMMUTE_ON_SUBSET

DESCRIPTION

DISPLAY_NAME

NAME

REVISION

See the ENVITask topic in ENVI Help for details.

This task also contains the following properties:

BALANCE_CLASSES (optional)

Specify whether all classes should be considered equal during training. This helps to account for classes with few samples compared to classes with many samples.
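One common way to treat classes equally is inverse-frequency weighting, where each class weight is the total sample count divided by the product of the number of classes and that class's sample count. The sketch below assumes that scheme purely for illustration. This topic does not state the exact formula the task uses, and the sample counts are hypothetical.

```idl
; Hypothetical number of training samples in each of three classes
counts = [900, 90, 10]

; Inverse-frequency weights: total / (number of classes * class count)
; (assumed scheme; not confirmed as the task's own formula)
weights = Total(counts) / (N_Elements(counts) * Float(counts))

; Rare classes receive larger weights than common ones
Print, weights
```

With weights like these, a class with few samples contributes about as much to training as a class with many samples.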

CUSTOM_MAX_FEATURES (optional)

Specify the number of features to consider when looking for the best split. This parameter accepts a float or integer value. If specified, this value will override MAX_FEATURES.
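A minimal sketch of how the two feature-count properties might be set together. The value 2 is arbitrary and chosen only for illustration. The snippet assumes a licensed ENVI session with the machine learning tasks installed.

```idl
; Start ENVI without the user interface and get the task
e = ENVI(/HEADLESS)
TrainTask = ENVITask('TrainExtraTrees')

; Choose one of the named rules for MAX_FEATURES
TrainTask.MAX_FEATURES = 'log2'

; Give an explicit value; per this topic, it overrides MAX_FEATURES
; (2 is an arbitrary illustrative value)
TrainTask.CUSTOM_MAX_FEATURES = 2

Print, TrainTask.MAX_FEATURES
Print, TrainTask.CUSTOM_MAX_FEATURES
```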

INPUT_RASTERS (required)

Specify one or more preprocessed training rasters to be used for training.

MAX_DEPTH (optional)

Specify the maximum depth of each decision tree. Limiting the depth restricts how complex each tree can become, which helps control over-fitting.

MAX_FEATURES (optional)

Specify the rule used to determine the number of features to consider when looking for the best split. Accepted values are the string literals sqrt and log2. The default is sqrt.
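To get a feel for the two options, the following plain-IDL sketch computes how many features each rule would select under the conventional interpretation (square root or base-2 logarithm of the feature count, rounded down, with a minimum of 1). That interpretation and the feature count of 10 are assumptions, not statements about the task's internals.

```idl
; Hypothetical number of features (bands) in the training data
nFeatures = 10

; Conventional sqrt rule: floor of the square root, at least 1
nSqrt = Floor(Sqrt(nFeatures)) > 1

; Conventional log2 rule: floor of the base-2 logarithm, at least 1
nLog2 = Floor(ALog(Double(nFeatures)) / ALog(2d)) > 1

Print, 'sqrt rule: ', nSqrt
Print, 'log2 rule: ', nLog2
```

For small feature counts the two rules often give the same number. They diverge as the number of features grows, with log2 selecting fewer features per split.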

MODEL_NAME (optional)

Specify the name of the model. The default is Extra Trees Supervised Classifier.

MODEL_DESCRIPTION (optional)

Specify the purpose of the model.

NUM_ESTIMATORS (optional)

Specify the number of decision trees in the forest. The estimators are the predictors of the algorithm. The default is 100.

OUTPUT_MODEL (required)

This is a reference to the output model file.

OUTPUT_MODEL_URI (optional)

Specify a string with the fully qualified filename and path of the associated OUTPUT_MODEL. If you do not specify this property, or set it to an exclamation symbol (!), a temporary file will be created.
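A minimal sketch of setting the output location. The file name and its extension are hypothetical placeholders, and the snippet assumes a licensed ENVI session with the machine learning tasks installed.

```idl
; Start ENVI without the user interface and get the task
e = ENVI(/HEADLESS)
TrainTask = ENVITask('TrainExtraTrees')

; Build a fully qualified path in the system temporary directory.
; 'extra_trees_model.json' is a placeholder name, not a documented default.
modelPath = Filepath('extra_trees_model.json', /TMP)
TrainTask.OUTPUT_MODEL_URI = modelPath

; Alternatively, request a temporary file explicitly:
; TrainTask.OUTPUT_MODEL_URI = '!'

Print, 'Model will be written to: ' + TrainTask.OUTPUT_MODEL_URI
```

Setting an explicit path is useful when the trained model must be reused in a later session, since temporary files may not persist.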

Version History

Deep Learning 2.0: Introduced

Deep Learning 2.1: Added MAX_FEATURES and CUSTOM_MAX_FEATURES properties.

See Also

ENVI Machine Learning Algorithms Background, TrainBirch Task, TrainIsolationForest Task, TrainKNeighbors Task, TrainLinearSVM Task, TrainLocalOutlierFactor Task, TrainMiniBatchKMeans Task, TrainNaiveBayes Task, TrainRandomForest Task, TrainRBFSVM Task