If you have labeled your training rasters using the Deep Learning Labeling Tool, you can start training the model from within the Labeling Tool.

Click the Train drop-down and select the type of model you want to train. See the sections below for details on each model type.

Train a Pixel Segmentation Model


  1. From the Train drop-down list in the Labeling Tool, select Pixel Segmentation. The Train Deep Learning Pixel Segmentation Model dialog appears.
  2. An architecture is a set of parameters that defines the underlying convolutional neural network. From the Model Architecture drop-down list, select the model architecture to use during training. The options are:

    • SegUNet++: (default) Recommended for training on structural objects, such as vehicles, buildings, shipping containers.

    • SegUNet: Recommended for training on features that are more inconsistent in appearance, such as debris and clouds.

    • DeepLabV3+: A fast option for training that provides good results.

    SegUNet and SegUNet++ are based on the U-Net architecture described by Ronneberger, Fischer, and Brox (2015). Like U-Net, they are mask-based, encoder-decoder architectures that classify every pixel in the image. DeepLabV3+ is based on ResNet50.

  3. Select a Patch Size from the drop-down list. Patches are small images given to the model for training. The Patch Size is the number of pixels along one edge of a square patch. The default value is 464, which means the patch size is 464 x 464 pixels. Larger patch sizes result in faster classification, but for training, the patch size must be smaller than the size of the label rasters and also small enough that at least one patch per batch fits into graphics memory. In general, if your label rasters are smaller than the default patch size, choose the largest size that will fit in your rasters. If your graphics card has more than 8 GB of memory, you may want to increase the Patch Size.
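
    If it helps to see the selection rule as code, here is a minimal sketch of choosing the largest patch size that fits your label rasters. The candidate sizes are placeholders, not the tool's actual list; use the values offered in the Patch Size drop-down list.

      def largest_fitting_patch_size(raster_rows, raster_cols, candidates):
          """Return the largest candidate patch size that fits the label raster.

          `candidates` stands in for the sizes offered in the Patch Size
          drop-down list (hypothetical values here, for illustration only).
          """
          fits = [size for size in sorted(candidates)
                  if size <= min(raster_rows, raster_cols)]
          return fits[-1] if fits else None

      # Example: 400 x 600 label rasters cannot hold the default 464 x 464 patch.
      print(largest_fitting_patch_size(400, 600, [208, 336, 464, 592]))  # -> 336
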
  4. Use the Training/Validation Split (%) slider to specify the percentage of data to use for training versus validation.
  5. Enable the Shuffle Rasters check box to shuffle the training and validation rasters before splitting them. This ensures that the training is not biased. Disabling this option means that the same images will be used for training and validation with each training run, which will help achieve more repeatable results.
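
    Conceptually, the split and shuffle behave like the following Python sketch (an illustration, not the tool's actual code):

      import random

      def split_rasters(rasters, train_pct=80, shuffle=True, seed=None):
          """Split rasters into training and validation sets.

          With shuffle=False, the same rasters land in each set on every
          run, which makes results more repeatable.
          """
          rasters = list(rasters)
          if shuffle:
              random.Random(seed).shuffle(rasters)
          n_train = round(len(rasters) * train_pct / 100)
          return rasters[:n_train], rasters[n_train:]

      train, val = split_rasters(["r1.dat", "r2.dat", "r3.dat", "r4.dat", "r5.dat"])
      # 80/20 split -> 4 training rasters, 1 validation raster
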
  6. In the Feature Patch Percentage field, specify the percentage of patches that contain labeled features to use during training. Values should range from 0 to 1. This applies to both the training and validation datasets. The default value is 1, which means that 100% of the patches that contain features will be used for training. The resulting patches are then used as input to the Background Patch Ratio, described in the next step.

    Example: Suppose that a label raster has 50 patches that contain labeled features. A Feature Patch Percentage value of 0.4 means that 20 of those patches will be used for training (20/50 = 0.4, or 40%).

    The default value of 1 ensures that you are training on all of the features that you labeled. In general, if you have a large training dataset (hundreds of images), lowering the Feature Patch Percentage will decrease training time.

  7. In the Background Patch Ratio field, enter the ratio of background patches (those that contain no labeled features) to patches with features. For example, a ratio of 1.0 for 100 patches with features would provide 100 patches without features. The default value is 0.15.

    When features are sparse in a training raster, the training can be biased by empty patches throughout. The Background Patch Ratio parameter allows you to restrict the number of empty patches, relative to those that contain features. Increasing the value tends to reduce false positives, particularly when features are sparse.
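
    The arithmetic behind steps 6 and 7 can be summarized in a short sketch (illustrative only; the tool may round differently):

      def patch_counts(feature_patches, feature_patch_percentage=1.0,
                       background_patch_ratio=0.15):
          """Number of feature and background patches used for training."""
          used_features = round(feature_patches * feature_patch_percentage)
          used_background = round(used_features * background_patch_ratio)
          return used_features, used_background

      # 50 feature patches, 40% kept, default background ratio of 0.15:
      print(patch_counts(50, 0.4))  # -> (20, 3): 20 feature patches, 3 background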

  8. In the Number of Epochs field, enter the number of epochs to run. An epoch is a full pass of the entire training dataset through the algorithm's learning process. Training parameters are adjusted at the end of each epoch. The default value is 25. See Epochs and Batches.
  9. Set the Augmentation Scale option to Yes to augment the training data with resized (scaled) versions of the data. See Data Augmentation.
  10. Set the Augmentation Rotation option to Yes to augment the training data with rotated versions of the data. See Data Augmentation.
  11. The Solid Distance field pertains to point and polyline labels only. For each class, enter the number of pixels surrounding point or polyline labels that should be considered part of the target feature. See the Solid Distance background discussion for more information. To use the same value for all classes, click the Set all to same value button. In the Set Value dialog, enter the value to use for all classes and click OK.
  12. The Blur Distance field is used in conjunction with Solid Distance. You can optionally blur features of interest that vary in size. Blurring the edges of features and decreasing the blur during training can help the model gradually focus on the feature of interest. In most cases, you can leave this field blank; however, it is available for you to experiment with. See the Blur Distance background discussion for more information. To use the same value for all classes, click the Set all to same value button. In the Set Value dialog, enter the value to use for all classes and click OK.
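
    To make Solid Distance concrete, the sketch below rasterizes point labels into a mask by marking every pixel within the given distance as part of the feature. The circular neighborhood is an assumption for illustration; see the Solid Distance background discussion for the tool's actual behavior.

      import numpy as np

      def solid_distance_mask(shape, points, solid_distance):
          """Mark pixels within `solid_distance` of any point label as feature pixels."""
          rows, cols = np.indices(shape)
          mask = np.zeros(shape, dtype=bool)
          for r, c in points:
              mask |= (rows - r) ** 2 + (cols - c) ** 2 <= solid_distance ** 2
          return mask

      # A single point label at (3, 3) with a Solid Distance of 2:
      print(solid_distance_mask((7, 7), [(3, 3)], 2).astype(int))

    Blur Distance would then soften the edges of this mask, with the blur decreasing over the course of training.
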
  13. In the Class Weight field, enter the minimum and maximum weights used to achieve a more even balance of classes (including background) when sampling. Sampling diversity is weighted by the maximum value at the beginning of training and decreases to the minimum value by the end of training. The useful range for the maximum value is between 1 and 6. A general recommendation is to set the Min to 2 and the Max to 3 when your features of interest are sparse in your training rasters; otherwise, set the Min to 0 and the Max to 1. See Training Parameters.
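
    The text above states only that the weight starts at the Max value and decreases to the Min value; a linear schedule is one plausible reading, shown here purely as an assumption (not the tool's documented formula):

      def class_weight_at_epoch(epoch, n_epochs, weight_min=2.0, weight_max=3.0):
          """Assumed linear decay of the sampling weight from Max to Min."""
          fraction = epoch / max(n_epochs - 1, 1)
          return weight_max - fraction * (weight_max - weight_min)

      # With Min=2, Max=3, and 25 epochs:
      print(class_weight_at_epoch(0, 25))   # -> 3.0 (start of training)
      print(class_weight_at_epoch(24, 25))  # -> 2.0 (end of training)
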
  14. In the Loss Weight field, enter a value between 0 and 1.0. A value of 0 is a good starting point and will be fine for most cases. A value of 0 means the model will treat feature and background pixels equally. Increased values will bias the loss function to place more emphasis on correctly identifying feature pixels than identifying background pixels. This is useful when features are sparse or if not all of the features are labeled.
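
    As a rough illustration of how Loss Weight biases the loss function, the following sketch up-weights feature pixels in a per-pixel binary cross-entropy. It mirrors the behavior described above but is not the tool's actual loss implementation.

      import tensorflow as tf

      def weighted_pixel_loss(y_true, y_pred, loss_weight=0.0):
          """Per-pixel binary cross-entropy with feature pixels up-weighted.

          y_true and y_pred are float tensors of shape (batch, height, width).
          loss_weight = 0 treats feature and background pixels equally;
          values toward 1 emphasize feature (y_true == 1) pixels.
          """
          bce = tf.keras.losses.binary_crossentropy(y_true[..., None],
                                                    y_pred[..., None])
          weights = 1.0 + loss_weight * y_true
          return tf.reduce_mean(weights * bce)
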
  15. Specify a filename (.h5) and location for the Output Model. This will be the "best" trained model: the model from the epoch with the lowest validation loss. By default, the tool saves both the best and the last model. The best model usually outperforms the last model, but not always; having both outputs lets you choose whichever works best for your scenario.
  16. Specify a filename (.h5) and location for the Output Last Model. This will be the trained model from the last epoch.
  17. To run the process in the background, click Run Task in the Background.

  18. Click OK. Training a model takes a significant amount of time due to the computations involved. Depending on your system and graphics hardware, processing can take several minutes to several hours. A Training Model dialog shows the progress of training, along with the updated validation loss value.

    At the same time, a TensorBoard page displays in a new web browser. TensorBoard is a visualization toolkit included with TensorFlow. It reports real-time metrics such as Loss, Accuracy, Precision, and Recall during training. See View Training Metrics for details.

When training is complete, you can pass the trained model to the TensorFlow Pixel Classification tool.

Train an Object Detection Model


  1. From the Train drop-down list in the Labeling Tool, select Object Detection. The Train Deep Learning Object Detection Model dialog appears.
  2. Use the Training/Validation Split (%) slider to specify the percentage of data to use for training versus validation.
  3. Enable the Shuffle Rasters check box to shuffle the training and validation rasters before splitting them. This ensures that the training is not biased. Disabling this option means that the same images will be used for training and validation with each training run, which will help achieve more repeatable results.
  4. Set the Pad Small Features option to Yes when features are small (for example, vehicles, utilities, and road markings). Features with bounding boxes drawn around them must be at least 25 pixels in the X and Y directions. If the labeled features are smaller than this, the Pad Small Features option will pad them with extra pixels so they are at least 25 pixels in both directions.
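
    A minimal sketch of this padding logic (illustrative only; the tool's implementation may differ):

      MIN_SIZE = 25  # minimum bounding-box extent, in pixels, in each direction

      def pad_box(x_min, y_min, x_max, y_max, min_size=MIN_SIZE):
          """Grow a box symmetrically until it is at least min_size wide and tall."""
          def grow(lo, hi):
              extent = hi - lo
              if extent >= min_size:
                  return lo, hi
              pad = (min_size - extent) / 2.0
              return lo - pad, hi + pad
          x_min, x_max = grow(x_min, x_max)
          y_min, y_max = grow(y_min, y_max)
          return x_min, y_min, x_max, y_max

      # A 10 x 30 pixel box is padded to 25 x 30:
      print(pad_box(100, 100, 110, 130))  # -> (92.5, 100, 117.5, 130)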

  5. Set the Augmentation Rotation option to Yes to augment the training data with rotated versions of the data.
  6. Set the Augmentation Scale option to Yes to augment the training data with resized (scaled) versions of the data.
  7. In the Number of Epochs field, enter the number of epochs to run. An epoch is a full pass of the entire training dataset through the algorithm's learning process. Training parameters are adjusted at the end of each epoch. The default value is 100.
  8. In the Patches per Batch field, specify the number of patches to run per batch. A patch is a small image subset passed to the trainer to help it learn what a feature looks like. The default patch size used for object detection is 640 x 640 pixels. A batch comprises one iteration of training; model parameters are adjusted at the end of each iteration. Batches are run in an epoch until the number of patches per epoch is met or exceeded. The default value is 1.

    The Patches per Batch parameter controls how much data you send to the trainer in each batch. This is directly tied to how much GPU memory you have available. With higher amounts of GPU memory, you can increase the Patches per Batch. The following table shows the amount of GPU memory successfully tested with different values:

    GPU memory (MB)    Patches per Batch
    5099               1
    5611               2
    9707               3-4
    10731              5-8
    11711              9-10
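
    Based on the tested values above, a lookup like this sketch can suggest a starting point. The thresholds come directly from the table; actual capacity depends on your model, drivers, and other GPU load.

      # (GPU memory in MB, largest tested Patches per Batch) from the table above.
      TESTED = [(11711, 10), (10731, 8), (9707, 4), (5611, 2), (5099, 1)]

      def suggest_patches_per_batch(gpu_memory_mb):
          """Return the largest tested batch size that fit in the given GPU memory."""
          for memory_mb, patches in TESTED:
              if gpu_memory_mb >= memory_mb:
                  return patches
          return 1  # below the smallest tested configuration; use the default

      print(suggest_patches_per_batch(8000))  # -> 2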

  9. In the Feature Patch Percentage field, specify the percentage of patches that contain labeled features to use during training. Values should range from 0 to 1. This applies to both the training and validation datasets. The default value is 1, which means that 100% of the patches that contain features will be used for training. The resulting patches are then used as input to the Background Patch Ratio, described in the next step.

    Example: Suppose that an object detection raster has 50 patches that contain labeled features. A Feature Patch Percentage of 0.4 means that 20 of those patches will be used for training (20/50 = 0.4, or 40%).

    The default value of 1 ensures that you are training on all of the features that you labeled. In general, if you have a large training dataset (hundreds of images), lowering the Feature Patch Percentage will decrease training time.

  10. In the Background Patch Ratio field, enter the ratio of background patches (those that contain no labeled features) to patches with features. For example, a ratio of 1.0 for 100 patches with features would provide 100 patches without features. When features are sparse in a training raster, the training can be biased by empty patches; this parameter lets you restrict the number of empty patches relative to those that contain features, and increasing the value tends to reduce false positives.
  11. Specify a filename (.h5) and location for the Output Model. This will be the "best" trained model: the model from the epoch with the lowest validation loss. By default, the tool saves both the best and the last model. The best model usually outperforms the last model, but not always; having both outputs lets you choose whichever works best for your scenario.
  12. Specify a filename (.h5) and location for the Output Last Model. This will be the trained model from the last epoch.
  13. To run the process in the background, click Run Task in the Background.

  14. Click OK. Training a model takes a significant amount of time due to the computations involved. Depending on your system and graphics hardware, processing can take several minutes to several hours. A Training Model dialog shows the progress of training, along with the updated validation loss value.

    At the same time, a TensorBoard page displays in a new web browser. TensorBoard is a visualization toolkit included with TensorFlow. It reports real-time metrics such as Loss, Accuracy, Precision, and Recall during training. See View Training Metrics for details.

When training is complete, you can pass the trained model to the TensorFlow Object Classification tool.

Train a Grid Model


Grid models can be trained for use with the TensorFlow Optimized Object Classification or TensorFlow Optimized Pixel Classification tools.

  1. From the Train drop-down list in the Labeling Tool, select Grid. The Train Deep Learning Grid Model dialog appears.
  2. From the Model Architecture drop-down list, specify the model architecture to use for training the model. Pre-trained weights for the given architecture will be used as a starting point to enhance model performance. The options are:

    • ResNet50

    • ResNet101

  3. In the Grid Size field, enter the edge length, in pixels, of the square grid bounding box; this value also sets the row and column dimensions of each patch. The grid size must be a multiple of 32. The default value is 224.
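
    Because the grid size must be a multiple of 32, a quick check like this sketch can snap a requested value to the nearest valid size (illustrative only):

      def nearest_valid_grid_size(requested, multiple=32):
          """Snap a requested grid size to the nearest multiple of 32."""
          return max(multiple, round(requested / multiple) * multiple)

      print(nearest_valid_grid_size(200))  # -> 192
      print(nearest_valid_grid_size(224))  # -> 224 (the default)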

  4. Use the Training/Validation Split (%) slider to specify the percentage of data to use for training versus validation. Or, use the field provided to enter a value.
  5. Enable the Shuffle Rasters check box to shuffle the training and validation rasters before splitting them. This ensures that the training is not biased. Disabling this option means that the same images will be used for training and validation with each training run, which will help achieve more repeatable results.
  6. In the Number of Epochs field, enter the number of epochs to run. An epoch is a full pass of the entire training dataset through the algorithm's learning process. Training parameters are adjusted at the end of each epoch. The default value is 100.
  7. In the Patches per Batch field, specify the number of patches to run per batch. A batch comprises one iteration of training; model parameters are adjusted at the end of each iteration. Batches are run in an epoch until the number of patches per epoch is met or exceeded. The default value is 5.
  8. In the Feature Patch Percentage field, enter the percentage of patches containing labeled features to use during training. Values range from 0 to 1 and apply to both the training and validation datasets. The default value is 1. Example: You have a grid raster with 50 patches that contain labeled features. A value of 0.4 means that 20 of those patches will be used for training (20/50 = 0.4, or 40%), whereas a value of 1 means that 100% of those patches will be used. The number of resulting patches is used as input to the Background Patch Ratio field. Note that if you have a large training dataset (hundreds of images), lowering the Feature Patch Percentage will reduce training time.

  9. In the Background Patch Ratio field, enter the ratio of background patches (those that contain no labeled features) to patches with features. For example, a ratio of 1.0 for 100 patches with features would provide 100 patches without features. When features are sparse in a training raster, the training can be biased by empty patches; this parameter lets you restrict the number of empty patches relative to those that contain features, and increasing the value tends to reduce false positives.
  10. Use the Pad Small Features option when creating a grid model for use in optimized object classification. Set this option to Yes when features are small (for example, vehicles, utilities, and road markings). Features with bounding boxes drawn around them must be at least 25 pixels in the X and Y directions. If the labeled features are smaller than this, they will be padded with extra pixels so they are at least 25 pixels in both directions.

  11. Set the Augmentation Scale option to Yes to augment the training data with resized (scaled) versions of the data.
  12. Set the Augmentation Rotation option to Yes to augment the training data with rotated versions of the data.
  13. Specify a filename (.h5) and location for the Output Model. This will be the "best" trained model: the model from the epoch with the lowest validation loss. By default, the tool saves both the best and the last model. The best model usually outperforms the last model, but not always; having both outputs lets you choose whichever works best for your scenario.
  14. Specify a filename (.h5) and location for the Output Last Model. This will be the trained model from the last epoch.
  15. To run the process in the background, click Run Task in the Background.

  16. Click OK. Training a model takes a significant amount of time due to the computations involved. Depending on your system and graphics hardware, processing can take several minutes to several hours. A Training Model dialog shows the progress of training, along with the updated validation loss value.

    At the same time, a TensorBoard page displays in a new web browser. TensorBoard is a visualization toolkit included with TensorFlow. It reports real-time metrics such as Loss, Accuracy, Precision, and Recall during training. See View Training Metrics for details.

When training is complete, you can pass the trained model to the TensorFlow Grid Classification tool.