Bounding Boxes Background


In object detection, you label features of interest with rectangle annotations, also called bounding boxes. Each bounding box represents a specific feature belonging to a specific class. The following example shows bounding box labels drawn around different types of vehicles:

In the ENVI API, bounding box information is stored as an IDL hash. A hash is a compound data type that contains key-value pairs of different data types. The following sample hash defines one class; in practice, the hash will likely contain multiple classes.

In this example, class 0 is defined by its own hash containing class properties such as label and color. The coordinates property is a [2,5] array of pixel coordinates for the associated bounding box. The first and last sets of coordinates are the same.
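To make the coordinate layout concrete, here is a minimal Python sketch of one class entry. It is an illustration only: a Python dictionary stands in for the IDL hash, the [2,5] array is read as five (x, y) pixel pairs, and the helper name box_to_ring, the corner order, and the sample label and color values are assumptions rather than anything ENVI prescribes.

```python
def box_to_ring(xmin, ymin, xmax, ymax):
    """Return five [x, y] pixel pairs tracing the box, ending where it starts."""
    corners = [
        [xmin, ymin],  # first corner
        [xmax, ymin],
        [xmax, ymax],
        [xmin, ymax],
    ]
    # Repeat the first corner so the first and last coordinates match.
    return corners + [list(corners[0])]


# Hypothetical class entry; the key names follow the properties named in the text.
class_0 = {
    "label": "car",            # assumed sample value
    "color": [255, 0, 0],      # assumed RGB triplet
    "coordinates": box_to_ring(10, 20, 50, 60),
}
```

Repeating the first corner at the end is what produces five coordinate pairs for a four-cornered box.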

If you use the Deep Learning Labeling Tool to draw bounding boxes for object detection, ENVI ultimately puts the bounding box information in this hash format. Specifically, when you click the Train button in the Labeling Tool, ENVI internally calls the BuildObjectDetectionRasterFromAnnotation task. That task collects the bounding box labels, colors, and coordinates stored in the annotation file (.anz) created by the Labeling Tool. ENVI stores this information in a hash for each class.

FeatureCollection of Feature Types


After reading all annotation data, ENVI constructs a GeoJSON FeatureCollection from the initial data structure. The FeatureCollection contains a list of features. Each feature (defined by "type" : "Feature") is treated as a single object with one set of properties and pixel coordinates. Here is an example of what this looks like with one feature:
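As a rough illustration of that structure, the following Python sketch assembles a FeatureCollection holding a single feature, using standard GeoJSON conventions. The property names ("label", "color"), the Polygon geometry type, and the helper make_feature are assumptions chosen for the example; the exact keys ENVI writes may differ.

```python
import json


def make_feature(label, color, ring):
    """Wrap one bounding box as a GeoJSON Feature (property names are assumed)."""
    return {
        "type": "Feature",
        "properties": {"label": label, "color": color},
        # A GeoJSON Polygon holds a list of rings; a box needs only the outer ring.
        "geometry": {"type": "Polygon", "coordinates": [ring]},
    }


# Five pixel pairs; the first and last are the same.
ring = [[10, 20], [50, 20], [50, 60], [10, 60], [10, 20]]

collection = {
    "type": "FeatureCollection",
    "features": [make_feature("car", [255, 0, 0], ring)],
}

geojson_text = json.dumps(collection, indent=2)
```

Each additional bounding box would become another entry in the "features" list, with its own properties and coordinates.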

The GeoJSON code is compressed and base64-encoded before being written to the associated ENVI header file (.hdr) of the label raster that was created in the Labeling Tool. The training step reads the bounding box information from the header files of the label rasters. It uses that information as input to the TensorFlow object detection process.
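The compress-then-encode step can be sketched in Python as follows. The text does not say which compression algorithm ENVI uses, so zlib is an assumption here, as are the function names encode_geojson and decode_geojson; the sketch only shows the general round trip from GeoJSON to a header-safe ASCII string and back.

```python
import base64
import json
import zlib


def encode_geojson(collection):
    """Serialize, compress (zlib assumed), and base64-encode a GeoJSON object."""
    raw = json.dumps(collection, separators=(",", ":")).encode("utf-8")
    return base64.b64encode(zlib.compress(raw)).decode("ascii")


def decode_geojson(encoded):
    """Reverse encode_geojson: base64-decode, decompress, and parse."""
    raw = zlib.decompress(base64.b64decode(encoded))
    return json.loads(raw.decode("utf-8"))


# Small stand-in FeatureCollection for the round trip.
sample = {"type": "FeatureCollection", "features": []}
encoded = encode_geojson(sample)
restored = decode_geojson(encoded)
```

Base64 output contains only ASCII characters, which is what makes the compressed bytes safe to place in a text header file.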

Outside of the Labeling Tool, the format described above is how bounding box information is stored in the following ENVI API routines/methods:

FeatureCollection of Geometry Types


ENVI also stores bounding box information in a different GeoJSON format. This format similarly contains a FeatureCollection with a list of features. However, each feature contains one or more GeometryCollections, one per ROI class. Each GeometryCollection contains one set of properties and multiple sets of coordinates that together represent all bounding boxes for the given collection.
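The grouping idea can be sketched in Python as shown below. The nesting is an assumption: this sketch follows standard GeoJSON by giving each class its own feature whose geometry is a GeometryCollection of Polygons, which may not match ENVI's exact layout. The function name group_boxes_by_class and the "label" property are likewise hypothetical.

```python
from collections import defaultdict


def group_boxes_by_class(boxes):
    """Group (label, ring) pairs into one GeometryCollection per class."""
    rings_by_class = defaultdict(list)
    for label, ring in boxes:
        rings_by_class[label].append(ring)

    features = []
    for label, rings in rings_by_class.items():
        features.append({
            "type": "Feature",
            "properties": {"label": label},  # one set of properties per class
            "geometry": {
                "type": "GeometryCollection",
                # Every bounding box of this class becomes one Polygon.
                "geometries": [
                    {"type": "Polygon", "coordinates": [ring]} for ring in rings
                ],
            },
        })
    return {"type": "FeatureCollection", "features": features}


boxes = [
    ("car", [[0, 0], [4, 0], [4, 3], [0, 3], [0, 0]]),
    ("truck", [[10, 10], [18, 10], [18, 15], [10, 15], [10, 10]]),
    ("car", [[5, 5], [9, 5], [9, 8], [5, 8], [5, 5]]),
]
grouped = group_boxes_by_class(boxes)
```

Compared with the one-feature-per-box format, the properties are stored once per class instead of once per bounding box.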

This format is used to store bounding box information for the following API routines/methods:

Here is an example of this format: