NCDF_VARDEF

The NCDF_VARDEF function adds a new variable to an open NetCDF file in define mode.

By default, a new NetCDF variable is chunked (meaning that the data is copied between disk and memory in discrete chunks). You can change the chunk size with the CHUNK_DIMENSIONS keyword, or set the CONTIGUOUS keyword to store the variable contiguously.

Examples

For more information on using NCDF_VARDEF, see Additional Examples.

In the following example, we create a NetCDF file, define its structure, and populate it with data.

id = NCDF_CREATE('test.nc', /CLOBBER) ; Create the NetCDF file.

NCDF_ATTPUT, id, 'TITLE', 'Incredibly Important Data', /GLOBAL

NCDF_ATTPUT, id, 'GALAXY', 'Milky Way', /GLOBAL

NCDF_ATTPUT, id, 'PLANET', 'Earth', /GLOBAL

xid = NCDF_DIMDEF(id, 'x', 100) ; Define the X dimension.

yid = NCDF_DIMDEF(id, 'y', 200) ; Define the Y dimension.

zid = NCDF_DIMDEF(id, 'z', /UNLIMITED) ; Define the Z dimension.

vid0 = NCDF_VARDEF(id, 'image0', [yid, xid], /FLOAT)

vid1 = NCDF_VARDEF(id, 'image1', [yid, xid], /FLOAT)

; Rename image0 to dist_image:

dist_id = NCDF_VARID(id, 'image0') ; Look up the variable ID by name (same ID as vid0).

NCDF_VARRENAME, id, vid0, 'dist_image'

NCDF_ATTPUT, id, vid0, 'TITLE', 'DIST_IMAGE'

NCDF_CONTROL, id, /ENDEF ; Put the file into data mode.

image = CONGRID(DIST(200), 200, 100)

NCDF_VARPUT, id, vid0, image

INQ_VID = NCDF_VARINQ(id, 'dist_image')

HELP, INQ_VID, /STRUCTURE

file_inq = NCDF_INQUIRE(id)

HELP, file_inq, /STRUCTURE

NCDF_CLOSE, id ; Close the NetCDF file.

IDL prints:

** Structure <415a7d8>, 5 tags, length=40, data length=40, refs=1:

   NAME            STRING    'dist_image'

   DATATYPE        STRING    'FLOAT'

   NDIMS           LONG                 2

   NATTS           LONG                 1

   DIM             LONG      Array[2]

** Structure <412da40>, 4 tags, length=16, data length=16, refs=1:

   NDIMS           LONG                 3

   NVARS           LONG                 2

   NGATTS          LONG                 3

   RECDIM          LONG                 2

Syntax

Result = NCDF_VARDEF( Cdfid, Name [, Dim] [, CHUNK_DIMENSIONS=value] [, /CONTIGUOUS] [, GZIP=value] [, /SHUFFLE] [, /BYTE | , /CHAR | , /DOUBLE | , /FLOAT | , /INT64 | , /LONG | , /SHORT | , /STRING | , /UBYTE | , /UINT64 | , /ULONG | , /USHORT] )

Return Value

If successful, the variable ID is returned. If a new variable cannot be defined, NCDF_VARDEF returns -1.

Arguments

Cdfid

The NetCDF ID, returned from a previous call to NCDF_OPEN, NCDF_CREATE, or NCDF_GROUPDEF.

Name

A scalar string containing the variable name.

Dim

An optional vector containing the dimension IDs corresponding to the variable dimensions. If the ID of the unlimited dimension is included, it must be the rightmost element in the array. If Dim is omitted, the variable is assumed to be a scalar.
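As a brief sketch of how Dim is used, the following lines define a scalar variable (Dim omitted) and a two-dimensional variable whose unlimited dimension is the rightmost element of Dim. The file name and the dimension and variable names are placeholders chosen for this illustration.

```idl
; Hypothetical file, dimension, and variable names, for illustration only.
id = NCDF_CREATE(FILEPATH('vardef_dim_test.nc', /TMP), /CLOBBER)
xid = NCDF_DIMDEF(id, 'x', 10)
tid = NCDF_DIMDEF(id, 'time', /UNLIMITED)
; Dim omitted: the variable is a scalar.
sid = NCDF_VARDEF(id, 'scale_factor', /DOUBLE)
; The unlimited dimension (tid) is the rightmost element of Dim.
vid = NCDF_VARDEF(id, 'series', [xid, tid], /FLOAT)
NCDF_CONTROL, id, /ENDEF ; Leave define mode.
NCDF_CLOSE, id
```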

Keywords

The data type keywords (BYTE, CHAR, DOUBLE, FLOAT, INT64, LONG, SHORT, STRING, UBYTE, UINT64, ULONG, and USHORT) specify the data type of the variable. Only one of these keywords can be set. If no data type keyword is specified, FLOAT is used by default.

BYTE

Set this keyword to indicate that the data is composed of bytes.

CHAR

Set this keyword to indicate that the data is composed of bytes (assumed to be ASCII).

CHUNK_DIMENSIONS

Set this keyword equal to a vector containing the chunk dimensions for the variable.

A new NetCDF variable is chunked by default, using a default chunk value that is the full dimension size for limited dimensions, and 1 for unlimited dimensions.

CHUNK_DIMENSIONS must have the same number of elements as the number of dimensions specified by Dim.

Note: This keyword is ignored if the CONTIGUOUS keyword is set.

CONTIGUOUS

Set this keyword to store a NetCDF variable as a single array in a file. Contiguous storage works well for smaller variables such as coordinate variables.

Contiguous storage works only for fixed-sized datasets (those without any unlimited dimensions). You can’t use compression or other filters with contiguous data.

Note: If the CONTIGUOUS keyword is set, the CHUNK_DIMENSIONS and GZIP keywords are ignored.
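The following sketch shows one way a small coordinate variable might be stored contiguously. The file name and the 'lat' dimension and variable are assumptions made for this illustration; the dimension has a fixed size, as contiguous storage requires.

```idl
; Hypothetical file and variable names, for illustration only.
id = NCDF_CREATE(FILEPATH('vardef_contig_test.nc', /TMP), $
   /NETCDF4_FORMAT, /CLOBBER)
latid = NCDF_DIMDEF(id, 'lat', 180)
; A small, fixed-size coordinate variable stored as a single array.
lat_vid = NCDF_VARDEF(id, 'lat', [latid], /DOUBLE, /CONTIGUOUS)
NCDF_CONTROL, id, /ENDEF ; Leave define mode.
; Cell-center latitudes from -89.5 to 89.5.
NCDF_VARPUT, id, lat_vid, DINDGEN(180) - 89.5D
NCDF_CLOSE, id
```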

DOUBLE

Set this keyword to indicate that the data is composed of double-precision floating-point numbers.

FLOAT

Set this keyword to indicate that the data is composed of single-precision floating-point numbers.

GZIP

Set this keyword to an integer between zero and nine to specify the level of GZIP compression applied to the variable. Lower compression values result in faster but less efficient compression.

Note: You can only use GZIP compression with NetCDF-4 files.

Note: This keyword is ignored if the CONTIGUOUS keyword is set.
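As a minimal sketch, a compressed variable might be defined as follows. The file name, variable name, chunk dimensions, and the compression level of 5 are arbitrary choices made for this illustration, not recommended values.

```idl
; Hypothetical file and variable names, for illustration only.
id = NCDF_CREATE(FILEPATH('vardef_gzip_test.nc', /TMP), $
   /NETCDF4_FORMAT, /CLOBBER)
xid = NCDF_DIMDEF(id, 'x', 1000)
yid = NCDF_DIMDEF(id, 'y', 1000)
; GZIP=5 is an arbitrary mid-range level. SHUFFLE takes effect
; only because GZIP is also set.
vid = NCDF_VARDEF(id, 'field', [xid, yid], /LONG, $
   CHUNK_DIMENSIONS=[100, 100], GZIP=5, /SHUFFLE)
NCDF_CONTROL, id, /ENDEF ; Leave define mode.
NCDF_VARPUT, id, vid, LINDGEN(1000, 1000)
NCDF_CLOSE, id
```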

INT64

Set this keyword to indicate that the data is composed of signed eight-byte integers.

Note: The 64-bit integer data type can only be used within NetCDF-4 files. Use the NETCDF4_FORMAT keyword to NCDF_CREATE to create a NetCDF-4 file.
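For instance, a 64-bit integer variable might be defined as in the following sketch. The file, dimension, and variable names are placeholders chosen for this illustration.

```idl
; Hypothetical file and variable names, for illustration only.
; NETCDF4_FORMAT is required for 64-bit integer variables.
id = NCDF_CREATE(FILEPATH('vardef_int64_test.nc', /TMP), $
   /NETCDF4_FORMAT, /CLOBBER)
nid = NCDF_DIMDEF(id, 'n', 4)
vid = NCDF_VARDEF(id, 'counts', [nid], /INT64)
NCDF_CONTROL, id, /ENDEF ; Leave define mode.
; Values too large for a 32-bit integer.
NCDF_VARPUT, id, vid, L64INDGEN(4) + 2LL^40
NCDF_CLOSE, id
```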

LONG

Set this keyword to indicate that the data is composed of longword integers.

SHORT

Set this keyword to indicate that the data is composed of 2-byte integers.

SHUFFLE

Set this keyword to apply the shuffle filter to the variable. If the GZIP keyword is not set, this keyword is ignored.

The shuffle filter de-interlaces blocks of data by reordering individual bytes. Byte shuffling can sometimes increase compression density because bytes in the same block positions often have similar values, and grouping similar values together often leads to more efficient compression.

For example, consider three elements of an array A, where each element consists of four bytes. Before byte shuffling, the three elements are represented contiguously in memory as the byte-position sequence 012301230123. After byte shuffling, all the bytes in byte position 0 are grouped together, all the bytes in byte position 1 are grouped together, and so on. The three elements are then represented in memory by the sequence 000111222333.
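The reordering described above can be imitated with ordinary array operations. This sketch only illustrates the byte order before and after shuffling; it is not how the filter itself is implemented, and the variable names are chosen for this illustration.

```idl
; Three 4-byte elements. The stored values are byte positions
; (0 through 3), not real data.
a = BYTE([[0, 1, 2, 3], [0, 1, 2, 3], [0, 1, 2, 3]]) ; 4 x 3 array
unshuffled = REFORM(a, 12)          ; memory order before shuffling
shuffled = REFORM(TRANSPOSE(a), 12) ; grouped by byte position
PRINT, STRJOIN(STRTRIM(FIX(unshuffled), 2)) ; 012301230123
PRINT, STRJOIN(STRTRIM(FIX(shuffled), 2))   ; 000111222333
```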

STRING

Set this keyword to indicate that the data is composed of strings.

UBYTE

Set this keyword to indicate that the data is composed of unsigned bytes.

UINT64

Set this keyword to indicate that the data is composed of unsigned eight-byte integers.

Note: The 64-bit unsigned integer data type can only be used within NetCDF-4 files. Use the NETCDF4_FORMAT keyword to NCDF_CREATE to create a NetCDF-4 file.

ULONG

Set this keyword to indicate that the data is composed of unsigned four-byte integers.

USHORT

Set this keyword to indicate that the data is composed of unsigned two-byte integers.

Data Chunking

Overview

A data chunk is a hyper-rectangle of any size. When a dataset is chunked, each chunk is read or written as a single I/O operation, and is individually passed from stage to stage of the pipeline and filters.

Upon creation, a NetCDF-4 variable is chunked by default (although you can choose to create a contiguous variable). A variable’s chunking setting is established when the variable is created and cannot be changed afterward. You must chunk a variable that uses one or more unlimited dimensions, compression, or a filter.

Choosing Chunk Sizes

When choosing chunk sizes, you should consider:

The dataset size
The data size
How the data will be read
The operating system

Improper chunk sizes can drastically inflate the size of the resulting file or greatly slow the reading of the data. Choose a chunk size such that the subsets of data you are accessing fit into a chunk (the chunks should be as large as, or larger than, the subsets you are reading or writing). For a fixed-size dimension, a rule of thumb is to choose a chunk value that divides evenly into the dimension size. Chunk values of less than 100 for variables with dimensions greater than 1000 can result in bloated file sizes.
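The rule of thumb above can be checked before the variable is defined. In this sketch, the dimension sizes, chunk values, and file and variable names are assumptions made for the illustration.

```idl
; Hypothetical dimension sizes and chunk values.
dims = [1200L, 3000L]
chunk = [300L, 500L]
; Verify that each chunk value divides its dimension evenly.
IF TOTAL(dims MOD chunk) NE 0 THEN $
   MESSAGE, 'Chunk values do not divide the dimensions evenly.'
id = NCDF_CREATE(FILEPATH('vardef_chunk_test.nc', /TMP), $
   /NETCDF4_FORMAT, /CLOBBER)
xid = NCDF_DIMDEF(id, 'x', dims[0])
yid = NCDF_DIMDEF(id, 'y', dims[1])
vid = NCDF_VARDEF(id, 'grid', [xid, yid], /FLOAT, $
   CHUNK_DIMENSIONS=chunk)
NCDF_CONTROL, id, /ENDEF ; Leave define mode.
NCDF_CLOSE, id
```

With these values, the variable is divided into 4 x 6 = 24 chunks of 300 x 500 elements each.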

When creating a NetCDF-4 variable, the default chunk value is the full dimension size for limited dimensions, and 1 for unlimited dimensions. However, these values are inefficient for high-performance applications, or for very large (> 2 GB) variables.

You can improve chunking performance by ensuring the cache is large enough to hold at least one chunk. If your machine has a large amount of RAM, you can increase your chunk and cache sizes so that your program reads and writes data in large blocks.

Additional Examples

Example 2

This code sample shows one way to chunk a variable in a NetCDF-4 file. There may be a delay while the sample writes and reads the data.

filename = FILEPATH('nc_chunking_test.nc', /TMP)

; Create the NetCDF-4 file. If the file already exists, destroy

; it and create a new one.

id = NCDF_CREATE(filename, /NETCDF4_FORMAT, /CLOBBER)

; Create some sample data. Think of this as 2,000

; 10 X 10 squares

inData = BINDGEN(100, 2000)

; Define the file variable dimensions. We don't know the

; final size of dimy (it could be variable-size weather data

; that we constantly add to, for example), so we are defining

; this dimension as UNLIMITED

dimx = NCDF_DIMDEF(id,'dimx', 100)

dimy = NCDF_DIMDEF(id,'dimy', /UNLIMITED)

; Define a variable and its dimensions, and add it to the file.

; One way to interpret the chunking setting is that 10 slices

; of the 10 x 10 squares will be read or written at each

; interval

v1 = NCDF_VARDEF(id, 'v1', [dimx, dimy], $

   /BYTE, CHUNK_DIMENSIONS=[100, 10])

; Take the file out of define mode, and into data mode

NCDF_CONTROL, id, /ENDEF

; Populate the variable with data

NCDF_VARPUT, id, 'v1', inData

; Close the file

NCDF_CLOSE, id

; Open the file back up in read-only mode

oid = NCDF_OPEN(filename, /NOWRITE)

; Get the file variable ID, read the file variable, and close

; the file

vid = NCDF_VARID(oid,'v1')

NCDF_VARGET, oid, vid, outData

NCDF_CLOSE, oid

; Compare the input and output data

IF TOTAL(inData NE outData) EQ 0 THEN PRINT, $

   'The data read from file is correct.'

Version History

Pre 4.0	Introduced
8.0	Added CHUNK_DIMENSIONS, CONTIGUOUS, GZIP, and SHUFFLE keywords
8.1	Added STRING, INT64, UBYTE, UINT64, ULONG, and USHORT keywords
9.1	Documented the INT64 keyword; added a note about the INT64 and UINT64 keywords