Developing High-Quality Field Program Sounding Datasets

1. Introduction

For nearly half a century, field experiments have been conducted from the tropics to the polar regions involving observations of atmosphere, ocean and land processes over selected locations of interest. The figure below shows a map depicting the shape, size, and location of many of these sonde networks.

Our experience working with sounding data from several field programs (TOGA COARE, SCSMEX, NAME, TiMREX) has led to the design of a general procedure for creating easy-to-use, bias-reduced, quality-controlled sonde datasets. We encourage those involved with future field programs with upper-air sonde networks to make use of this procedure and the freely available software described herein. The steps in this general procedure are outlined in the diagram below and are described in the following sections. Software tools for implementing this procedure are provided in Section 4. Examples of all suggested file formats are given in Section 5. Please refer to the BAMS article (Ciesielski et al., 2011) for additional details.

Developing a high-quality sounding dataset in a field program requires both quality assurance and quality control of the data. Quality assurance involves the preparations made prior to taking observations that ensure the measurements are properly taken. With regard to sounding data, these preparations should include the following:

  1. To the extent possible, the launch site should be representative of the larger-scale environment (e.g., avoid launching on hot asphalt surfaces or sheltered areas between large buildings).
  2. Precisely determine geolocation of launch site (e.g., starting elevation of sonde launch impacts geopotential height computation at all levels).
  3. Since accurate surface measurements can help identify sonde biases, use surface data from collocated (both vertically and horizontally), well-calibrated surface instruments.
  4. Use newly calibrated, non-contaminated sondes. Older sondes (> 1 yr old) tend to have larger biases.
  5. For Vaisala RS92 sondes, keep fresh desiccant in ground check chamber. Replace desiccant if RH > 1% (Miloshevich et al. 2009).
  6. Adequately ventilate sonde prior to launch (to equilibrate sonde sensors to ambient conditions).
  7. For proper ventilation of sensors during ascent, use enough gas in balloon to achieve ascent rate of ~4 m/s.
  8. Collocate other instruments to provide independent, redundant measurements to identify biases, such as using a ground-based GPS system to obtain independent estimates of PW (Wang and Zhang 2008).
  9. If multiple sonde types and ground station systems will be used, sonde intercomparison launches made prior to and during the course of the experiment are helpful for identifying platform biases.
  10. Sonde manufacturers and instrument developers should be encouraged to record engineering and housekeeping data, as well as other metadata, in the raw data files. Such information can be quite useful for monitoring instrument performance, investigating bad data and potentially correcting them.
  11. Operator judgment should be exercised with the option of postponing the launch when releasing sondes into strong convective storms (for safety reasons, balloon icing issues and poor representation of large scale environment).

Furthermore, we recommend that a naming convention similar to that shown in the list below be adopted to help creators and users of these datasets specify precisely what level of processed data they are using in their research.

Suggested Dataset Naming Convention

Having L4u and L4c available helps one to assess the impacts of the corrections applied in creating the L3 dataset.

2. The Procedure

The first step in processing field program sonde data sets should be to convert the soundings into a single, easily utilized format. We recommend the D-file format used by NCAR EOL and refer to this as Level 1 processing. An example of the D-file format is contained in sample_dfile.txt along with a Fortran program for converting a Vaisala TSV file sample_tsv.txt into a D-file format, tsv_2_Dfile.f.

These Level 1 efforts should be followed by Level 2 (L2) processing, in which the high-vertical-resolution sounding data are passed through a series of automated QC algorithms to systematically detect bad values. ASPEN (Atmospheric Sounding Processing ENvironment), an example of such a tool developed by NCAR EOL, can be run in a Windows PC or UNIX environment. In addition to removing egregious data based on several objective QC checks, ASPEN filters the winds, computes geopotential height, smooths pressure and writes out the processed, quality-controlled sondes in one of many convenient formats. If using ASPEN, we recommend writing the files in EOL format (sample_eol.txt).

In Level 3 processing, sonde biases are identified and reduced if possible. While sonde manufacturers are continually striving to improve the accuracy of sonde sensors, water vapor retrievals continue to be the most problematic. Several methods for identifying and reducing humidity biases are discussed in Ciesielski et al. (2011, their Section 3). Unfortunately, no generalized software is available to handle this problem, which is often quite specific to a particular field program.

In the final processing stage (Level 4 or L4), a "user-friendly" version of the sonde dataset is created. The L4 datasets have values at uniform vertical resolution and quality control flags on each variable to provide users with a measure of the data's reliability. Several Fortran programs are provided below for creating the L4 dataset. An example of an easy-to-read ASCII format containing interpolated data with quality control flags is contained in upa5i_99810. Details for producing L4 files are given in the following section.

3. Details for creating L4 files

The first step in creating an L4 dataset is to interpolate the high-vertical-resolution sonde thermodynamic data (z, T, Td) onto uniform vertical levels (either pressure or height). Uniform pressure coordinates have the advantage of dividing the atmosphere into layers of equal mass, while uniform height coordinates provide much higher resolution at upper levels. In the examples of xsnd shown here, the sonde data have been interpolated to a uniform 5-hPa resolution. Interpolation should only be done if the vertical spacing between data points is 50 hPa or less. Typically, data spacing in the hi-res sondes is 1-2 seconds (or 5-10 meters, i.e., much less than 1 hPa).

While sonde system software and L2 software like ASPEN provide vertical smoothing of the winds, some residual high-frequency noise is generally still present in sonde winds (primarily from the oscillation of the sonde package below the balloon). Thus, averaging winds in uniform layers, which produces a slight smoothing effect, is preferred to interpolating them. L4 winds are computed by averaging hi-res winds in uniform layers centered on the level in question, and are computed only if hi-res wind data are present in the layer under consideration.

The Fortran program create_upa_prs_L4.f processes a series of high-resolution sonde files into a single uniform pressure resolution L4 file. Alternatively, the Fortran program create_upa_hgt_L4.f processes a series of high-resolution sonde files into a single uniform height resolution L4 file. The input to these programs should be a chronological list of high-resolution sonde file names (called "files_hires").
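The interpolation and layer-averaging steps can be sketched in Python as follows. This is only an illustration under stated assumptions and is not the logic of create_upa_prs_L4.f: the function names, the choice of linear interpolation in ln(p), the use of None for missing values, and the requirement that pressures decrease strictly from the surface upward are all hypothetical choices made for the example.

```python
import math

def interp_to_levels(p_obs, x_obs, levels, max_gap_hpa=50.0):
    """Interpolate a thermodynamic variable (z, T or Td) onto target levels.

    p_obs must decrease strictly with index (surface first) and x_obs must
    contain no missing values. Interpolation is linear in ln(p), which is an
    assumption of this sketch. A level is returned as None when it lies
    outside the data or when the bracketing observations are more than
    max_gap_hpa apart.
    """
    out = []
    for lev in levels:
        value = None
        for i in range(len(p_obs) - 1):
            p_lo, p_hi = p_obs[i], p_obs[i + 1]
            if p_lo >= lev >= p_hi and p_lo - p_hi <= max_gap_hpa:
                w = math.log(p_lo / lev) / math.log(p_lo / p_hi)
                value = x_obs[i] + w * (x_obs[i + 1] - x_obs[i])
                break
        out.append(value)
    return out

def layer_mean(p_obs, x_obs, levels, dp=5.0):
    """Average a wind component over layers of depth dp centered on each level.

    Returns None for a level when no hi-res data fall inside its layer.
    """
    out = []
    for lev in levels:
        vals = [x for p, x in zip(p_obs, x_obs)
                if x is not None and lev - dp / 2 <= p < lev + dp / 2]
        out.append(sum(vals) / len(vals) if vals else None)
    return out
```

In this sketch the winds would be averaged as u and v components separately (calling layer_mean once per component), since averaging speed and direction directly can give misleading results when the direction wraps through north.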

The next step in creating the L4 dataset is to pass the uniform-resolution data through a series of objective QC tests following the approach of Loehrer et al. (1996) and assign quality control flags to each data value at each level. This can be done by using the Fortran programs create_qcstats_prs_L4.f and assign_qcf_prs_L4.f for uniform pressure files, or create_qcstats_hgt_L4.f and assign_qcf_hgt_L4.f for uniform height files.

Table 1. Convention adopted for the QC flags in L4 sonde files.

Flag value   Meaning
1            parameter good
2            parameter "objectively" questionable
3            parameter "visually" questionable
4            parameter "objectively" bad
5            parameter "visually" bad
6            parameter interpolated
7            parameter estimated
8            parameter unchecked
9            parameter missing

Objectively determined quality flags (flag value = 2 or 4) come from an objective test, while subjectively determined quality flags (flag value = 3 or 5) come from visual inspection of the sonde. There are two main types of objective tests: gross limit checks and vertical consistency checks.
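For software that reads L4 files, the flag convention of Table 1 could be captured as a small enumeration. The sketch below is a hypothetical helper and not part of the distributed Fortran tools; the class name, member names and the flag_source function are invented for illustration.

```python
from enum import IntEnum

class QCFlag(IntEnum):
    """QC flag values from Table 1 (member names are illustrative)."""
    GOOD = 1
    QUESTIONABLE_OBJECTIVE = 2
    QUESTIONABLE_VISUAL = 3
    BAD_OBJECTIVE = 4
    BAD_VISUAL = 5
    INTERPOLATED = 6
    ESTIMATED = 7
    UNCHECKED = 8
    MISSING = 9

def flag_source(flag):
    """Report whether a flag came from an objective test or visual inspection."""
    flag = QCFlag(flag)
    if flag in (QCFlag.QUESTIONABLE_OBJECTIVE, QCFlag.BAD_OBJECTIVE):
        return "objective"
    if flag in (QCFlag.QUESTIONABLE_VISUAL, QCFlag.BAD_VISUAL):
        return "visual"
    return "other"
```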

Gross limit checks

At each site, the temporal mean (over all sondes at that site) and standard deviation (σ) of each variable are computed at each 5-hPa vertical level. Outliers more than 4σ from the mean are removed and the statistics (mean and standard deviation) are recomputed. The second-pass statistics are then used to assign QC flags objectively: a value is flagged questionable (QC flag = 2) if it lies more than 3σ from the temporal mean, or bad (QC flag = 4) if it lies more than 6σ from the mean.
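A minimal Python sketch of this two-pass check, for one variable at one level and one site, is given below. It is not the code in create_qcstats_prs_L4.f or assign_qcf_prs_L4.f; the function name, the use of the sample standard deviation, the use of None for missing values, and the decision to leave values unchecked when fewer than two are present are assumptions made for illustration.

```python
import statistics

GOOD, QUESTIONABLE, BAD, UNCHECKED, MISSING = 1, 2, 4, 8, 9

def gross_limit_flags(values, trim_sigma=4.0, quest_sigma=3.0, bad_sigma=6.0):
    """Assign gross-limit QC flags to one variable at one level.

    values holds one entry per sonde at the site (None = missing).
    """
    present = [v for v in values if v is not None]
    if len(present) < 2:
        # Too few values to form statistics (assumed handling).
        return [MISSING if v is None else UNCHECKED for v in values]

    # First pass: statistics over all values, then drop the 4-sigma outliers.
    mean1 = statistics.mean(present)
    sd1 = statistics.stdev(present)
    kept = [v for v in present if abs(v - mean1) <= trim_sigma * sd1]

    # Second pass: recompute the statistics without the outliers.
    mean2 = statistics.mean(kept)
    sd2 = statistics.stdev(kept)

    flags = []
    for v in values:
        if v is None:
            flags.append(MISSING)
        elif abs(v - mean2) > bad_sigma * sd2:
            flags.append(BAD)
        elif abs(v - mean2) > quest_sigma * sd2:
            flags.append(QUESTIONABLE)
        else:
            flags.append(GOOD)
    return flags
```

Note that the values removed in the first pass are excluded only from the statistics; they are still flagged against the second-pass mean and standard deviation.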

Vertical consistency checks

The following vertical consistency checks are applied.

Hydrostatic tests

Parameter                 Test (condition flagged)   Flag value   Level affected
Pressure (p)              dp > 0 as z increases      4            current
Geopotential height (z)   dz < 0 as p decreases      4            current
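The hydrostatic tests could be coded along the following lines. This sketch reads the tests as flagging a pressure that rises with height and a height that falls as pressure drops, which is an interpretation rather than a statement of what assign_qcf_prs_L4.f does. The function name is hypothetical, the profile is assumed to be ordered from the surface upward with no missing values, and level pairs with dp = 0 or dz = 0 are left unflagged.

```python
GOOD, BAD = 1, 4

def hydrostatic_flags(p, z):
    """Return (p_flags, z_flags) for a profile ordered from the surface upward."""
    n = len(p)
    p_flags = [GOOD] * n
    z_flags = [GOOD] * n
    for k in range(1, n):
        dp = p[k] - p[k - 1]
        dz = z[k] - z[k - 1]
        if dz > 0 and dp > 0:
            # Pressure increases although height increases.
            p_flags[k] = BAD
        if dp < 0 and dz < 0:
            # Height decreases although pressure decreases.
            z_flags[k] = BAD
    return p_flags, z_flags
```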

Lapse rate tests for strong superadiabatic layers

Parameter   Test (condition flagged)               Flag value   Level affected
T, Td       dT/dz < -15°C/km and dp > 5 hPa        2            current one and above
T, Td       dT/dz < -30°C/km and dp > 5 hPa        4            current one and above
T, Td       dT/dz < -100°C/km and dp < 5 hPa†      2            current one and above
T, Td       dT/dz < -200°C/km and dp < 5 hPa†      4            current one and above

† cases where dp < 5 hPa occur between the surface point and the next pressure level.
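A hedged Python sketch of the lapse rate tests follows. It works with the lapse rate, -dT/dz, so that a layer is flagged when temperature falls with height faster than the threshold (the superadiabatic sense of the test). Several details are assumptions of the sketch rather than documented behavior of the Fortran tools: the function name, treating a spacing of exactly 5 hPa as the regular-level case, applying the flag to the current level and the one directly above it, and returning a single flag list intended for both T and Td.

```python
GOOD, QUESTIONABLE, BAD = 1, 2, 4

def lapse_rate_flags(p, z, t):
    """Flag strong superadiabatic layers.

    p in hPa, z in m, t in deg C, ordered from the surface upward.
    """
    flags = [GOOD] * len(p)
    for k in range(len(p) - 1):
        dp = p[k] - p[k + 1]
        dz_km = (z[k + 1] - z[k]) / 1000.0
        if dz_km <= 0:
            continue  # left to the hydrostatic tests
        lapse = -(t[k + 1] - t[k]) / dz_km  # deg C per km, positive when T falls
        if dp >= 5.0:
            quest, bad = 15.0, 30.0    # regular levels
        else:
            quest, bad = 100.0, 200.0  # thin layer just above the surface
        if lapse > bad:
            flag = BAD
        elif lapse > quest:
            flag = QUESTIONABLE
        else:
            flag = GOOD
        # Keep the more severe flag if a level was already marked.
        for j in (k, k + 1):
            flags[j] = max(flags[j], flag)
    return flags
```

The much larger thresholds for the thin surface layer reflect the fact that a small temperature difference over a few meters produces a very large computed lapse rate.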

Dew point gradient at surface

Parameter   Test                                             Flag value   Level affected
Td          Td(sfc) - Td(1st point above surface) > 3°C      2            sfc and point above
Td          Td(sfc) - Td(1st point above surface) > 6°C      4            sfc and point above
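The surface dew point test reduces to a pair of threshold comparisons, as in this short Python sketch. The function name is hypothetical, and the returned flag is meant to be applied to Td at both the surface and the first point above it.

```python
GOOD, QUESTIONABLE, BAD = 1, 2, 4

def surface_dewpoint_flag(td_sfc, td_above):
    """Flag a sharp drop in dew point (deg C) between the surface and the next point."""
    diff = td_sfc - td_above
    if diff > 6.0:
        return BAD
    if diff > 3.0:
        return QUESTIONABLE
    return GOOD
```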

Using a visual editor to adjust QC flags

Following the objective QC flag assignment, each sounding should be visually inspected. While tedious, this processing stage is necessary to ensure a research-quality dataset, since subtle errors in the sonde data are difficult to identify with objective procedures. To facilitate this processing we have developed a software tool called xsnd (examine soundings), which allows one to visually examine vertical profiles of thermodynamic and wind variables up to 100 hPa. The xsnd tool was written in Tcl/Tk, an easy-to-learn scripting language that runs in a UNIX, Windows PC or Macintosh environment.

The version of xsnd we provide requires input data in a specific format created by running "create_upa_prs_L4.f" (for uniform pressure resolution) or "create_upa_hgt_L4.f" (for uniform height resolution). However, xsnd can be easily revised to accept other input formats as well. Although the template for extending xsnd to visualize data above 100 hPa is in place, doing so would require substantial effort and has not yet been done.

An example of how xsnd displays a sounding is shown in the figure below, with thermodynamic data in the left-hand panel in skew T-log p format and winds in the right-hand panel as vertical profiles of the wind components. Dot colors indicate data quality: white dots for good data, blue dots for questionable data and red dots for bad data. To move between adjacent sondes in a file, simply click on the previous and next buttons. See Ciesielski et al. (2011) for additional details.

Using xsnd allows one to easily "buddy check" the data, i.e., visually compare sondes that are adjacent in time and in close proximity in space to each other for continuity of features. One subjectively adjusts the QC flags by simply clicking on a data value. Good data are represented by white dots. Suspect data are marked as questionable by clicking once on a data point (dot color changes to blue), or as bad by clicking a second time on the data point (dot color changes to red). Examples and suggestions for using this tool are provided in Ciesielski et al. (2011).

The xsnd tool is executed by typing xsnd at the command prompt followed by the name of the file to be processed (e.g., > xsnd upapi_99810). Ensure that xsnd has executable permission (e.g., in UNIX, > chmod a+x xsnd).

Note: If a sonde sensor is suspect, for whatever reason, its data are suspect. There may be the appearance of reasonable data, which compare well with other observations, but as long as something triggered the statement that a sonde sensor is suspect, even reasonable looking data may be incorrect. Such data should be used only with great caution since there is usually no way to tell if the data are correct, which is why the term "suspect" is used.

4. Links to All Software Tools

Disclaimer: These software programs have been tested with a variety of input variables. However, if a combination of inputs causes a program to crash or give unreasonable results, please report the problem to the contact person listed below.

5. Examples of Various Sounding Data File Formats Discussed Above


Ciesielski, P. E., P. H. Haertel, R. H. Johnson, J. Wang, and S. Loehrer, 2011: Developing High-Quality Field Program Sounding Datasets. Submitted to BAMS. (link to this article will appear after paper is accepted for publication).

Loehrer, S. M., T. A. Edmands, and J. A. Moore, 1996: TOGA COARE upper-air sounding data archive: development and quality control procedures. Bull. Amer. Meteor. Soc., 77, 2651-2671.

Miloshevich, L. M., H. Vömel, D. N. Whitman, and T. Leblanc, 2009: Accuracy assessment and correction of Vaisala RS92 radiosonde water vapor measurements. J. Geophys. Res., 114, D11305, doi:10.1029/2008JD011565.

Wang, J., and L. Zhang, 2008: Systematic errors in global radiosonde precipitable water data from comparisons with ground-based GPS measurements. J. Climate, 21, 2218-2238.

Contact person: Paul Ciesielski: