Corrupted Data Recording

S-Pol IHOP_2002, Oklahoma Panhandle
May/Jun 2002

During IHOP_2002, S-Pol experienced intermittent drop-outs when writing data to tape. The realtime data showed no drop-outs, but the realtime data stream did not include all variables required for post-analysis of "full polarimetric" data.

The data drop-outs on tape represent a significant problem. The issue was noted in the field, but its severity was considerably underestimated, and no cause could be found during the field phase of the experiment.



S-Pol recorded dual data streams by independent recording systems during the field project. A method has been developed to merge the two sets of tapes created, producing an almost complete data set. Data loss for the entire project is estimated at less than 1%.
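
The merge method itself is not described here. As an illustration only, the general idea of filling gaps in one recording from the other can be sketched as below. Everything in this sketch is an assumption made for the example: the `Beam` record, the use of a beam timestamp as the matching key, and the `valid` flag are hypothetical and do not reflect the actual S-Pol tape format or the procedure ATD used.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Beam:
    """One radar beam as read back from a tape (hypothetical layout)."""
    time_ms: int        # beam timestamp; assumed to match across both tapes
    azimuth_deg: float  # antenna azimuth for the beam
    valid: bool         # False when the beam was corrupted on this tape


def merge_streams(tape_a: Iterable[Beam], tape_b: Iterable[Beam]) -> List[Beam]:
    """Combine two recordings of the same scan into one time-ordered list.

    For each timestamp, keep the first copy seen, unless that copy is
    corrupted and a valid copy is available from the other tape.
    """
    best: Dict[int, Beam] = {}
    for beam in (*tape_a, *tape_b):
        kept = best.get(beam.time_ms)
        if kept is None or (beam.valid and not kept.valid):
            best[beam.time_ms] = beam
    return [best[t] for t in sorted(best)]
```

With this approach a beam is lost only when it is missing or corrupted on both tapes at once, which is consistent with a small residual loss after merging.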

Tape drop-outs had diurnal dependence, and were also dependent upon the sweep number within a volume scan. Sweeps at the beginning of a volume suffered the most impact (these are the low-level sweeps). Volumes collected during the over-night shift showed significantly higher data loss than those collected during the day when scientists were present.

Post-project review of the data, the logs, and staff recollections has identified the cause of the recording problem: both recording systems were sharing the same overflow disk cache file for tape buffering. The buffer was highly dynamic and used only in short bursts, but it was corrupted often enough to create data drop-outs. The disk cache tended to be used:

Examples of data drop-outs due to the tape cache problem are provided.

There are other artifacts in the data that can be confused with the
cache problem, but are due to other factors.  One specific instance
has to do with the initialization of the PIRAQ Radar Signal Processor
upon starting a scan volume.  The PIRAQ effect appears as 2 to 5 noisy
beams at the start of a volume, which for IHOP has generally been the
0.5-degree sweep.

Also easy to confuse with the cache dropout problem are the following:

	* Sunset and sunrise

	* A small black line appearing in solo displays at 45, 135,
          225, or 315 degrees.  This is an effect due to radar beam
          spacing and solo image pixel rendering

	* Second trip echoes (these can often be quite narrow)

There are other data artifacts that have caused small amounts of S-Pol
data loss.  The largest of these is occasional drop-out of the S-Pol
transmitter; this occurred very infrequently.

A Case Study of Lost Data

To give an idea of the amount of data loss due to the tape cache
problem, a full 24-hour set of S-Pol data was reviewed.  The data
reviewed were merged from the two versions of the S-Pol tapes.

The 24-hour period was 4-Jun-2002 1200 UTC through 5-Jun-2002 1200 UTC.
There were 2600 sweeps over this time period.

This was considered a day with little data loss due to tape writing
problems.  At this point in the review, it is difficult to tell
whether it is a typical day for data loss.  We have compiled summary
indicators for some days that indirectly show data loss, and expect
that about half of all days will show approximately the same amount
of data loss, and that one-quarter of all days will show twice as
much data loss or more.  We are unsure of the remaining one-quarter
of days (all of these are very rough estimates).

Of the 2600 sweeps on 4-5 June:

	 15 Sweeps showed outright missing sectors, with size ranging
		  from 3 to 90 degrees
		  (1 of 90 deg, 3 of ~10 deg, and 11 of < 5 deg)

	131 Sweeps included at least one corrupted radar beam.  The 
		  breakdown is as follows:

		69 sweeps w/  1 bad beam
		21	  w/  2 bad beams
		 6	  w/  3
		13	  w/  4
		13	  w/  5 or 6
		 3	  w/  7 to 9
		 6	  w/ 10 to 20
		 0	  w/ > 20

(Not included in the estimates of bad beams are those associated with
PIRAQ start-of-volume problems; about one-third of all 0.5-degree
sweeps may be affected, with 2 to 5 bad initial beams.)
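
To put these counts in proportion, the short calculation below converts them to percentages of the 2600 sweeps. The counts are taken directly from the tallies above; the variable and function names are only illustrative. Whether any of the 15 missing-sector sweeps are also counted among the 131 bad-beam sweeps is not stated, so the two categories are reported separately rather than summed.

```python
# Counts from the 4-5 June 2002 case study (24 hours of merged data).
TOTAL_SWEEPS = 2600
MISSING_SECTOR_SWEEPS = 15

# Number of sweeps, keyed by how many corrupted beams each contained.
BAD_BEAM_SWEEPS = {
    "1": 69,
    "2": 21,
    "3": 6,
    "4": 13,
    "5-6": 13,
    "7-9": 3,
    "10-20": 6,
    ">20": 0,
}


def percent_of_total(count: int, total: int = TOTAL_SWEEPS) -> float:
    """Return count as a percentage of total."""
    return 100.0 * count / total


# Sanity check: the breakdown should add up to the quoted 131 sweeps.
bad_beam_total = sum(BAD_BEAM_SWEEPS.values())

# Sweeps with only one or two bad beams, i.e. the mildest cases.
one_or_two = BAD_BEAM_SWEEPS["1"] + BAD_BEAM_SWEEPS["2"]

print(f"Sweeps with missing sectors: {percent_of_total(MISSING_SECTOR_SWEEPS):.2f}%")
print(f"Sweeps with bad beams:       {percent_of_total(bad_beam_total):.2f}%")
print(f"Of those, 1 or 2 bad beams:  {100.0 * one_or_two / bad_beam_total:.1f}%")
```

These are fractions of sweeps touched by a problem, not fractions of data lost: most affected sweeps lost only one or two beams out of a full sweep.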

Realtime Scans preserved for June 2002

During field operations, sweepfiles were written to disk in realtime, bypassing the tape recording processes. These sweepfiles were not corrupted by the tape writing problem. Unfortunately, it was necessary to purge the realtime sweepfiles on an ongoing basis to free up disk space for more current operations. In addition, the realtime sweepfiles did not contain the full suite of polarimetric variables, which limits their utility for certain applications.

Nevertheless, the realtime sweepfiles have been preserved for the majority of the month of June. Zdr bias has not been removed from these sweeps, but no other adjustments to the realtime dataset were required, so the realtime data are considered high-quality. These sweeps are available from ATD by special request only. The realtime sweeps (when they exist) might represent the best source of information for re-computing the Fabry index of refraction parameters. Users should exercise caution when mixing post-processed and realtime sweeps, to avoid confusion over Zdr bias corrections.

Consideration is being given to compiling a subset of the realtime sweeps that includes only the 0.0° tilt, as an aid in re-computing index of refraction.

Note that there are occasions where the realtime sweepfiles are missing a beam or two compared to the post-processed merged sweepfiles. This is an unexpected occurrence, and may be due to a tendency for some beams to be introduced out of order during processing (a small tendency for this has been noted during past projects).

Impact on data delivery

Most of the non-rebuilt data are already available on-line from NCAR. ATD has started the process of replacing the incomplete scans with merged scans. Replacement of all scans is expected to be complete by early December, 2002.

See Status Report for details.

--- Bob Rilling --- / NCAR Atmospheric Technology Division
Created: Fri Nov 1 09:17:21 MST 2002
Last modified: