Managing the USB Disks
and Recording at S-Pol

Gan, Maldives
Sep-Dec 2011
Jan 2012



This is an incomplete procedure, as not all details have been developed.

The machine spol-dm has been configured for recording S-PolKa primary data, and for recording a local copy of SMART-R data. Two removable USB disks should be attached to spol-dm at all times. spol-dm must be operational at all times, so it is best not to use that machine for other activities, or at least not for any activity that could force or require a reboot. (Scientists, in particular, should be discouraged from using spol-dm.)

sci3 is configured the same as spol-dm and serves as a fail-over machine for data recording. sci3 also has two USB disks attached, and those disks are recorded at the same time as those on spol-dm. However, the sci3 disks will be cleaned off if the spol-dm disks are verified as "good". The sci3 disks get scrubbed and then transferred to spol-dm (only after the spol-dm disks are safely removed).

How many days on a disk? This is somewhat uncertain. Quite a few of the data sets have data thresholded or removed under specific circumstances. For instance, reflectivity with SNR below, say, -3 dB might be removed and blank-filled. This allows file compression to make the data sets smaller, so the data sets will be smaller on a clear day than on a day with widespread precip. Still, on average, we expect to get 5 to 7 days out of a disk.
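
The effect of blank-filling on file size can be illustrated with a small sketch. This is not the S-Pol processing code: the fill value, the gate count, the random test data, and the use of zlib are all assumptions made for illustration only. It blank-fills gates whose SNR is below -3 dB, then compares compressed sizes for a mostly-clear case and a mostly-precip case.

```python
import random
import zlib

FILL = 0               # hypothetical blank-fill value
SNR_THRESHOLD = -3.0   # dB, the example threshold from the text


def blank_fill(values, snr_db, threshold=SNR_THRESHOLD, fill=FILL):
    """Replace every gate whose SNR is below the threshold with the fill value."""
    return bytes(v if s >= threshold else fill for v, s in zip(values, snr_db))


def compressed_size(data):
    """Size in bytes of the zlib-compressed data."""
    return len(zlib.compress(data))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n_gates = 20000

# Synthetic, noise-like "reflectivity" bytes (made-up data, not real radar data).
reflectivity = bytes(rng.randrange(1, 256) for _ in range(n_gates))

# Clear day: most gates have low SNR. Precip day: most gates have high SNR.
clear_snr = [rng.uniform(-10.0, 0.0) for _ in range(n_gates)]
precip_snr = [rng.uniform(-5.0, 30.0) for _ in range(n_gates)]

raw_size = compressed_size(reflectivity)
clear_size = compressed_size(blank_fill(reflectivity, clear_snr))
precip_size = compressed_size(blank_fill(reflectivity, precip_snr))

print("no thresholding:", raw_size, "bytes")
print("clear day:      ", clear_size, "bytes")
print("precip day:     ", precip_size, "bytes")
```

The clear-day case compresses to a noticeably smaller size because long runs of the fill value are cheap to encode, which is why the number of days per disk depends on the weather.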

Monitoring

Handling

A Note on logging into Sci3

Assuming you are working on the spol-dm terminal, you can just open another window, then "shell" into sci3. So:

Changing disks

When nagios says the spol-dm disks are nearly full (80% to 90% ?), do the following:
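
As a rough illustration of the fullness threshold only (this is not the nagios check used at S-Pol, and it is not part of the disk-change steps), a manual check could look like the sketch below. The 80% warning level is taken from the uncertain range quoted above, and the mount point in the usage example is a placeholder; the real USB disk mount points on spol-dm are not given in this document.

```python
import shutil

WARN_PERCENT = 80.0  # assumed warning level, from the "80% to 90% ?" range


def percent_used(mount_point):
    """Return the percentage of the file system at mount_point that is in use."""
    usage = shutil.disk_usage(mount_point)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def disks_needing_swap(mount_points, warn_percent=WARN_PERCENT):
    """Return the mount points that are at or above the warning level."""
    return [m for m in mount_points if percent_used(m) >= warn_percent]


if __name__ == "__main__":
    # "." is a placeholder; substitute the actual USB disk mount points.
    for mount in ["."]:
        print(f"{mount}: {percent_used(mount):.1f}% used")
    print("needs changing:", disks_needing_swap(["."]))
```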

Handling the backup disks from sci3

A note about general disk reliability

It is expected that the USB Passport Drives could have a high failure rate. JVA (?) had a discussion with a Western Digital engineer. The engineer stated that the error rates are phenomenally high on these small, consumer USB disks. However, most of these errors are transient, and the drives are able to fix them in software (using pretty fancy parity/error checking). When these disks fail hard, they fail completely and quickly.

We purchased 72 of the Passport drives. Each was slow-formatted, completely written with test bits, then purged. Three did not pass the tests and another two were very slow to write, so those were not brought to Gan. Web reviews complain about failures, but an actual failure rate cannot be determined. We simply expect some of the disks to fail.

See here for formatting of USB disks for use by the archival system.

Failure modes might include:

If any of these happen, put a note on the disk, and use another. If the disk should have been written, replace the disk with one from sci3.


--- Bob Rilling --- / NCAR Earth Observing Laboratory
Created: Sun Oct 2 09:57:31 GMT 2011
Last modified: