Archive

There are two forms of archiving. The first is the Zebra Archiver, which archives only Zebra datastore files into tar files on the archive disk. The second is the migrate function in the /iss/etc/init.d/archive module. Both forms of archiving are controlled and configured in that archive module.

When enabled, the archive module starts the Zebra Archiver and also schedules itself with crontab to run at 15 minutes past every hour. The Archiver runs in the background, waking every 2 hours to write tar files of recent Zebra data which have not yet been archived. (Actually, it archives all data files except the most recent, in case it is still open.) The crontab job runs a series of migrate calls on several data directories, like so:

	migrate $ISS_DSDIR/tklog $platdir/tklog 500
	migrate $ISS_DSDIR/prof915 $platdir/prof915 14
	migrate $ISS_DSDIR/class $platdir/class 30
	migrate $ISS_DSDIR/mapr $platdir/mapr 14
	migrate $ISS_DSDIR/sodar $platdir/sodar 14
	migrate $ISS_DSDIR/soap_plots $platdir/soap_plots 14
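For reference, the hourly scheduling described above would correspond to a crontab entry along these lines. The module path is the one given above, but the `migrate` argument is an assumption; check the archive module for the entry it actually installs:

```
15 * * * * /iss/etc/init.d/archive migrate
```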

The migrate method maintains a simple timestamp database under the /iss/ds/migrated directory. For each file migrate copies to the archive disk (/jaz), it creates an entry under the database directory whose modification time records when the file was last successfully copied. On each subsequent call, migrate copies only those files which are new or whose timestamps are newer than the corresponding database entry. The last parameter to migrate is the number of days of data to keep under the datastore directory; the idea is that larger data streams must be aged off more quickly to keep the local datastore partition from filling.
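The copy-if-newer bookkeeping can be sketched as follows. This is an illustration of the mechanism described above, not the real script: `migrate_file` is a hypothetical helper, the real migrate also ages off files older than its day-count parameter, and temporary directories stand in here for /iss/ds/migrated, the datastore directory, and the archive directory under /jaz.

```shell
#!/bin/sh
# Sketch of migrate's timestamp-database logic (hypothetical helper).
MIGDB=$(mktemp -d)   # stands in for /iss/ds/migrated
SRC=$(mktemp -d)     # stands in for a datastore directory, e.g. $ISS_DSDIR/tklog
DEST=$(mktemp -d)    # stands in for the archive directory under /jaz

migrate_file() {
    f=$1
    stamp=$MIGDB/$(basename "$f")
    # Copy only if the file is new or newer than its database entry,
    # then stamp the entry with the file's modification time.
    if [ ! -e "$stamp" ] || [ "$f" -nt "$stamp" ]; then
        cp "$f" "$DEST"/ && touch -r "$f" "$stamp"
    fi
}

echo data > "$SRC/a.dat"
migrate_file "$SRC/a.dat"   # first call copies and records the timestamp
migrate_file "$SRC/a.dat"   # second call is a no-op; the entry is current
```

Stamping the database entry with the data file's own modification time (rather than the copy time) is what makes the "newer than" comparison meaningful on later calls.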

It's arguable that the Zebra Archiver could be abandoned in favor of using only migrate. At the very least, add the Zebra datastore directories to the above list to duplicate the Archiver data stream. One issue that needs careful attention, however, is data scrubbing. Right now the Zebra datastore daemon removes some of its data files once the disk fills beyond a certain capacity. (See /iss/configs/iss/ds.config.) There needs to be some cooperation between the datastore daemon and the archiving method to ensure that data which have not yet been archived never get scrubbed, in case the archiving is failing for some reason.
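One way that cooperation could work: before deleting a file, the scrubber consults migrate's timestamp database and only proceeds if the file has been archived since it was last modified. A minimal sketch, assuming the database layout described above; `safe_to_scrub` is an illustrative name, not part of the existing scripts:

```shell
#!/bin/sh
# Hypothetical guard for the datastore scrubber.
MIGDB=${MIGDB:-/iss/ds/migrated}

safe_to_scrub() {
    f=$1
    stamp=$MIGDB/$(basename "$f")
    # A database entry must exist, and the data file must not be
    # newer than it (i.e. not modified since it was last archived).
    [ -e "$stamp" ] && ! [ "$f" -nt "$stamp" ]
}
```

If archiving stalls, no entries get refreshed, `safe_to_scrub` starts failing, and unarchived data survives on the local disk until the archive catches up.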

Replacing the Jaz Disk with RAID or Online Storage

All of the archiving processes expect /jaz to be the top directory of the archive disk. Moreover, the processes check for the directories /jaz/platformArchive and /jaz/dsArchive to be sure the disk has been mounted successfully. So to replace a Jaz disk archive with an online disk, such as a RAID, just link /jaz to some directory on the online disk, then create the archive directories. With that, archiving should work just like it does for the Jaz disks.
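The steps above amount to a symlink and two mkdirs. In this sketch a scratch prefix lets it run anywhere; on the real system the paths would be the actual RAID mount point (here assumed to be /raid0) and /jaz itself:

```shell
#!/bin/sh
# Sketch of pointing /jaz at an online disk.  ROOT is a scratch prefix
# so the sketch can run unprivileged; drop it on the real system.
ROOT=$(mktemp -d)
mkdir -p "$ROOT/raid0/iss_archive"            # a directory on the online disk
ln -s "$ROOT/raid0/iss_archive" "$ROOT/jaz"   # /jaz now points at the RAID
# The archiving processes check for these before trusting the mount:
mkdir -p "$ROOT/jaz/platformArchive" "$ROOT/jaz/dsArchive"
```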

Ordinarily the Jaz disk filesystem is automounted to increase the chances of the Jaz filesystem surviving unexpected power losses or system crashes. This could probably be done for online disks as well, although presumably a RAID filesystem uses journaling or otherwise protects itself from corruption.