12. COMPILATION OF DATA SETS

The intent of GCIP researchers to rely as much as possible on existing data centers as the archive location of GCIP data means that data sets will be geographically distributed among these data centers. The GCIP-DMSS is compiling a centralized set of information on the data sets. In some cases, this set consists of a directory and inventory of the data set, and in other cases it will consist of only directory information with the inventory information available from the data center where the data set is stored.

12.1 Compiled Data Sets

There is an ongoing need to compile data sets for purposes such as publishing on CD-ROMs or for specific periods such as the Enhanced Seasonal Observing Periods. The compiled data sets are any GCIP data compiled for a GCIP user or set of users in such a way as to facilitate ease of accessing and using the data. For purposes of organizing the data compilation activity, three different types of compiled data sets are recognized: standard, custom, and as-requested.

A standard data set is one with specifications that are agreed to before the data collection period starts so that standing orders can be provided to the data centers. Agreement on the specifications will be reached at the project level on a year-by-year basis. Funds will be identified and committed by the Project sponsors for each standard data set at the time the specifications agreement is formalized. The primary purpose of the standard data sets is to give wide distribution, especially internationally, to specific GCIP data to encourage analysis, research, and modeling studies. The current plans for compiling GCIP standard data sets are summarized in Figure 12-1. Further details about each of the standard data sets are given in the remainder of this section. A summary of the GCIP data sets compiled to date is given in Appendix D.


[datasets]

Figure 12-1 Compiled and Planned Standard Data Sets for GCIP Research.


A custom data set is one that is either distributed or compiled at a central location and made easily accessible for a group research effort. Applications of custom data sets include validation and/or comparison of algorithms, energy and water budget studies, and model evaluation studies. The primary purpose of custom data sets is to facilitate "group" research efforts on GCIP-relevant topics. The group requesting the data set will agree to the specifications for the custom data sets. Requests will be submitted to the GCIP office for funding the preparation of the custom data set. Funds will be identified and committed by the Project for each custom data set at the time the request is approved.

The primary purpose of the as-requested data set is to enable any user to order a data set with individual specifications from any of the individual data sets listed in the GCIP master catalog or data set guides. The GCIP-DMSS will help the user compile information about data availability so that data sets can be ordered to specification. Incremental costs for compiling and distributing an as-requested data set will in most cases be borne by the user making the request.

12.2 EOP Data Collection Plans for Continental Scale Areas (CSAs)

The data to be collected for the complete CSA during each year of the EOP are listed in Table 12-1 for In-Situ data, Table 12-2 for Model Output data, and Table 12-3 for Satellite Remote Sensing data. Additional datasets may be added as required.


Table 12-1.  In-Situ Data Sets for CSA During the EOP
________________________________________________________________________________________________
DATA TYPE                                                                      DATA AVAILABILITY
------------------------------------------------------------------------------------------------
                           Surface                                              Module  Center
------------------------------------------------------------------------------------------------
EOP Hourly Surface Composite                                                       X     JOSS
EOP Hourly Precipitation Composite                                                 X     JOSS
EOP Daily Precipitation Composite                                                  X     JOSS
1-hr data from the ASOS Network (both commissioned and non-commissioned sites)     X     JOSS
1-hr data from SAO Stations (NWS and FAA)                                                NCDC
1-hr data from NOAA Wind Profiler Demonstration Network (WPDN) Stations                  NCDC
1-hr data from the Oklahoma Mesonet Network                                               OCS
1-hr data from the Illinois Climate Network (ICN)                                         ICN
1-hr data from the High Plains Climate Network (HPCN)                                    HPCC
1-hr data from the USDA SNOTEL Network                                                   USDA
1-hr and daily precipitation data from the NWS Cooperative Observer Network              NCDC
Daily data from the NWS Cooperative Observer Network                                     NCDC
Daily streamflow data from the USGS and USACE Networks                                   USGS
Daily streamflow and precipitation data from TVA                                          TVA
1-hr data from the USDA/Agricultural Research Service (ARS)                               OCS
1-hr radiation data from the NOAA SURFRAD Network                                         FSL
Available Soil Moisture data from the USDA/SCS, USDA/ARS, DOE/ARM/CART, and ICN    X     JOSS
1-hr surface observations from the DOE Southern Great Plains ARM/CART site                DOE
Additional data sets from other LSAs to be determined
------------------------------------------------------------------------------------------------
                          Upper Air
------------------------------------------------------------------------------------------------
1-hr data from the NOAA Wind Profiler Demonstration Network (WPDN)                       NCDC
12-hr high-resolution (6-sec vertical level) rawinsonde data from the NWS                NCDC
12-hr Eta Model MOLTS Soundings (state parameters only)                                  NCAR
ACARS and CASH flight data from commercial aircraft                                       FSL
------------------------------------------------------------------------------------------------
                            Radar
------------------------------------------------------------------------------------------------
1-hr NIDS 2-km radar reflectivity composite                                        X     JOSS
1-hr NASA/MSFC 8-km National precipitation composite (derived from reflectivity)         MSFC
1-hr and daily WSR-88D Stage III product composite (all available RFCs)            X     JOSS
WSR-88D Site Level II Archive Data                                                       NCDC


Table 12-2. Model Output Data for CSA During the EOP
_____________________________________________________________________________

                    DATA DESCRIPTION                        DATA AVAILABILITY
-----------------------------------------------------------------------------
MODEL DATA
-----------------------------------------------------------------------------
                   Atmospheric Regional Models              Module   Center
-----------------------------------------------------------------------------
Eta Data Assimilation System (EDAS) (3-hrly)                  X
Eta Model Forecast (12-hrly)                                  X
Eta Model Initialization Analysis GIF Imagery (daily; UTC)          UCAR/JOSS
Eta Model Location Time Series (hrly) (MOLTS)                 X
Eta Model Reduced Data Set (3-hrly) (MORDS)                   X  
Eta Fixed Fields (including land surface)                     X  
RFE Model Analyses (8-hrly) (MORDS)                           X
RFE Model Forecasts (12-hrly) (MORDS)                         X  
RFE 3-D Fields                                                       AES/CMC
RFE Model Location Time Series (hrly)                         X  
RFE Fixed Fields (including land surface)                     X
MAPS Model Output 3-D Fields                                         NOAA/FSL
MAPS Model Output (MOLTS & MORDS)                             X
-----------------------------------------------------------------------------
                    Atmospheric Global Models
-----------------------------------------------------------------------------
NMC Medium Range Forecasts (MRF) (12-hrly)                           NCAR/DSS
CMC Global Spectral Model (12-hrly)                                  AES/CMC
ECMWF Medium Range WX Fcst Model (Daily)                              ECMWF
NMC Climate Data Assimilation System (CDAS) (Daily)                  NCAR/DSS
-----------------------------------------------------------------------------
                        Hydrology Models
-----------------------------------------------------------------------------
RFC Hydrology Model Data (8-hrly)                            TBD       TBD
-----------------------------------------------------------------------------
                     Derived Data Products
-----------------------------------------------------------------------------
National Precipitation Analysis (Daily)                       X      NCAR/DSS


Table 12-3.  Satellite Remote Sensing Data for CSA during the EOP
_____________________________________________________________________________
                                
                        DATA DESCRIPTION           DATA AVAILABILITY
-----------------------------------------------------------------------------
SATELLITE DATA                                      MODULE    CENTER
-----------------------------------------------------------------------------
POES Radiation Budget Data (4/day)
     -    Outgoing longwave (AVHRR)                            NCDC
     -    Planetary albedo (AVHRR)                             NCDC
     -    Downward longwave (HIRS)                             NCDC
     -    Longwave cooling rate (HIRS)                         NCDC
     -    Outgoing longwave (HIRS)                             NCDC

GOES Radiation Budget Data (hrly)
     -    Outgoing longwave (Sounder)                           TBD
     -    Downward longwave (Sounder)                           TBD
     -    Longwave cooling rate (Sounder)                       TBD
     -    Insolation/PAR                                       NCDC
     -    Clear sky surface temperature                        NCDC

POES/AVHRR Vegetation Index (Weekly/Monthly)                   NCDC
DMSP/SSM/I Snowcover (Daily)                                  NOHRSC
POES/CLAVR Clouds (2/day)                                      NCDC
GOES/ASOS Clouds (hrly)                                        NCDC
GOES Conus Sector Imagery (IR, VIS, WV) (hourly)             UCAR/JOSS
Gridded Areal Snow Cover (Weekly)                             NOHRSC
Gridded Areal Snow Cover (Daily)                                TBD
Gridded Snow Water Equivalent (Weekly)                        NOHRSC
Gridded Snow Water Equivalent (Daily)                           TBD

12.3 Data Collection for ESOP-96

The ESOP-96 data can be divided into three major data categories: in situ, satellite, and model. The responsibility for data collection will fall under each module of the GCIP Data Management and Service System (DMSS) described in Section 13. Although most of the data sources are operational in nature, special arrangements were made to obtain these data in the highest resolution possible. Table 12-4 summarizes the individual datasets comprising the ESOP-96. In addition, an initial phase of compiling a near surface observational data set from the Little Washita Watershed and the ARM/CART site is being completed for the period of April to September 1996 (see Section 12.8 for further details). The ESOP-96 Tactical Data Collection and Management Plan provides more details, including a brief description of each dataset with information regarding data collection, processing, and final archival, and information on dataset dissemination after the compilation is completed in June 1997. Information on the final ESOP-96 datasets will be provided in the ESOP-96 Tactical Data Collection and Management Report, to be completed after the data compilation is complete.


TABLE 12-4  Datasets comprising the ESOP-96
______________________________________________________________________________

IN-SITU DATA

SATELLITE DATA

MODEL OUTPUT


12.4 EOP-2 Data Collection During WY 1997

The plans for data collection for the second year of the EOP take account of the following general requirements.
(i) The ESOP-97 is scheduled for the period 1 October 1996 through 31 May 1997 in the geographical region identified as the LSA-NC, to provide data for focused studies on cold season/region hydrometeorology.
(ii) The CSA data requirements are continuing for energy and water budget studies with an increase in emphasis on model evaluation for the regional model output.
(iii) Annual data sets for the LSA-SW and LSA-NC are required for energy and water budgets over an annual cycle plus model evaluations of the regional model output.

Data Collection for ESOP-97

A summary listing of the data collection plans for ESOP-97 is given in Table 12-5.

The ESOP-97 Tactical Data Collection and Management Plan provides more details including a brief description of each dataset with information regarding data collection, processing, and final archival and information on dataset dissemination after the compilation is completed in June 1998.


TABLE 12-5  Datasets comprising the ESOP-97
______________________________________________________________________________

IN-SITU DATA

SATELLITE DATA

MODEL OUTPUT


12.5 EOP-3 Data Collection During WY 1998

The data collection plans during WY 1998 take account of the following known requirements:

The proposed data sets for the LSA-E are shown in Table 12-6 for in-situ data and Table 12-7 for satellite remote sensing data. The current plans for model output data for the LSA-E are the same as those given in Table 12-2 for the CSA.


Table 12-6. Proposed In-Situ Data for LSA-E During WY 1998 and WY 1999.
___________________________________________________________________________________

IN-SITU DATA



Table 12-7.  Proposed Satellite Remote Sensing Data During WY 1998 and WY 1999 Applicable for the LSA-E
_________________________________________________________________________
          DATA DESCRIPTION                          DATA AVAILABILITY
                                                  MODULE    DATA CENTER
-------------------------------------------------------------------------
Composite Daily Snow Depth Grid                                 NCDC
Composite Daily Snow Cover (GOES, POES, DMSP)       X      NESDIS, NOHRSC
3-Day Composite DMSP SSM/I Snow Cover               X          NOHRSC
Composite Weekly Snow Cover Extent                             NESDIS
Monthly DMSP SSM/I Snow Cover in Percent            X           NCDC

Hourly GOES-8 1 km Visible (for LSA-E)                        UCAR OFPS
Daily POES AVHRR 1 km (Land Cover/Vegetation)                NOHRSC, EDC

Daily DMSP SSM/I Brightness Temperatures            X         MSFC DAAC
Daily DMSP SSM/T2 Radiances                         X         MSFC DAAC
Daily DMSP OLS Visible Imagery                                  NGDC
Daily DMSP OLS IR Imagery                                       NGDC

POES Radiation Budget Data (4-Day)                              NCDC
POES Radiation Budget Data (hourly)                             NCDC

Composite Gridded Snow Water Equivalent *           X          NOHRSC
Composite Gridded Soil Moisture *                   X          NOHRSC

Landsat Thematic Mapper Imagery                                  EDC
----------
* Data from aircraft, satellite, and surface sources.

12.6 EOP-4 Data Collection During WY 1999

The data collection plans for EOP-4 are expected to be very similar to those for EOP-3 given in the previous section, with the addition of LSA-NW.

12.7 Retrospective Data Sets

OBJECTIVE: Develop high-quality retrospective databases of surface observations, especially precipitation observations, surface meteorological observations, and streamflow for use in calibration of key surface parameters in atmospheric and hydrological models.

Historical hydrometeorological data are needed to develop, validate, and estimate parameters in improved surface parameterizations for atmospheric models. The required period of hydrological data must include several extreme wet and extreme dry periods in which soil moisture levels reach maximum and minimum values. Usually this period ranges from 10 to 30 years, depending on the local climate and actual occurrence of events. At least 30 years is needed to put the EOP in a climatological context. Spatially, all available precipitation measurements are needed to obtain the best possible water budgets over areas of 10^3 to 10^4 km^2.

For GCIP, long periods of retrospective, high-quality hydrometeorological data are critical because the statistical variability of extremes (that is, flood and drought) is essential in assessing the impact of climate variability on water resources. A portion of the total retrospective data needs is being compiled within the NWS/OH as part of the NOAA Core Project for GCIP. Retrospective data are a critical input to the NWP model upgrades. At present, models of surface hydrology must be calibrated using historical precipitation, evaporation, temperature, and other climatological data, together with streamflow data. Similar calibrations using 30 to 50 years of data are needed to run the models from which the key hydrological parameters of soil moisture capacity and runoff formulation, required by both the upgraded NWP models and global models, will be determined.

The data types required include precipitation, air temperature, streamflow, and meteorological observations to estimate water and energy fluxes between the surface and the atmosphere. The primary source of historical data is surface observations, but archived NWP model outputs and some historical satellite data may be required as well.

The preparation of historical data sets is directly linked to the development of the NOAA Hydrological Data System which was described in Appendix E of the GCIP Major Activities Plan for 1995, 1996 and Outlook for 1997 (IGPO, 1994c).

12.8 Near Surface Observation Data Set

The second near-term objective for this GCIP major thrust area for 1996 to 1998 is to produce a quantitative assessment of the accuracy and reliability of the model assimilated and derived variables for applications to energy and water budgets. The successful achievement of this objective will entail an extensive evaluation of both the model output and the derived variables. All of the evaluations require a lengthy series of observed data for those variables considered significant. As a start on this evaluation effort, GCIP is compiling a special data set of observations for as many of the variables as are reasonably available. In order to maximize the number of observed variables, this special data set is focused on the region of the ARM/CART site and the Little Washita Watershed during the period April 1, 1996 through March 31, 1998.

Since 1993, GCIP has been working in cooperation with other projects and activities in the Arkansas-Red River basin to compile datasets for GCIP research activities. These include the Atmospheric Radiation Measurement (ARM) program, the USDA/Agricultural Research Service in El Reno, OK, and the Oklahoma Climatological Survey. GCIP has also supported enhancements to existing observation networks to obtain observations crucial for studying and modeling land surface processes and the coupling of these processes with the atmosphere. The support for soil moisture and soil temperature profile measurements in the ARM/CART site and the Little Washita Watershed (shown in Figure 7-1) is particularly noteworthy.

The implementation of this enhanced observation capability has advanced to where it is now feasible to begin compiling a special dataset for land surface and boundary layer studies and modeling. The GCIP/DACOM has compiled a set of data requirements that will be suitable for:

12.8.1 Summary Description of a Near-Surface Observation Dataset

A special dataset is being compiled for the geographical area which includes both the ARM/CART site and the Little Washita Watershed as shown in Figure 7-1. The vertical dimension will extend from 3000 meters above the surface to two meters below the surface. The specific types of observations are listed in Table 12-8, which is divided into three parts: boundary layer, surface, and sub-surface.

The land surface studies and models can use the data at point locations to force land surface models or can make use of the observations to complete an area analysis for different size areas within the ARM/CART site and the Little Washita Watershed. The difficulty in achieving a consensus on the techniques for an area analysis has necessitated a decision to compile data as close as possible to an observational measurement. This will enable investigators to use whatever analysis techniques are deemed appropriate for their specific research.


TABLE 12-8.   Near Surface Observation Types in each Layer
____________________________________________________________________________

1. Boundary Layer (Z < 3000 meters)

2. Surface (0 < Z < 10 meters)

3. Sub-surface    (-2 < Z < 0 meters)


12.8.2 Data Collection Schedule for Near Surface Observation Data Set

It is recognized that a full-year data collection period is the one most desired by the persons surveyed. However, due to the implementation schedule of the full complement of enhanced observations, it was decided to postpone the start of a one-year data collection period until 1 April 1997. Since a partial dataset containing the critical measurements would be useful to GCIP investigators as soon as possible, the data collection is divided into two phases.

Phase I - The six-month period of 1 April through 30 September 1996 encompasses the scheduled data collection period for the Enhanced Seasonal Observing Period (ESOP-96) for the LSA-SW shown in Figure 7-1. The first phase of the Near-Surface Observation Dataset is making use of data from this same period. During ESOP-96 we obtained a reasonably complete set of data at about eight locations in the ARM/CART site (see SWATS facilities in Figure 10-?) and Little Washita Watershed. The remaining locations lack some of the observation types, particularly soil moisture and soil temperature profiles. The Phase I dataset is being compiled as a special subset of the ESOP-96 dataset, and its compilation is scheduled to be completed by June 1997. A proposed list of observations contained in this dataset is outlined in Table 12-8. A complete description is included in Appendix A of the ESOP-96 Tactical Data Collection and Management Plan.

Phase II - The full complement of observing systems needed for the Near-Surface Observation Dataset is scheduled to be operating by the end of March 1997. We are therefore planning to start the Phase II data collection period on 1 April 1997 and continue for one full year.

The preparation of the archive data for streamflow by the U.S. Geological Survey (USGS) is done on a Water Year (1 October to 30 September) basis. The streamflow data for the Water Year are archived the following April and May. This will necessitate the compilation of the one-year Near Surface Observation Dataset in two parts. The first six months, from 1 April through 30 September 1997, can be completed by June 1998, and the last six months of the one-year dataset will be completed by June 1999. It may be possible to compile a full-year dataset earlier (June 1998) by using operational streamflow data and replacing these with the archived data when they become available. This will depend upon the needs of the GCIP investigators.
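
The two-part split described above follows directly from the Water Year boundary. The short Python sketch below is illustrative only and is not part of any GCIP software: the function names are hypothetical, and it assumes the usual USGS convention that a Water Year is labeled by the calendar year in which it ends. It shows how the one-year Phase II period (1 April 1997 through 31 March 1998) falls into two Water Years.

```python
from datetime import date


def water_year(d: date) -> int:
    """Return the Water Year (1 October through 30 September) containing d.

    Assumption: the Water Year is labeled by the calendar year in which
    it ends, so 1 October 1997 belongs to WY 1998.
    """
    return d.year + 1 if d.month >= 10 else d.year


def split_by_water_year(start: date, end: date) -> list[tuple[int, date, date]]:
    """Split the inclusive period [start, end] at Water Year boundaries."""
    parts = []
    part_start = start
    while part_start <= end:
        wy = water_year(part_start)
        wy_end = date(wy, 9, 30)        # last day of this Water Year
        parts.append((wy, part_start, min(end, wy_end)))
        part_start = date(wy, 10, 1)    # first day of the next Water Year
    return parts


# Phase II collection period: 1 April 1997 through 31 March 1998.
phase2_parts = split_by_water_year(date(1997, 4, 1), date(1998, 3, 31))
for wy, first, last in phase2_parts:
    print(f"WY {wy}: {first.isoformat()} to {last.isoformat()}")
```

The first part lies in WY 1997, whose archived streamflow becomes available in April and May 1998, and the second part lies in WY 1998, whose archived streamflow becomes available a year later. This is consistent with the June 1998 and June 1999 completion dates given above.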