SECTION 4 -- GCIP MAP for 1997, 1998 and Outlook for 1999

4. DATA ASSIMILATION

The NAS/NRC GEWEX Panel, in its review of the GCIP objectives, recommended that more emphasis be placed on data assimilation and that it be included as one of the GCIP objectives:

Develop and evaluate atmospheric, land, and coupled data assimilation schemes that incorporate both remote and in-situ observations.

4.1 Background

Improved understanding of the hydrological cycle depends critically on atmospheric and surface fields which synthesize various observations in a manner consistent with constraints inherent in the physical laws governing evolution of these fields. Typically, these constraints are applied through the equations solved in a state-of-the-art forecast model. This process of data synthesis is known as data assimilation.

In operational numerical weather prediction (NWP), data assimilation has come to be recognized over the last 10 years as nearly equal in importance to model development for improving model forecasts at all ranges, from a few hours to many days or weeks. Forecast error is now understood to stem as often from inadequate initial conditions as from model deficiencies.

The data assimilation challenges facing GCIP are essentially those facing mesoscale meteorology, but are further complicated by the need to account for land surface and hydrological processes. Atmospheric data assimilation techniques are designed to minimize analysis error in an underdetermined problem; that is, conditions must be estimated at many grid points where no data exist. Furthermore, account must be taken of varying data error characteristics and irregular spatial and temporal sampling in those observations. This problem of underdeterminacy is particularly serious for surface fields, where observations are sparse and often representative only of very local regions.
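The underdetermined character of the analysis problem can be illustrated with a minimal sketch of an optimal-interpolation style update on a one-dimensional grid. Everything in it is an assumption made for illustration: the grid size, the Gaussian form of the background error covariance, the error standard deviations, and the function names are hypothetical and are not taken from any GCIP system. With a single observation the gain reduces to a scalar division, so no matrix inversion is needed.

```python
import math

def gaussian_cov(n, sigma_b, length):
    """Assumed background error covariance B[i][j] on a 1-D grid (unit spacing)."""
    return [[sigma_b ** 2 * math.exp(-0.5 * ((i - j) / length) ** 2)
             for j in range(n)] for i in range(n)]

def analyze_single_obs(background, B, k, y, sigma_o):
    """Update every grid point from one observation y located at grid index k.

    For a single observation the gain for grid point i is
    B[i][k] / (B[k][k] + sigma_o**2).
    """
    innovation = y - background[k]          # observation minus background
    denom = B[k][k] + sigma_o ** 2          # background + observation error variance
    return [xb + B[i][k] / denom * innovation
            for i, xb in enumerate(background)]

# Illustrative values only: 11 grid points, one observation at the center.
n = 11
background = [10.0] * n
B = gaussian_cov(n, sigma_b=1.0, length=2.0)
analysis = analyze_single_obs(background, B, k=5, y=12.0, sigma_o=1.0)

for i, value in enumerate(analysis):
    print(i, round(value, 3))
```

The observation corrects its own grid point by half of the innovation (equal background and observation error variances), and the assumed covariance spreads a decaying correction to the ten grid points that have no data. The quality of the analysis at those points depends entirely on how well the covariance is specified.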

The basic shortcoming in the current observational database is a lack of coincident data in time and space for estimating energy and water budget components. Limitations arising from the diverse nature of observational platforms and their associated algorithms are well known. Some variables, such as precipitation, soil moisture, and runoff, can be observed adequately at point locations but only with greater uncertainty at large spatial scales. Some variables are by nature integrated over time and space (e.g., streamflow, aerological determination of evaporation, and precipitation difference) but are poorly related to instantaneous point processes. Some variables, particularly the surface latent and sensible heat fluxes and soil moisture, are not directly observable over large regions. In this case four-dimensional data assimilation (4DDA) methods become an essential strategy for incorporating various data into models that will be validated with GCIP data sets. On the other hand, many characteristics of the surface do not change in time, and data sets of these variables are being gathered with increasing precision and spatial coverage.

Data assimilation is also important for GCIP to provide improved analyses of moisture fields in the atmosphere. These moisture fields are a product of the full dynamic/physical processes in the atmosphere and at the surface, so ultimately GCIP must be concerned with the full data assimilation process. Current research in data assimilation concerns forward static techniques, which use a forecast model only in a forward sense, and more fully 4-dimensional techniques, which fit observations to a model state integrated over some time period. In the forward techniques, model forecasts are corrected at different points in time based on current observations. These techniques include the commonly used optimal interpolation statistical technique and 3-D variational techniques. Observations can be incorporated as often as every model time step, in which case the assimilation is sometimes called nudging. The 4-dimensional variational techniques may have greater potential for improvement of initial conditions, but are much more computationally expensive.
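Nudging, as described above, can be sketched with a toy scalar model. The example below is an illustration under stated assumptions, not a GCIP or operational scheme: the model relaxes toward an equilibrium x_eq on a time scale tau, the observations are taken as a constant "true" value, and the nudging coefficient g and all numerical values are hypothetical.

```python
def integrate(x0, steps, dt, x_eq, tau, obs=None, g=0.0):
    """Forward-Euler integration of dx/dt = -(x - x_eq)/tau + g*(obs - x).

    With obs=None the model runs freely; otherwise a relaxation (nudging)
    term toward the current observation is added at every time step.
    """
    x = x0
    history = [x]
    for n in range(steps):
        tendency = -(x - x_eq) / tau        # model physics (assumed toy form)
        if obs is not None:
            tendency += g * (obs[n] - x)    # nudging toward the observation
        x = x + dt * tendency
        history.append(x)
    return history

# Hypothetical setup: the model equilibrium (3.0) is biased relative to
# the observed value (5.0).
truth_value = 5.0
obs = [truth_value] * 200
free = integrate(0.0, 200, 0.1, x_eq=3.0, tau=2.0)
nudged = integrate(0.0, 200, 0.1, x_eq=3.0, tau=2.0, obs=obs, g=2.0)

print("free run final state:  ", round(free[-1], 3))
print("nudged run final state:", round(nudged[-1], 3))
```

In this sketch the nudged run settles between the model equilibrium and the observed value, at the point where the model tendency and the nudging term balance. A larger g pulls the state closer to the observations, but the model bias is still present in the result.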

Another recent impetus to data assimilation research has been the availability of new data sources, including wind profilers, commercial aircraft, Doppler radars (reflectivity and radial winds), and improved satellite sensors. The variational technique provides an improved framework for assimilation of these observations, many of which are not explicitly forecast by the forecast model (e.g., satellite-observed radiances). The use of raw observations rather than processed retrievals (e.g., temperature and moisture soundings derived from satellite radiances) has been recognized as providing improved information from these sources.

Based on these considerations, the principal areas in data assimilation for GCIP are summarized as follows:

- improved algorithms that translate from observation variables to model variables and vice versa (e.g., radiative transfer models, hydrological models);

- incorporation of new data sources (which must pass the test of providing additional information over that already known from other sources and the model forecast), including process rates such as rainfall rate, streamflow, and top-of-atmosphere (TOA) radiative fluxes, as well as various soil-moisture measurements; and

- understanding of uncertainty in GCIP analyzed data sets.
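The first item above, translating between model variables and observation variables, can be illustrated with a very simple observation operator. The sketch below converts a model profile of specific humidity on pressure levels into column precipitable water, PW = (1/g) * integral of q dp, which can then be compared with an observed column value. The profile, the observed value, and the function name are hypothetical; operational operators such as radiative transfer models are far more elaborate.

```python
G = 9.80665  # standard gravity, m s^-2

def precipitable_water(pressure_pa, q_kg_per_kg):
    """Column water vapor (kg m^-2) from specific humidity on pressure levels.

    Uses trapezoidal integration of q over pressure. Levels may be ordered
    surface-to-top or top-to-surface.
    """
    total = 0.0
    for i in range(len(pressure_pa) - 1):
        dp = abs(pressure_pa[i + 1] - pressure_pa[i])      # layer thickness, Pa
        q_mean = 0.5 * (q_kg_per_kg[i] + q_kg_per_kg[i + 1])
        total += q_mean * dp
    return total / G

# Hypothetical model profile (not measured data).
pressure = [100000.0, 85000.0, 70000.0, 50000.0, 30000.0]   # Pa
humidity = [0.012, 0.008, 0.004, 0.001, 0.0002]             # kg/kg

model_pw = precipitable_water(pressure, humidity)
observed_pw = 28.0                       # hypothetical observation, kg m^-2
innovation = observed_pw - model_pw      # observation minus model equivalent

print("model equivalent:", round(model_pw, 2), "kg m^-2")
print("innovation:      ", round(innovation, 2), "kg m^-2")
```

The innovation computed this way is the quantity an assimilation scheme acts on; translating it back into corrections to the humidity profile is the "vice versa" direction and requires additional assumptions about how the error is distributed in the vertical.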

4.2 GCIP Needs For Model Assimilated Data Sets

The major components of the hydrological cycle are soil moisture, surface evaporation, water vapor, clouds, rainfall, and runoff. The first two components are not observed routinely over continental areas such as the GCIP domain. The GCIP analyses of soil moisture and surface evaporation must therefore be products of a 4DDA system. For the long GCIP time period, such assimilations can be provided conveniently only by on-line operational centers.

Modern 4-D data assimilation systems use objective analysis techniques combined with advanced atmospheric forecast models to blend observations of varying types, timeliness, accuracy, and spatial coverage into self-consistent, uniformly gridded atmospheric and surface fields. For fields that are not observed (or very sparsely observed), 4DDA systems rely on the atmospheric model to generate realistic analyses based on the internal physical and dynamic coupling within the model to those fields that are observed.

The moisture cycle in models is largely determined by subgrid scale parameterizations, which typically drive atmospheric models rather quickly to an equilibrium between evaporation and precipitation, both of which are crucial to the terrestrial water cycle. The model's moisture equilibrium may be realistic but upset in assimilation by incorrect data; on the other hand, good data may be subverted in the assimilation by systematic deficiencies and biases in the model.

In this context, Lorenc (1992) emphasized that the vast detailed information generated when fitting the model to data in the assimilation process provides unique tools to diagnose the model or data weaknesses. The extensive long-term GCIP database will provide substantially enhanced opportunities to do just that for components of the water and energy cycles not routinely observed, leading to assimilation improvements, which, in turn, over the GCIP period will lead to more realistic representations of these cycles. Hence, together, the special GCIP observations and operational 4DDA systems (including their periodic upgrades growing out of GCIP research) represent a synergistic opportunity to improve both specification and simulation of the global energy and water cycles. To take advantage of this opportunity, operational assimilation products will require extensive diagnosis and validation by GCIP researchers.
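One common way to use assimilation output for this kind of diagnosis is to accumulate statistics of the innovations (observation minus model background at the observation location). The sketch below is only an illustration of that idea; the soil-moisture-like numbers are invented for the example and the function name is hypothetical.

```python
import math

def innovation_stats(observations, background_at_obs):
    """Return (mean, sample standard deviation) of observation-minus-background."""
    d = [y - b for y, b in zip(observations, background_at_obs)]
    n = len(d)
    mean = sum(d) / n
    var = sum((v - mean) ** 2 for v in d) / (n - 1)
    return mean, math.sqrt(var)

# Invented values for illustration (volumetric fraction), not GCIP data.
observations = [0.30, 0.28, 0.35, 0.31]
background = [0.25, 0.24, 0.29, 0.26]

bias, spread = innovation_stats(observations, background)
print("mean innovation:", round(bias, 4))
print("std  innovation:", round(spread, 4))
```

A mean innovation that is persistently large compared with its spread, as in these invented numbers, points to a systematic bias in either the model or the observing system. Separating the two generally requires independent data, which is where a long-term enhanced observation set is useful.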

For operational NWP and 4DDA systems, then, this operational plan, coupled with the companion research plan in Volume II (IGPO, 1994a), must achieve the following tasks:

(2) Identification of shortcomings by comparison with observations (especially exploiting the long-term character of the GCIP observation enhancements).

(3) Implementation of improvements, especially assimilation improvements and physical parameterization improvements, stemming from concurrent GCIP modeling research.

With today's advancements in computer power, it is widely accepted that the separation between climate models and NWP models is becoming less pronounced. Taking advantage of the long time scales and breadth of observations and model output of GCIP, researchers can quantify the behavior of a range of operational NWP systems over a range of spatial resolutions, physical complexity, and data assimilation approaches to help identify those key water and energy cycle components and scales that climate models must ultimately include to achieve a new level of reliability.

4.3 Observational Data For GCIP Data Assimilation

An inventory of possible data for assimilation includes the following:

a. Surface-related data

b. Atmospheric data

4.4 Data Assimilation Techniques Relevant To GCIP

a. Surface-related

b. Atmosphere related

c. Assessment of model and observational errors

While some investigation of single-sensor data and processing may be appropriate in some circumstances, the emphasis for GCIP should be on assimilation of different types of data together and doing so in the context of coupled models. The success of various diagnostic budget studies of the hydrological cycle is critically dependent on the quality of these analyses.

4.5 Near-Term Priorities

- Research into improved atmospheric assimilation techniques and incorporation into regional mesoscale models.
- Research into soil property assimilation.
- Evaluation of assimilation data sets.
- Use of soil and hydrological model adjoints for assimilation of process rate variables (This is an item which requires a significant amount of research before it can be used as an applied data assimilation technique. However, it shows sufficient potential that such research activity needs to be included as a priority item to get started during the next two years.)
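To indicate what an adjoint of a hydrological model provides, the sketch below uses a toy linear reservoir, S(t+1) = S(t) + dt*(P(t) - k*S(t)) with runoff q(t) = k*S(t), and recovers the initial storage from runoff "observations". The reservoir model, the parameter values, the synthetic observations, and the descent step size are all assumptions chosen for illustration and do not represent any GCIP land surface or hydrological model. The adjoint sweep is checked against a finite-difference gradient.

```python
def forward(s0, precip, k, dt):
    """Return storages S_0..S_{T-1} of the toy reservoir dS/dt = P - k*S."""
    storages = []
    s = s0
    for p in precip:
        storages.append(s)
        s = s + dt * (p - k * s)
    return storages

def cost(s0, precip, q_obs, k, dt):
    """Half the summed squared misfit between modeled and observed runoff."""
    storages = forward(s0, precip, k, dt)
    return 0.5 * sum((k * s - q) ** 2 for s, q in zip(storages, q_obs))

def adjoint_gradient(s0, precip, q_obs, k, dt):
    """Gradient of the cost with respect to s0 from one backward sweep."""
    storages = forward(s0, precip, k, dt)
    a = 1.0 - dt * k            # sensitivity of S(t+1) to S(t)
    lam = 0.0                   # adjoint variable
    for s, q in zip(reversed(storages), reversed(q_obs)):
        lam = a * lam + k * (k * s - q)
    return lam

# Synthetic experiment: observations generated from a "true" initial storage.
precip = [1.0, 0.0, 2.0, 0.0, 0.0]
k, dt = 0.3, 1.0
true_s0 = 4.0
q_obs = [k * s for s in forward(true_s0, precip, k, dt)]

# Gradient check at a first guess.
guess = 2.0
eps = 1e-6
fd = (cost(guess + eps, precip, q_obs, k, dt)
      - cost(guess - eps, precip, q_obs, k, dt)) / (2 * eps)
adj = adjoint_gradient(guess, precip, q_obs, k, dt)
print("finite difference:", fd, " adjoint:", adj)

# Simple fixed-step descent on the initial storage (step size is an assumption).
s0_est = guess
for _ in range(50):
    s0_est -= 5.0 * adjoint_gradient(s0_est, precip, q_obs, k, dt)
print("estimated initial storage:", round(s0_est, 6))
```

The backward sweep returns the full gradient at roughly the cost of one extra model run, which is the property that makes adjoints attractive for assimilating process rate variables such as runoff or streamflow. Real soil and hydrological models are nonlinear and contain thresholds, which is part of why this item needs research before it can be applied.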