

6. INFORMATION MANAGEMENT


6.1 Introduction

Data and information management are crucial issues for terrestrial observations

The GCOS Data and Information Management Plan (GCOS-13) and the proposed GTOS plans for data management, access and harmonization (ICSU et al., 1996) have been published; the reader is referred to these, and to the reports of the GCOS Data and Information Management Panel, for more details. The following description gives an overview of the guiding principles of the information management system needed for terrestrial observations for climate purposes.

Need for data and information management

There is a clear requirement for systems that promote and facilitate access by individuals, institutions and national entities to relevant global and regional information. For the land in particular, observations are often scattered: they must be assembled from a large number of sources and locations, and substantial analysis is often required to create internally consistent databases.

Components of data and information management systems

There need to be explicit links with a variety of existing data systems. Data sets lie at the heart of any Data and Information System (DIS); they include in situ data, remotely-sensed data and, increasingly, model outputs. In addition, there is a requirement for high quality meta-data describing the data sets, including such attributes as location, time of collection and data quality. For both the data and the meta-data there is a need for tools for input, storage, management, retrieval, display, analysis and output. Rather than any sort of centralized system, a highly dispersed system is envisaged. What is crucial from the point of view of the TOPC is that, for every key observation, an organization or entity takes responsibility for carrying out these tasks. Another essential aspect of data and information distribution is to ensure agreement to, and implementation of, a data policy that leads to free and open access to the data sets.
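The meta-data attributes named above (location, time of collection, data quality, and a responsible organization) can be illustrated with a minimal sketch. The record layout, the field names and the checks are hypothetical and are not taken from any GCOS or GTOS specification; they only show the kind of information a meta-data entry would need to carry.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DatasetMetadata:
    """Hypothetical minimal meta-data record; all field names are illustrative."""
    title: str
    custodian: str       # organization taking responsibility for the data set
    latitude: float      # degrees north
    longitude: float     # degrees east
    collected_on: date   # time of collection
    quality_note: str    # free-text statement of data quality

    def problems(self) -> list:
        """Return human-readable problems; an empty list means none were found."""
        found = []
        if not -90.0 <= self.latitude <= 90.0:
            found.append("latitude out of range")
        if not -180.0 <= self.longitude <= 180.0:
            found.append("longitude out of range")
        if not self.custodian.strip():
            found.append("no responsible organization named")
        if not self.quality_note.strip():
            found.append("no data quality statement")
        return found
```

A record with a named custodian, valid coordinates and a quality statement returns an empty list from problems(); a record lacking any of these is reported rather than silently accepted.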

Making maximum use of existing programmes and expertise

To the fullest extent possible, the GCOS/GTOS data and information systems will rely upon existing national and international programmes. Systems operated by these programmes, such as the ICSU World Data Centres, WWW of the WMO, GOOS, UNEP's Global Resource Information Database (GRID) and many others, should be enhanced as necessary to meet data requirements rather than creating new entities. While GCOS and GTOS should oversee development of a coordinated information system design that identifies the responsibilities of all cooperating agencies and programmes, they should not normally need to create new institutions to carry out the tasks required for its implementation. Instead, these tasks will be performed by the existing world and national observing systems, telecommunications networks, and data and processing centres. GCOS and GTOS should act to ensure that data are collected, validated, processed and archived to the exacting standards necessary. They should also review, monitor and coordinate activities between groups to ensure that the proper data are being collected and can be exchanged easily.

Ensuring data are of appropriate quality and consistency

To meet the exacting scientific objectives of GCOS, data and products must be of appropriate quality and consistency. GCOS will promote the development of, and adherence to, minimum quality and documentation standards for all relevant data and products. Experts will be invited to review the quality and documentation of these data sets to ensure they are of a standard that is acceptable to peers using similar data. Metadata (e.g., calibration and other information about the data themselves) will be assembled and maintained so that they are easily accessible to participants. Achieving harmonization, i.e., internal consistency, can be particularly demanding for terrestrial in situ observations. Considerable attention has been paid in recent years to harmonizing the legends of databases of properties such as land cover and land use. But many other aspects of data harmonization, relating to categorical data, continuous data fields such as those from remote sensing, and in situ observations, often require substantial effort.

Overall system design

To make the best use of existing facilities and expertise, the GCOS data and information system will be based on a hierarchy of local, national, regional, and global institutions. Local centres responsible for data produced by the research community should provide short-term archiving of, and access to, the pertinent data that they hold; produce data sets, analyses and products; and forward data and information of enduring value to designated centres. National organizations should be responsible for long-term observation, basic quality control, and routine transmission to world centres. Designated centres should be responsible for advanced quality control (e.g., international comparisons and bias adjustment) and for operational production of data sets, analyses and products.

6.2 Data Management for the Biosphere

It is clear that data and information management for the biosphere will have to rely primarily on the efforts of national institutions. Among recent national developments of international significance is the Distributed Active Archive Centre (DAAC) system in the US, developed as part of NASA's Earth Observing System (EOS). As an overall framework for providing international linkages, it is recommended that the data and information management system for terrestrial observations work closely with the World Data Centre System and IGBP-DIS, and develop cooperation with UNEP's GRID programme. After the initial data centres are established, it is expected that others will be added to the system.

World Data Centre System

Originally set up to handle data from the International Geophysical Year, the World Data Centre System of the International Council of Scientific Unions has a number of centres whose work is relevant to the biosphere, including soils at Wageningen in the Netherlands and remotely-sensed data for land cover at the Earth Resources Observation Systems (EROS) Data Center in Sioux Falls, USA. However, it is also apparent that there are many types of observations relevant to the biosphere not currently covered by the WDC system.

International Geosphere-Biosphere Programme Data and Information System

The Data and Information System of the International Geosphere-Biosphere Programme (IGBP-DIS) was set up because of the key role of data and information in the scientific success of IGBP. IGBP-DIS is not, and was never intended to be, a hardware-based data centre responsible for receiving, processing and supplying all data for IGBP. That scale of effort would be an inefficient use of resources, duplicating the efforts of various international and national agencies. Instead, the rationale for IGBP-DIS stems from a number of generic issues of data acquisition and management which cut across the activities of core projects and relate to their integration. Among the most important of these issues are:

Current projects include efforts leading to the creation of a global land data set from the AVHRR; a global land cover product derived from the AVHRR data set; a pedon database describing global soil properties; specification of a global fire database; work with the Committee on Earth Observation Satellites (CEOS) to analyse the implications of their data policy for research applications; and development of integrated meta-database systems to permit improved use of high spatial resolution products.

Global Resource Information Database

The Global Resource Information Database (GRID) was established as part of the Global Environmental Monitoring System (GEMS) but subsequently became a separate entity. GRID objectives include enhancing the availability and open exchange of global and regional environmental geo-referenced data sets, providing the United Nations (UN) and intergovernmental bodies with access to improved data management technologies and enabling countries to make use of GRID-compatible technology for national environmental assessment and management. A major GRID activity has been the development of data archives and a meta-data catalogue for data referral purposes. The availability of data through GRID is enhanced by a network of GRID cooperating centres currently in eleven countries (Brazil, Canada, Denmark, Kenya, Japan, Nepal, Norway, Poland, Switzerland, Thailand, and the USA).

6.3 Data Management for the Hydrosphere

Modern data systems can exist in a distributed form, connected and integrated through the Internet, which does not require a large central office. While the research community will produce many of the required derived products from the raw data, some products may require a central office whose function is to interpret the data. Examples of its activities would be to make timely summaries, or to derive spatial fields by assimilation or by aggregation of the raw data. In the hydrological field, examples include precipitation and runoff. Derived products include gridded fields of monthly data. Other products include zonal, continental or regional totals and their variation over time. These are combined with changes in surface and subsurface storage to compute water budgets and to assist with model validation.
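As an illustration of how regional totals and storage changes combine in a water budget, the following sketch assumes the standard balance in which precipitation equals runoff plus evapotranspiration plus the change in storage. The function names and the use of millimetres of water depth are assumptions made for the example; the text does not prescribe a particular formulation.

```python
def budget_residual_mm(precip_mm, runoff_mm, storage_change_mm):
    """Residual P - R - dS for one region and period, in mm of water depth.

    Under the assumed balance P = R + ET + dS, this residual approximates
    evapotranspiration plus any observational error.
    """
    return precip_mm - runoff_mm - storage_change_mm


def monthly_residuals(precip, runoff, storage_change):
    """Apply the residual calculation to aligned monthly series."""
    if not (len(precip) == len(runoff) == len(storage_change)):
        raise ValueError("monthly series must have equal length")
    return [
        budget_residual_mm(p, r, s)
        for p, r, s in zip(precip, runoff, storage_change)
    ]
```

A residual that is persistently negative, or implausibly large for the region, points to inconsistency among the input data sets, which is one way such budgets can assist with validation.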

Two data centres can form components of the initial data management programme: the Global Runoff Data Centre (GRDC) and the Global Precipitation Climatology Centre (GPCC). It is essential that GRDC and GPCC be adequately staffed in order to meet these needs. The work of the International Satellite Land Surface Climatology Project (ISLSCP), which is part of WCRP's GEWEX project, provides a useful example of the integration of data sets.

The Global Runoff Data Centre

Recognizing the role of river discharge in hydrological and climatological research and monitoring, the GRDC was established in 1988 under the auspices of WMO and operates at the Federal Institute of Hydrology in Koblenz, Germany. The Centre has close links with other major hydrological data centres including the Water Global Data Monitoring Centre at Burlington, Ontario, Canada, which stores and processes the data for UNEP's GEMS Water network, the GPCC in Offenbach, Germany, and the UNESCO Flow Regimes from International Experiments and Network Data (FRIEND) Project. Mean daily and monthly discharge data are collected on a global scale. For meteorological and climatological applications, the data are used primarily on the global and regional scale. For operational hydrology, they are used mainly on the regional and basin scale.

The Steering Committee of the GRDC at its first meeting in June 1994 recommended that the GRDC should liaise closely with, and participate actively in, GCOS. This sets the path for the expected cooperation of the GRDC with GCOS. Currently the centre is limited by the reluctance of individual nations to send data. For the GRDC to become a fully useful and operational data centre, individual nations must be willing to share data through it. Without the cooperation of these nations, whether through GRDC or elsewhere, the utility of a hydrological observing system for GCOS and GTOS is limited.

The Global Precipitation Climatology Centre

Precipitation plays an important role in the global energy and water cycle. The compilation of digital maps and data sets of precipitation covering the Earth's land surface worldwide is the task of the GPCC, which was initiated by the WMO in 1988. The GPCC is a member of the GEWEX Hydrometeorology Panel within the WCRP. The GPCC also participates in the Global Precipitation Climatology Project (GPCP), which is represented in the GEWEX Radiation Panel. The purpose of the GPCP is to derive gridded data sets of monthly precipitation totals covering the entire globe including the oceanic areas, where satellite-based observations are the main data source.

The specific functions of the GPCC in the framework of GPCP are defined by the GPCP Implementation and Data Management Plan (WMO/TD-No. 367) and comprise:

The GPCC also contributes to other international programmes and projects, as well as to the development of GCOS.

The GPCC is planning to compile and operate an Arctic Precipitation Data Archive for the WCRP Arctic Climate System Study (ACSYS). In this framework, the GPCC's functions will be extended to the collection of snow cover and depth data and to analysis of the liquid water equivalent.

The GPCC is operated by the Deutscher Wetterdienst (DWD, National Meteorological Service of Germany) and is located in Offenbach, Germany.

The GPCC products are gridded area means of monthly total precipitation, together with meta-data and statistical results, on the grid and on a monthly basis. The current grid size is 2.5 degrees of latitude by 2.5 degrees of longitude. The data evaluation period begins in 1986. It is planned to reduce the grid size to 1 by 1 degree. To date, the following products are available on a monthly basis for 2.5 degree grid cells:

These gridded products are freely available and can be obtained by FTP. Some of the products are also published on CD-ROM.

The only directly observed information on precipitation results from in situ raingauge measurements. However, the data represent geographical points only and need to be integrated over areas using interpolation techniques.

Error studies have shown that data from about 10 stations per 2.5 degree grid cell (an area of about 50 000 km²) are required to calculate area-mean precipitation on the grid within an error benchmark of 10%. Extrapolated to the total area of the Earth's land surface, this amounts to some 40 000 stations, far more than the roughly 6 000 stations whose precipitation data are routinely exchanged via the GTS and actually available.
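The relationship between station density and gridded area means can be sketched as follows. The sketch assigns each station to its 2.5 degree cell, takes a simple arithmetic mean of the monthly totals in the cell, and flags whether the cell reaches the 10-station benchmark quoted above. The arithmetic mean is an assumption standing in for the interpolation technique, which is not specified here, and the function and variable names are illustrative.

```python
import math
from collections import defaultdict

CELL_DEG = 2.5      # grid size quoted in the text
MIN_STATIONS = 10   # stations per cell for the 10% error benchmark


def cell_index(lat, lon, cell_deg=CELL_DEG):
    """Map a station position to the (row, col) of its grid cell.

    Rows count northward from 90S, columns eastward from 180W.
    """
    n_rows = int(180 / cell_deg)
    n_cols = int(360 / cell_deg)
    row = min(int(math.floor((lat + 90.0) / cell_deg)), n_rows - 1)
    col = int(math.floor((lon + 180.0) / cell_deg)) % n_cols
    return row, col


def cell_means(stations, cell_deg=CELL_DEG, min_stations=MIN_STATIONS):
    """Compute a simple area mean per cell.

    stations: iterable of (lat, lon, monthly_total_mm).
    Returns {cell: (mean_mm, n_stations, meets_benchmark)}.
    """
    totals = defaultdict(list)
    for lat, lon, value in stations:
        totals[cell_index(lat, lon, cell_deg)].append(value)
    return {
        cell: (sum(vals) / len(vals), len(vals), len(vals) >= min_stations)
        for cell, vals in totals.items()
    }
```

Cells that do not appear in the result contain no stations at all, and cells whose flag is false fall short of the benchmark; both cases correspond to the data gaps that a denser exchanged network would reduce.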

The GPCC data collection includes monthly precipitation totals from more than 40 000 stations, delivered by about 120 countries. However, the spatial distribution of the stations is very uneven, and large data gaps still exist. Coverage is greatest for the year 1987 (status of data collection in June 1996). Since the data flow from national agencies to the GPCC is voluntary and not generally regulated, these data often become available only after a long delay.

To respect the interests of the data-supplying countries, the GPCC does not distribute the station-related observational data to other users.

The GPCC looks to GCOS for an internationally regulated data exchange that meets the specific requirement of a higher network density for the spatial analysis of precipitation.

International Satellite Land Surface Climatology Project (ISLSCP)

ISLSCP has been carrying out large-scale experiments on Land Surface Climatology for several years and forms part of the WCRP's GEWEX. Recently it has compiled a wide variety of data sets from various sources all on a common global projection and with the same 1 degree resolution and made these available on a CD-ROM. Comments on the quality of the data sets are also included. Several of the data sets also have direct relevance to the biosphere.

6.4 Data Management for the Cryosphere

A strong organizational framework exists for the archiving and dissemination of snow and ice data and information. There are four ICSU World Data Centres for Glaciology: Boulder, Colorado, USA; Moscow, the Russian Federation; Cambridge, UK; and Lanzhou, China. It is proposed that these four centres provide the initial data and information management system for the IOS. WDC-A for Glaciology at the US NSIDC in Boulder, Colorado, has an active programme of archiving and distributing all forms of snow and ice data. WDC-D at the Lanzhou Institute of Glaciology and Cryopedology is assembling data on glaciers and permafrost in China.

The ICSU/Federation of Astronomical and Geophysical Services (FAGS) and UNEP/GEMS World Glacier Monitoring Service in Zurich, Switzerland, assembles data on glacier extent and mass balance. The IPA is sponsoring a project for a Global Geocryological Database. Through NASA's Mission to Planet Earth, the NSIDC DAAC manages satellite and in situ snow and ice data for global change research, and the Alaska SAR Facility DAAC manages polar radar data.

There are several major limitations to current data management capabilities:

6.5 Conclusions and Recommendations for Improvements to Data and Information Management

Apart from the specific recommendations and conclusions with respect to the biosphere, hydrosphere and cryosphere, there are a number of general conclusions and recommendations with respect to data and information management.

