9.1.1 In its previous report (GCOS Report #32; GCOS, 1997) TOPC identified priority variables for the terrestrial climate-related issues. To assist the user community that G3OS is designed to serve, over the last six months TOPC set out to identify important existing global data sets for these priority variables. The search was based primarily on the personal knowledge of the TOPC members, supplemented to some extent by web search tools. A number of such data sets were identified and were discussed by subgroups during the meeting. For uniformity, a common template was used by all members.
9.1.2 An important aspect of the search was to ensure that the data sets are of adequate quality. It was realized that the utility of a data set depends on the purpose for which it is used, and that in practice it may be impossible to decide a priori whether or not a given data set meets the needs of a user. Therefore, an important condition for the suitability of a data set is that it be properly documented in terms of data characteristics, calibration and processing, errors, etc. The template thus included several questions regarding data set quality.
9.1.3 The data sets identified through this process are described in Annex III. The experience obtained with this approach is discussed in the next section.
9.2.1 The discussion of data sets revealed several issues regarding the data set review procedure.
It is clear that the approach is not systematic with respect to a) the data sets found and b) the evaluation of these data sets. Web search tools vary in their ability to find the relevant sites, as do individuals using the web. In addition, the individual conducting a search may not be an expert in the relevant scientific specialisation, so the resulting collection of web sites and data sets may not contain the best data sets. Thus, the web searches are likely to generate lists that are not comprehensive, that do not necessarily include the best data sets, and whose entries are not critically reviewed through this process (although they are likely to have been peer-reviewed as part of the data set production). One solution is to identify the best experts for each variable and its associated data sets, and either request them to conduct the search and evaluation or request their review of searches conducted by others. TOPC members can serve as experts for a number of, but not all, key variables.
Data sets accessible through the web appear, move, and disappear from the web at a high rate. Several very relevant and critical data sets (e.g., global biomass fire distribution, global soils characteristics) are expected to become available this summer, while many others are under development. Older data sets are being moved to new web sites, and others are being removed from the web as more appropriate data replace them. Hence, there is a fundamental need in a web-based environmental data system, such as that used here, for frequent and comprehensive updating.
For numerous TOPC variables, adequate global-scale data sets do not exist at present. For others, the data sets have inadequate spatial or temporal resolution, or inadequate accuracy. Many of these deficiencies can only be remedied through future activities of G3OS and similar programs.
The information presented with the data sets is inconsistent. Many data sets provide similar types of information, however; it is the degree of completeness that differs between them. There is a need to define, and then widely publicise, a common set of types of information that should be provided with all data sets for G3OS-related applications.
It would be highly advantageous to potential users of a data set to have ready access to bibliographic citations in which the data set has been used, especially citations that evaluate the strengths and weaknesses of the data. This is potentially a demanding task because it implies positive identification of a data set (even its specific version) within a particular citation. The only easy way is for data users to inform producers that the data have been used and to provide a reference, and then for producers to systematically update the metadata for their data sets.
Several variants of the same data set may exist. Some versions may be more processed or quality-controlled than others and thus may be more useful to researchers. Some researchers, however, may prefer the raw data.
Access to certain kinds of data for some countries is limited.
The usefulness of a data set (at least for some applications) may be contingent on the availability of data sets for associated variables. This was not addressed in the search and is not easily handled by the procedure employed.
The process used in this data set search was intended to locate data sets that carry sufficient quality information for the potential user to make informed decisions about them. For the reasons discussed above, the data sets identified in this manner cannot be considered to be approved by TOPC. When they are highlighted as part of the GOSIC (see Section 10.1), an appropriate disclaimer ("user beware") should therefore appear.
The data search undertaken by TOPC was an experiment. The ultimate test of the usefulness and effectiveness of this mechanism will be the degree to which it helps the G3OS community. In the G3OS implementation, a provision for feedback by the users should therefore be made.
9.2.2 The search carried out so far concentrated on global data sets. There are many continental or regional data sets of substantially higher quality. For users interested in specific regions, the data sets identified here are not the best starting point.
9.3.1 TOPC discussed the available data sets for terrestrial climate requirements, and the members noted problems and limitations of this approach. The results of this effort (Annex XI) will be forwarded to GOSIC to be entered on the web site.
9.3.2 The meeting participants endorsed the decision of JDIMP with respect to the division of responsibilities for identifying G3OS data sets and to the criteria for accepting a data set for the purposes of the G3OS (Section 2.3 above). The identification of additional or improved data sets will continue to be important. Given the limited resources available to TOPC, such effort should continue to focus on global data sets. The meeting participants also emphasised that JDIMP should continue to deal with cross-cutting issues related to data access, particularly those related to policies, pricing and intellectual property aspects, and should promote harmonisation of metadata and associated metadata formats.
9.3.3 The following specific recommendations are made.
Recommendation 16: GOSIC should establish direct links to the data sets identified in Annex XI, and should include, in an appropriate form, the information from Annex III associated with each data set.
Recommendation 17: TOPC members should help identify data sets and develop the associated information, based on their scientific expertise and familiarity with these data sets.
Recommendation 18: In searches such as the one undertaken here, an attempt should be made to involve the best experts regarding each variable and associated data sets.
Recommendation 19: If feasible, G3OSs should consider the development of web-based tools for: a) effective searches that would identify new data sets, improved versions of existing data sets, and deleted data sets; b) mechanisms for locating citations/studies employing certain data sets; and c) searches for groups of thematically or geographically related data sets.
Recommendation 20: JDIMP should define, and then widely publicise, a common set of metadata fields which should be provided with all data sets for G3OS-related applications. The metadata format should allow flexibility in the way the information is presented, i.e. it should be defined thematically rather than digitally.
Recommendation 21: GOSIC should include an appropriate disclaimer with the information contained in Annex III.
Recommendation 22: GOSIC should make a provision for feedback by the users of the information in Annex III.
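As an illustration of the completeness issue noted in 9.2.1 and of the common metadata set called for in Recommendation 20, the following is a minimal sketch of how a data set description could be checked against a list of required themes. The three theme names are taken from the documentation aspects mentioned in 9.1.2; their spelling as dictionary keys, the function names, and the example record are assumptions made for illustration only and do not represent a format defined by TOPC, JDIMP or GOSIC.

```python
from __future__ import annotations

# Hypothetical thematic categories, named after the documentation aspects
# listed in 9.1.2. The actual common set is still to be defined.
REQUIRED_THEMES = (
    "data_characteristics",
    "calibration_and_processing",
    "errors",
)


def missing_themes(metadata: dict) -> list[str]:
    """Return the required themes that are absent or left blank."""
    return [
        theme
        for theme in REQUIRED_THEMES
        if not str(metadata.get(theme) or "").strip()
    ]


def completeness(metadata: dict) -> float:
    """Fraction of required themes that carry some information (0.0 to 1.0)."""
    return 1 - len(missing_themes(metadata)) / len(REQUIRED_THEMES)


# Illustrative record only; the free-text values are placeholders, since the
# themes are meant to be covered in any form rather than in a fixed layout.
example_record = {
    "data_characteristics": "free-text description of coverage and resolution",
    "calibration_and_processing": "",
    "errors": "free-text description of known errors",
}
gaps = missing_themes(example_record)
score = completeness(example_record)
```

The check deliberately tests only whether each theme is addressed, not how it is formatted, which is one possible reading of defining metadata thematically rather than digitally.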