This page explains how metadata is used in the AQUASTAT core database. The metadata structure itself is based on the EURO-SDMX structure, which can be consulted for concept definitions. There are three levels of metadata: data-point level, variable level, and database level.
Data-point level
At the data-point level, comments may add a caveat to a data-point or explain it further. The metadata presented was harvested from about 15 years of institutional memory, not all of which was external. Several thousand external metadata entries have been attached to data, along with over 50 000 internal quality-control metadata entries. The metadata is grouped into nine categories, explained below:
- Reference area: Is there anything particular about the country? For example: "Includes island X" or "Only region Y was covered" or "These results are only for city Z".
- Reference period: Are there any details about the dates? For example: "The average takes into account the average precipitation between 1961-1990" or "Data actually from 1998".
- Comparability (geographical): Is there anything that makes this number difficult to compare to numbers in other countries? For example: "The definition calls for using method A, but this country uses method B".
- Comparability (over time): Did anything happen to make numbers not comparable to numbers for this country in other years? For example: "The country included the precipitation from a new station in 1998, which makes the data after 1998 incompatible with the older data".
- Adjustment: Has any nationally reported data been changed by AQUASTAT? For example: "There was a typo xx in the literature from the country that was corrected to x in the database".
- Overall accuracy: Is there any known quality issue regarding this data-point? For example: "The country's municipal water withdrawal tripled in five years".
- Components: Are there any details about what that data-point consists of? For example: "Flow of border rivers is 10, consisting of 5 from river A, 3 from river B, and 2 from river C". Metadata is only entered in the case that the specific component is not another data-point itself. For example, the total area equipped for irrigation will not have metadata stating its components since those components are themselves variables that can be queried.
- Observations: General contextual notes about a country. For example: "The country is one of the driest countries in the world".
- Methodology: If the method by which a value is derived is known but doesn't fit into any of the categories above, it is provided here.
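As an illustration only, the nine categories above could be modelled as a small enumeration with a note type attached to a data-point. This is a minimal sketch, not AQUASTAT's actual data model: the names `MetadataCategory`, `DataPointNote`, and the `internal` field are hypothetical and chosen purely for the example.

```python
from dataclasses import dataclass
from enum import Enum


class MetadataCategory(Enum):
    """The nine data-point metadata categories listed above."""
    REFERENCE_AREA = "Reference area"
    REFERENCE_PERIOD = "Reference period"
    COMPARABILITY_GEOGRAPHICAL = "Comparability (geographical)"
    COMPARABILITY_OVER_TIME = "Comparability (over time)"
    ADJUSTMENT = "Adjustment"
    OVERALL_ACCURACY = "Overall accuracy"
    COMPONENTS = "Components"
    OBSERVATIONS = "Observations"
    METHODOLOGY = "Methodology"


@dataclass(frozen=True)
class DataPointNote:
    """A single comment attached to a data-point (hypothetical structure)."""
    category: MetadataCategory
    text: str
    # Assumption: a flag separating internal quality-control notes
    # from externally visible ones.
    internal: bool = False


# Example note, using one of the example comments quoted in the list above.
note = DataPointNote(MetadataCategory.REFERENCE_PERIOD, "Data actually from 1998")
print(f"{note.category.value}: {note.text}")
```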
Variable level
The next level of aggregation is the variable level. This metadata can be accessed by clicking on the icon to the right of each variable in the database query page.
Database level
The highest level of aggregation is the database level. The general information below describes how AQUASTAT operates, organized by category.
Database name:
AQUASTAT main country database
- Metadata last update: November 2020
Statistical presentation:
- Data description: The data pertains to water resources, water withdrawals and uses, and agricultural water management (including irrigation), although some general statistics are also provided.
- Sector coverage: Water and irrigated agriculture sectors.
- Statistical concepts and definitions: A complete list of the database variables is available in the database and definitions can be accessed by clicking on the icon to the right of each variable in the database query page.
- Statistical unit: Global, regional, national.
- Statistical population: N/A
- Time coverage: 1960 - present.
- Base period: circa 2005.
Unit of Measure:
Each variable's unit is noted in the variable metadata accessible from the icon to the right of each variable in the database query page.
Reference period:
For water resources, the reference period is 1961-1990, unless otherwise noted.
Institutional mandate:
This database supports Article 1 of FAO's constitution that requires FAO to "collect, analyze, interpret, and disseminate information related to nutrition, food, and agriculture".
Confidentiality:
Release policy:
- Release calendar policy: Data originating from the FAO questionnaire on water and agriculture is released in January every year; however, updates are made continuously as new data become available.
- Release calendar access: N/A
- User access: Access to AQUASTAT data is provided to external users through the AQUASTAT webpage, and through the websites of pre-selected partners that provide AQUASTAT data.
Frequency of dissemination:
Continuously, as available.
Dissemination format:
- News release: Major updates are announced on AQUASTAT's home page, as well as via Twitter.
- Publications: AQUASTAT publications are listed in the Resources section.
- Online database: The core database on country statistics is the principal method of data dissemination, although other databases are also available in the Databases section.
- Micro-data access: N/A
- Other dissemination: N/A
Accessibility of documentation:
- Documentation on methodology: Methodology documents are available in the Overview section under the corresponding thematic sub-section.
- Documentation on quality: AQUASTAT's procedure to ensure quality results is described in the AQUASTAT methodology section.
Quality management:
- Quality assurance: N/A
- Quality assessment: Some variables are easier to gather and standardize than others. Consequently, data quality is perceived to be heterogeneous across the different variables that AQUASTAT gathers. Where possible, harmonization is performed with other agencies to maximize data quality. Data quality assessments frequently result in significant changes, which in turn lead to back-changes across the complete data series.
Relevance:
- User needs: User needs are varied, but frequently include sub-national and sub-yearly data. Data in AQUASTAT is available mostly at a national level, for 5-year periods.
- User satisfaction: While no surveys have been completed, AQUASTAT is held in good esteem by the international community, as evidenced by the fact that AQUASTAT statistics are almost always referenced or used as a point of comparison in any study relevant to national-level water resources, water uses and/or irrigation.
- Completeness: Substantial data gaps exist, as AQUASTAT contains statistics mostly derived from national entities, and performs only limited modelling.
Accuracy and reliability:
- Overall accuracy: No formal quality assessment has been conducted.
- Sampling error: N/A
- Non-sampling error: N/A
Timeliness and punctuality:
- Timeliness: Data is collected through the annual FAO questionnaire on water and agriculture. The data validation and upload process takes on average approximately 6 months.
- Punctuality: Data is disseminated punctually, in particular to fulfill the requirement for yearly dissemination of the SDG indicators.
Comparability:
- Comparability - geographical: Given the number of countries for which AQUASTAT reports information (as well as their heterogeneity), data is sometimes not highly comparable. AQUASTAT does attempt to maximize comparability through rigorous quality assurance exercises, for example, by applying a common set of rules with which to evaluate transboundary water resources.
- Comparability - over time: Time series, unless otherwise noted, are comparable across time.
Coherence:
- Coherence - cross domain: Harmonization exercises with other agencies are performed frequently and substantial differences are immediately corrected in the AQUASTAT database. Nomenclature for statistics offered by other agencies is not always consistent with AQUASTAT nomenclature. In these cases, the AQUASTAT glossary defines synonyms.
- Coherence - internal: Incoherent data is adjusted by AQUASTAT in order to maintain internal coherence. Adjusted values are attributed with an appropriate qualifier (also known as flag or symbol).
Data revision:
- Data revision - policy: Data is revised as soon as high-quality updates become available. If necessary, present-day changes lead to back-changing and/or deletion of historic entries.
- Data revision - practice: N/A
Statistical processing:
- Source data: Unless otherwise indicated, data come from government reports and publications from within each respective country. Data not generated by a country is displayed with an appropriate qualifier.
- Frequency of data collection: Yearly, but continuous if additional data becomes available in between.
- Data collection methods: Most data is obtained through the FAO-AQUASTAT questionnaire on water and agriculture. This questionnaire is dispatched to AQUASTAT National Correspondents, who have been officially nominated by their respective governments.
- Data validation: As described in the AQUASTAT methodology section, various steps of validation are performed. The first two levels of validation include human interaction, and are done based on cross-comparison with similar countries as well as historic data for the country in question. The last validation step, prior to saving of new results, is an automated validation step that mathematically checks updated data for consistency and correctness. Almost 200 validation rules are used by this automated validation routine.
- Data compilation: Calculation rules are predefined and use data referring to the same year to generate aggregate values.
- Adjustments: Since national-level data is frequently tailored to be useful at a national level and not for international comparisons, data may be adjusted in order to maximize international comparability. Adjusted data is displayed with an appropriate qualifier. Additionally, data is rounded according to the following methodology:
- 1 111.11 gets rounded to 1111
- 111.111 gets rounded to 111.1
- 11.1111 gets rounded to 11.11
- 1.11111 gets rounded to 1.111
- 0.111111 gets rounded to 0.1111
- 0.0111111 gets rounded to 0.0111
- 0.00111111 gets rounded to 0.0011
- 0.000111111 gets rounded to 0.0001
- Smaller values are rounded to the first non-zero digit. For example, 0.0000111111 gets rounded to 0.00001.
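The rounding examples above suggest a simple rule, sketched below in Python. The rule is inferred from the listed examples rather than taken from an official specification: values of 1 or more keep four significant digits (never fewer than all integer digits), values below 1 keep four decimal places, and values below 0.0001 keep only the first non-zero digit. The function name `aquastat_round` is hypothetical, and half-up rounding is an assumption, since the examples do not show how ties are handled.

```python
from decimal import Decimal, ROUND_HALF_UP


def aquastat_round(value: str) -> Decimal:
    """Round a value following the pattern inferred from the examples above."""
    x = Decimal(value)
    if x == 0:
        return Decimal(0)
    magnitude = abs(x)
    if magnitude >= 1:
        # adjusted() is the exponent of the most significant digit,
        # so adjusted() + 1 is the number of digits before the decimal point.
        int_digits = magnitude.adjusted() + 1
        decimals = max(0, 4 - int_digits)
    elif magnitude >= Decimal("0.0001"):
        decimals = 4
    else:
        # Keep only the first non-zero digit.
        decimals = -magnitude.adjusted()
    # Assumption: ties are rounded half-up; the source does not specify this.
    step = Decimal(1).scaleb(-decimals)
    return x.quantize(step, rounding=ROUND_HALF_UP)


# Inputs taken from the list above (thousands separator removed).
for raw in ["1111.11", "111.111", "11.1111", "1.11111", "0.111111",
            "0.0111111", "0.00111111", "0.000111111", "0.0000111111"]:
    print(raw, "->", aquastat_round(raw))
```

Using `Decimal` with string input avoids binary floating-point artifacts, which matters when the goal is to reproduce a published decimal representation exactly.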