

Compilation and scrutiny of food composition data

K. Cashel

Karen Cashel is a Senior Nutritionist, Department of Community Services & Health, PO Box 9848, Canberra ACT 2601

The compilation of food composition data into tables or a data base is a process that begins with defining the aims, objectives and uses of the food composition data base. The result is a specialised, documented data base tailored to meet these defined specifications. Compilation is a complex task involving the collection of data; the commissioning of food analytical programs; the documentation and storage of data for each food from each source; the search for information necessary to compile these individual data appropriately and meaningfully into representative food items; the assembly of these into a data base to produce derived values, documentation and description of items for the user; and the preparation of derived data bases for specific user groups. At every step in this process the data are scrutinised for acceptability against data quality criteria specified for the data base by the compiler (Southgate & Greenfield 1988).

Data scrutiny is itself a process. It requires knowledge of the factors that influence foods and their composition, including laboratory procedures and analytical methods, and the use of statistical and research techniques as well as data comparison procedures.

This review of the factors considered during data scrutiny and compilation is based on the experience of producing Composition of foods, Australia (COFA) (Cashel & others 1989) and NUTTAB89 (Department of Community Services & Health 1989). The programs for these publications have been outlined previously (Cashel 1989, English 1990).

At the time the Australian food composition program began there were few published data on Australian foods available in the literature. The Australian program was therefore built around a commissioned analytical program aimed at producing, within practical and financial constraints, the most representative data possible. This required particular attention to sampling procedures.

Sampling procedures for representative data

Since the sampling procedure aimed to ensure that the food sample analysed was reasonably representative of the food item available to the consumer for consumption, consideration was given to geographical density of the population, where food is produced and purchased and the factors that might influence the variability of the nutrients in the food, at the time of purchase, and during and after sampling. Many factors influence nutrient content, but not all of these factors apply to all types of food. A flexible system is therefore necessary and for each new food type a specific sampling protocol is developed.

In Australia while regional variation in foods can be expected, sampling across the capital cities of a number of States does not necessarily provide the most appropriate or representative food sample. Many primary products and manufactured foods are produced in a small number of centres and then nationally distributed, and so do not reflect the local growing conditions around each city, eg mangoes and other tropical fruit. Further, nutrient changes may be introduced into fresh produce during packing, storage and transport to the central analysing laboratory resulting in data which do not reflect the nutrient content of foods as purchased by consumers. These and other considerations including budget limitations resulted in much of the sampling being undertaken in the region of the analysing laboratory.

The aim was to obtain a sufficiently wide range of samples from differing batches to obtain a representative nutrient profile for the product. For processed and manufactured foods, where possible, varieties or brands sampled are based on national market shares while for some foods, sampling from different production runs was more important. Samples were purchased from consumer preferred retail outlets by an unidentified purchaser. For some of the fruit and vegetable items these outlets included major markets to obtain exotic produce, to assist with identifying cultivars and to allow sampling from different growing regions. Foods were analysed within their product shelf-life, as defined by use-by-dates. For most foods a composite sample was prepared for analysis from multiple individual samples.

Laboratory programs

Data scrutiny begins back at the laboratory. The demands made by a project of this size and complexity are immense, and each laboratory involved has needed 18–24 months to set up methods, solve problems and develop expertise in all aspects of the program. Far more is expected of the laboratory than just analysing a food sample. The laboratory must collect samples to a written protocol and must provide a range of information about the item analysed, including purchase date, purchase site, sample descriptions and copies of labels, sample sizes, handling, gross composition, edible portion, analytical sample preparation, weight changes and analytical methods used, as well as the nutrient levels determined. The submitted laboratory reports are scrutinised in the Nutrition Section of the Department and queries are referred both to program advisers and back to the laboratories. This may continue for months, even years, and may require re-analysis and sometimes re-sampling to finalise the analytical data.

Factors in data scrutiny

The new data were drawn from the food analytical program, from published articles on Australian foods, from the food industry and from unpublished research. A number of factors, discussed below, determine not only the suitability and usability of the data but also correct interpretation of the data. In addition overseas food composition tables and papers were consulted to provide a global context for data comparison and validation.

The food description is vital information for any user. It must clearly identify the food item analysed and should therefore include an objective identification, ie scientific name, components of mixed foods; physical state, ie raw, cooked or processed; and cooking or processing description. All too often the food descriptor consists of ‘mango’, ‘hamburger’, or ‘mince meat’ and the reader is left to guess the form of the food analysed (raw, canned, dried); the animal (or other) food base of the ‘mince meat’; whether the mango was peeled or unpeeled; which components made up the ‘hamburger’; and, of course, the exact meaning of that useful descriptor ‘cooked’. Misuse of such terms can be highly misleading. For example, the data for both battered and fried oysters and shrimps in Thomas and Corden (1977) are identical to those for breaded and fried oysters and shrimps in the 1963 edition of the US tables (USDA 1963). However, not only is the composition of fried battered foods different from that of fried breaded foods, due to differing proportions of coating to food and to differing fat absorption levels (Makinson & others 1987), but the food called shrimp in the US is commonly called prawn in Australia. The use of the term ‘shrimp’ in Australia suggests a very different sized unit of food and a different flesh to coating ratio. This kind of factor has to be considered when evaluating data against data from other countries.

If the food is processed, consideration must be given to whether its composition is affected by food legislation which may vary both by State and with time. This information is pertinent not just to the relevance of the data but also to checking the accuracy of the analytical values eg margarines and the meaning of terms such as ‘table’, ‘polyunsaturated’.

The sampling description required includes whether the food was as generally available to consumers; whether the food was selected to specified criteria and, if so, what these were; when and where the food was purchased or obtained; how many samples of what size were obtained, and, if relevant, brand names.

This highly pertinent information helps to identify season, region of origin, market availability and consumer availability, as well as nutrient implications. For example, the vitamin C content of a fresh-picked apple analysed immediately is inappropriate to represent the vitamin C content of apples stored for more than three days, regardless of whether the apple is retail or home produced. The vitamin C content of a fresh-picked apple falls to a stable level after picking (Wills & El-Ghetany 1986). Data for fresh-picked apples would therefore be misleading if used to represent apples as available to the vast majority of the population. Similarly, data from a study of vitamin C in oranges grown on experimental root stocks may be quite inappropriate relative to the commercially available fruit.

Further, without cultivar identification, local data on the potato, for example, would have been difficult to evaluate in a global context.

The edible portion information is vital to knowing to what the analytical data relate. The compiler needs to know which components of the food were included in the analytical sample; which components were rejected; and what the proportions of these were. This information assists in further identifying both the food as purchased and the food as analysed. All too often this information is not given, or only the edible portion factor is given, usually as a percentage. The edible portion can at least provide some basis for assessing what proportion and, to a lesser extent, what part of the food is considered as edible. Different cultures may eat different parts of a food and this may change with time. For example, the fibre debate has affected what is commonly considered as ‘edible’ in our community. Also, the food as purchased may change with time and, without an objective description, the edible portion, expressed only as a percentage, will be quite inappropriate and even misleading to the user. For example, fresh carrots used to be marketed in bunches with their tops, resulting in a lower edible portion value than that of current market-trimmed carrots. Similarly, for meat chops, changes in retail trimming practices affect the proportions of edible flesh to bone in the food as available.
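The effect of a changing edible portion on values expressed per 100 g of food as purchased can be illustrated with a short calculation. This is a minimal sketch only: the function name and the carrot figures are hypothetical and are not taken from COFA.

```python
def per_100g_as_purchased(per_100g_edible, edible_portion_pct):
    """Convert a nutrient value per 100 g edible portion to a value
    per 100 g of the food as purchased (refuse included)."""
    if not 0 < edible_portion_pct <= 100:
        raise ValueError("edible portion must be a percentage in (0, 100]")
    return per_100g_edible * edible_portion_pct / 100.0


# Hypothetical figures: the same carrot flesh, sold with and without tops.
carrot_value = 6.0  # illustrative nutrient value per 100 g edible portion
bunched = per_100g_as_purchased(carrot_value, 70.0)  # tops count as refuse
trimmed = per_100g_as_purchased(carrot_value, 90.0)  # market-trimmed
print(bunched, trimmed)
```

The edible flesh is unchanged in both cases; only the description of the food as purchased differs, which is why a bare percentage without that description can mislead.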

Analytical sample information is required to identify whether a single or a composite sample was formed (Greenfield & Southgate 1985) and, if composite, the number of units used and the proportions in which they were combined. The preparation of the analytical sample indicates the approach taken to obtain data representative of the food as purchased.

Laboratory handling and preparation information required by the compiler includes how the food sample was handled and stored both in the laboratory and, if transported significant distances, during transport; how the analytical sample was prepared and stored; and when and in what order the analyses were carried out. Sample handling and storage should be consistent with, at a minimum, usual home practices, ie consistent with manufacturer's instructions and appropriate to the food. If the food was transported, eg from interstate, details of how the food was prepared and packaged and information about the time lag between purchase and laboratory delivery must be available. Obviously some nutrients are affected by these handling practices.

Moisture and vitamin C analyses, in particular, should be performed as soon after receipt of the food as possible. Further, for a food that must undergo significant transport and time delays, weighing of the food at the time of purchase and packaging, and again at the laboratory provides an indication of the degree of physical change in the food. As many foods have a high moisture content small changes can have a large effect on the final expression of the nutrient concentration of a food as originally obtained. For other nutrients, inappropriate handling and storage may produce artifacts in the data resulting in both losses and apparent gains in nutrients. Factors affecting loss of nutrients such as the effect of ultraviolet light on riboflavin levels in milk are well known, but certain preparation and analytical method combinations may also result in apparent gains. The freeze-drying of a high starch food will result in the formation of resistant starch which, when the dietary fibre content is determined by the current AOAC recommended method will give an apparently higher value for total dietary fibre than exists in the fresh food item (Bingham 1987, Englyst & others 1987).
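The weighing check described above lends itself to a simple correction. The sketch below re-expresses a value measured on the food as received per 100 g of the food as purchased. It assumes that the whole weight change is moisture and that the nutrient itself was not lost in transit, which will not hold for labile nutrients such as vitamin C; the function name and figures are hypothetical.

```python
def adjust_to_purchase_basis(value_per_100g_at_lab,
                             weight_at_purchase_g,
                             weight_at_lab_g):
    """Re-express a nutrient value per 100 g of food as received at the
    laboratory as a value per 100 g of food as purchased.

    Assumption: the weight difference is moisture only, so the total
    amount of nutrient in the sample is unchanged.
    """
    if weight_at_purchase_g <= 0 or weight_at_lab_g <= 0:
        raise ValueError("weights must be positive")
    return value_per_100g_at_lab * weight_at_lab_g / weight_at_purchase_g


# Illustrative figures only: a 500 g sample that weighs 470 g on arrival.
corrected = adjust_to_purchase_basis(10.0, 500.0, 470.0)
print(round(corrected, 2))
```

A 6 per cent loss in weight therefore inflates every wet-weight value by about the same proportion unless it is corrected for.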

The analytical method used must be valid, appropriate to the food and identify all required components of the nutrient. Analytical technology and associated methods are continually evolving as knowledge improves about the active forms of a nutrient and the specificity with which they can be quantified. For some nutrients, eg dietary fibre, vitamin A, this naturally leads to a healthy debate about the components which should be included in the generic name of the nutrient and their relative physiological activities.

Significant inconsistencies are seen in reported levels of nutrients due to the use of differing analytical methods which vary in the active principles they measure. For example, the method used to analyse vitamin C and accepted for the official US tables measures only ascorbic acid (USDA 1976–1989) while that in the Australian (Cashel & others 1989) and British (Paul & Southgate 1978) tables measures both ascorbic and dehydroascorbic acid. Further, the US food composition tables do not include analytical data for carbohydrate but data calculated ‘by difference’. Most food industry laboratories also prefer the ‘by difference’ method. Such ‘measures’ include dietary fibre as part of the carbohydrate measure and affect the factors used to calculate energy. Carbohydrate and its components may be reported as monosaccharide equivalents (eg Paul & Southgate 1978) or as grams of component (eg Cashel & others 1989). Nutrients determined and methods of expression can vary even within a food composition table, especially when released in a series over a number of years (eg dietary fibre data in both the US tables (USDA 1976–89) and UK tables (Paul & Southgate 1978, Holland & others 1988)). For some nutrients, eg folate, the debate over the relevance of the data by existing methods makes suspect all the available data. For others, eg vitamin A, the new analytical technologies challenge previous assumptions about the isomers being measured and their biological relevance. These factors must all be considered when scrutinising data against comparable foods from elsewhere in the world.
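The ‘by difference’ calculation, and the reason it is not comparable with an analysed carbohydrate value, can be shown in a few lines. This is a sketch under stated assumptions: the function names are hypothetical, the set of proximates subtracted is the conventional one (moisture, protein, fat, ash) and the figures are invented for illustration.

```python
def carbohydrate_by_difference(moisture, protein, fat, ash):
    """Total carbohydrate 'by difference', g per 100 g of food.

    Everything not accounted for by the other proximates is called
    carbohydrate, so dietary fibre (and analytical error) is included.
    """
    other = moisture + protein + fat + ash
    if other > 100:
        raise ValueError("proximates exceed 100 g per 100 g")
    return 100 - other


def minus_fibre(carbohydrate_by_diff, dietary_fibre):
    """Rough estimate of non-fibre carbohydrate from a by-difference value."""
    return carbohydrate_by_diff - dietary_fibre


# Illustrative cereal-type food, all values in g per 100 g.
total = carbohydrate_by_difference(moisture=10.0, protein=12.0, fat=2.0, ash=2.0)
print(total, minus_fibre(total, dietary_fibre=9.0))
```

Comparing the first figure with a directly analysed carbohydrate value from another table would show an apparent difference that is purely an artefact of the mode of expression.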

Quality control and mode of expression must also be considered before analytical values, from any laboratory, should be accepted as correct. Data such as range, standard deviation and coefficient of variation must be used with caution as these may relate to biological variability or seasonal variability of the nutrient or to intra- and inter-analyst, laboratory or method variability.

There are internal consistency checks for reliability such as the sum of fatty acids and proximates to assess consistency of relative levels of nutrients in data for the same food. Moisture has already been identified as a critical nutrient in terms of expressed levels of nutrients on a wet weight basis. Apparent differences in data may be due to differing levels of hydration. Moisture too may be the only indicator of a dried or partially dried food. In published data it is also important to check the factors used for determining calculated values, eg nitrogen factors, fat factors, energy factors.
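Checks of this kind are easy to automate when laboratory reports are keyed into a data base. The following is a minimal sketch, not the procedure used in the Australian program: the function names are hypothetical and the acceptance window of 97–103 g per 100 g is an assumed tolerance that a compiler would set for the data base concerned.

```python
def check_proximate_sum(proximates, low=97.0, high=103.0):
    """Return (total, ok) for proximates given in g per 100 g of food.

    `low` and `high` are assumed acceptance limits, not a standard.
    """
    total = sum(proximates.values())
    return total, low <= total <= high


def check_fatty_acid_sum(total_fat, fatty_acids):
    """Individual fatty acids (g per 100 g food) should not exceed total fat."""
    return sum(fatty_acids.values()) <= total_fat


# Illustrative values for a lean raw meat, g per 100 g.
sample = {"moisture": 75.0, "protein": 20.0, "fat": 3.0,
          "ash": 1.0, "carbohydrate": 0.5}
print(check_proximate_sum(sample))
print(check_fatty_acid_sum(3.0, {"saturated": 1.2, "monounsaturated": 1.3,
                                 "polyunsaturated": 0.3}))
```

A failed check does not say which value is wrong; it only flags the record for a query back to the laboratory.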

The data source is generally obvious when dealing with analytical research papers and laboratory reports. For data from food composition tables, nutrition texts, consumer reports and food industry information the source may not be clear and yet is critical to interpreting the data and their usefulness. Food industry data are often a combination of limited analyses and values calculated from product formulations. The analytical data may be local or from the parent company overseas. The rest of the data may be derived from any number of sources, the most popular being the previous official Australian tables (Thomas & Corden 1977), the US tables and the UK tables. Many of the food tables of the world, including Thomas and Corden (1977), use or are based on the US tables with or without specific referencing and without adaptation except for changing the food name to the local preferred name. Nutrition texts can use one or more sources, usually a major national table, sometimes with a small proportion of local data and even with selective and arbitrary changes to one or more nutrients (eg Briggs & Wahlqvist 1984).

If sources are not specified the user has no information at all on the reliability of the data, their Australian relevance, the analytical methods used (and thus the components measured) or the currency of the data and cannot go back to the primary source or the analyst for more information. Lack of adequate referencing is a breach of scientific conventions and source acknowledgement and even of copyright. Food composition data are a national and international resource and their integrity needs to be recognised, valued and protected.

Compiling food composition data

The process of compiling food composition data into the data base aims at ensuring internal consistency and compatibility of data and requires the steps described below.

Repeating, extending or rejecting data analyses refers to data needed to provide the minimum range of nutrients necessary for inclusion in the tables and/or for checking on data that the compiler questions as a result of the scrutiny process.

For example, the Australian meat data as published by the researchers (Greenfield 1987) are not necessarily identical to those in COFA. Data from one group could not be used because of uncertainty about how representative the meat samples analysed were, while fatty acid data from another group were no longer adequate once the capillary gas chromatography technique became preferred. Therefore, a separate program commissioned later determined the fatty acid content of beef by capillary gas chromatography and provided further gross composition and other nutrient data. These data were then combined with the data from the initial program for inclusion in the new tables.

Data may be rejected if the method used was incompatible with the specified or validated methods. For example, the data identified as ‘residue’, ‘fibre’ and ‘dietary fibre’ in the Greenfield and Wills published papers (Greenfield & others 1981, Wills & El-Ghetany 1986) were not included in COFA as the method was not validated. The modified Englyst method (Englyst & others 1982, Jones & others 1985) was initially used in the commissioned Australian program and the AOAC method (Prosky & others 1985) is now used.

These two methods are not compatible and do not measure the same components of dietary fibre. The Englyst method is highly specific: it measures chemically defined components of dietary fibre and excludes resistant starch and substances measuring as lignin from its reported total dietary fibre values. The AOAC method provides a measure of total dietary fibre (TDF) that includes resistant starch and lignin. The modification to provide some measure of soluble and insoluble fractions is also used in the Australian program (Prosky & others 1988). The shift from the Englyst to the AOAC method for use in COFA was a pragmatic decision based on factors such as regular availability of analytical resources and the cost of analysis, and is regularly reviewed in the light of the current scientific debate.

Levels of significant figures have recognised and recommended conventions for nutrient data presentation (Greenfield & Southgate 1990). Analysts may, however, report data to a higher or lower number of significant figures thus necessitating in some cases the rounding of data for tables. Most computer generated tables, eg the US tables, have a fixed level of reporting with most nutrients at a standard two or three decimal places. The Australian Nutrient Data Bank system specifies a level of reporting for each nutrient at the time of printing but does not yet have the facility to round selectively to the nearest five or ten units, for example.
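The two kinds of rounding mentioned here, to a set number of significant figures and to the nearest five or ten units, can be sketched as follows. The function names are hypothetical and the sketch is not a description of the Australian Nutrient Data Bank system.

```python
import math


def round_sig(value, sig):
    """Round `value` to `sig` significant figures."""
    if value == 0:
        return 0.0
    exponent = math.floor(math.log10(abs(value)))
    return round(value, sig - 1 - exponent)


def round_to_nearest(value, unit):
    """Round `value` to the nearest multiple of `unit` (eg 5 or 10).

    Python's round() sends exact halves to the nearest even number,
    so a tie such as 12.5 units is not always rounded up.
    """
    return unit * round(value / unit)


print(round_sig(123.456, 3))      # three significant figures
print(round_sig(0.04567, 2))      # two significant figures
print(round_to_nearest(137, 5))   # nearest five units
print(round_to_nearest(137, 10))  # nearest ten units
```

A per-nutrient table of `sig` or `unit` values would let a printing routine apply the appropriate convention to each column rather than a fixed number of decimal places.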

Calculation factors for generating data on some nutrient components eg energy, or for adjusting to quantity per unit of food using fat and nitrogen factors are generally based on internationally recognised conventions (Paul & Southgate 1978). The literature is monitored as these factors can be affected by developments in analytical methods.
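How such factors generate derived values can be illustrated as below. The default factors are assumptions chosen for illustration only (a general nitrogen factor of 6.25, and 17, 37 and 16 kJ/g for protein, fat and carbohydrate); a user should substitute the factors documented in the table being used, as these differ between tables and depend on how carbohydrate is expressed.

```python
def protein_from_nitrogen(nitrogen_g, factor=6.25):
    """Protein (g) from total nitrogen (g); the factor is food-dependent
    and 6.25 is only an assumed general-purpose default."""
    return nitrogen_g * factor


def energy_kj(protein_g, fat_g, carbohydrate_g,
              protein_factor=17.0, fat_factor=37.0, carbohydrate_factor=16.0):
    """Energy (kJ) from energy-yielding nutrients using per-gram factors.

    The default factors are illustrative assumptions, not the factors
    of any particular published table.
    """
    return (protein_g * protein_factor
            + fat_g * fat_factor
            + carbohydrate_g * carbohydrate_factor)


# Illustrative food: 3.2 g nitrogen, 10 g fat, 5 g carbohydrate per 100 g.
protein = protein_from_nitrogen(3.2)
print(round(protein, 1), round(energy_kj(protein, 10.0, 5.0)))
```

Because the factors are explicit parameters, the same analytical values can be recalculated when conventions change, which is the reason for monitoring the literature.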

Changes to the food may have occurred since the food item analysed was purchased. For the commissioned Australian programs the delay between the decision to analyse a food and publishing the data should be less than three years but has been up to eight years for COFA. The interval comprises the time needed to research available products and to draw up a sampling protocol (usually 6–8 weeks); the food analysis program (usually 12 months, especially if foods are seasonal); laboratory checking and reporting (usually 6–12 months); data scrutiny, evaluation and preparation (6–12 months per program); and the publication process (now about 3 months). Data in a scientific journal article may be a little more recent (the date of food sampling should be sought). Even in this relatively short time there may have been changes in a food item due to changes to the fortification formula, ingredients, food legislation, additives (eg β-carotene being used as a food colour), wholesale or retail practices (eg butchering) or agricultural practices (eg preferred carcase composition or major plant cultivar). Such changes must be monitored and assessed, and the data handled accordingly, in order to represent the food accurately in the published tables. There may be a need for reanalysis of some nutrients, eg β-carotene, or to modify the way the food is described. For example, an extruded snack product may stay the same in composition but change in size or shape of the pieces or in name and package size of the product.

The best representation of a food in the tables requires considerable thought, discussion and research. The arithmetic mean of all analytical values for a nutrient in one food does not necessarily produce an appropriate representation of the food as available to the consumer. Some of the data may be appropriate to a specific growing region, season, or factory output, or to selected batches of a processed product. Other data may be from a composite of samples more broadly drawn from the food supply. Allowance for these differences may result in weighting the data to allow for such differences or the use of footnotes to highlight specific seasonal or regional differences. Similarly if there are data on individual brands or cultivars of a food these may be weighted on the basis of market share or production ratios. This is usually done if the major basis for identification of the food to the consumer is generic, eg apricot, baked bean, milk coffee biscuits or cheddar cheese. For others, such as apples, both approaches may be taken, ie data provided on individual cultivars (Jonathon, Granny Smith etc) plus generic ‘apple’ from weighted means of the individual cultivars.
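The difference between an arithmetic mean and a mean weighted by market share or production ratio can be seen in a short example. The cultivar values and shares below are invented for illustration, and the function name is hypothetical.

```python
def weighted_mean(values, weights):
    """Mean of `values` weighted by `weights` (eg market shares).

    Weights need not sum to 1; they are normalised here.
    """
    if len(values) != len(weights) or not values:
        raise ValueError("values and weights must be non-empty and equal in length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Illustrative only: a nutrient in two cultivars (per 100 g) and
# their assumed shares of production.
values = [10.0, 20.0]
shares = [0.75, 0.25]
simple = sum(values) / len(values)
weighted = weighted_mean(values, shares)
print(simple, weighted)
```

The generic entry built from the weighted figure sits closer to the cultivar that dominates the market, which is the intent when consumers cannot distinguish the cultivars at purchase.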

The potential use of the data is considered when making such decisions. For example, a major use of food composition data is in dietary studies where subjects report the foods consumed. The likely ability of the consumer to identify a food (eg by cultivar) at the time of purchase or consumption is considered in presenting data. Also, if a statistical analysis of different data on a range of shaped, sized and branded extruded snack products shows no difference in their analytical values then the inclusion of a generic named item in the food table provides useful and reliable data to users that allows for the cosmetic differences.

Terminology development to describe the foods sampled and the parts actually analysed is a significant aspect of preparing data for users, as is grouping the foods for location and index purposes in tables. The aims are to develop criteria to allocate foods objectively to a grouping; to use terms unambiguously and consistently to describe the food, the edible and inedible parts and the basis of the sampling; to describe the analytical sample; and to describe the analytes and the analytical methods used. Working descriptors and definitions have been developed for these specific purposes in the new Australian food tables and are described and defined in the text.

Gross composition and food measures are critical to providing nutrient information in practical terms for users. Gross composition data, that is, ingredients and proportions of ingredients in mixed foods or components and proportions of edible and inedible parts of fresh or single ingredient foods, are regularly monitored. Changes to these may affect edible portions (eg carrot tops, butchering techniques), nutrient composition (butchering techniques and practices), weight changes associated with cooking and meaningful description of the food as available.

Where possible sampling protocols and the details recorded for food samples anticipate that such changes occur. For example, the muscle tissue and separable fat components were analysed separately for meats and details of the proportions of muscle, separable fat and bone recorded (Greenfield 1987). This allows for representation of meat cuts at different levels of fat trimming and for revision of nutrient data of meat cuts as available retail based on gross composition data, without further expensive analytical work.
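The recalculation this makes possible can be sketched as follows. Given nutrient values for muscle and for separable fat, and the gross composition of a cut, the value for the edible portion at any level of trimming follows by proportion. The function name and all figures are hypothetical and are not taken from Greenfield (1987).

```python
def cut_value(muscle_value, fat_value, muscle_fraction, fat_fraction):
    """Nutrient per 100 g edible portion of a cut made up of muscle and
    separable fat (bone excluded).

    `muscle_value` and `fat_value` are per 100 g of each tissue;
    the fractions are proportions by weight of the cut as purchased.
    """
    edible = muscle_fraction + fat_fraction
    if edible <= 0:
        raise ValueError("cut has no edible tissue")
    return (muscle_value * muscle_fraction + fat_value * fat_fraction) / edible


# Illustrative chop: 70% muscle, 10% separable fat, 20% bone.
# Fat content (g per 100 g tissue): muscle 4, separable fat 80.
untrimmed = cut_value(4.0, 80.0, 0.70, 0.10)
half_trimmed = cut_value(4.0, 80.0, 0.70, 0.05)
fully_trimmed = cut_value(4.0, 80.0, 0.70, 0.0)
print(round(untrimmed, 1), round(half_trimmed, 1), round(fully_trimmed, 1))
```

If retail trimming practice changes, only the gross composition fractions need to be remeasured; the tissue analyses remain valid.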

Full descriptions and associated weights of measures of foods, particularly standard measures of foods, are part of the food analytical program. Such measures are routinely collected as part of the laboratory reports and a funded three year weight for defined measure program was commissioned to obtain such information (English 1990). Other investigations of these measures from a variety of sources have also been undertaken.

Correct understanding and use of food composition data

It should be clear from this account that compiling national food tables involves more than cataloguing numbers collected through the commissioned analytical programs or other sources and wrapping a cover around them, and that the data provided are not static and unchanging. Modern users are generally far removed from the production and compilation of the basic composition data and, in addition, the ease of ‘plugging’ into a computer and generating instant analysis of food and diets further dulls their questioning faculties. Users must be prepared to become more knowledgeable by reading the explanatory text accompanying food composition tables or a data base. In addition, they must be prepared to become better informed about the background to the production and compilation of food composition data, as this will enhance the quality and appropriateness of the uses to which such data are put in the scientific practice of nutrition and dietetics.

References

Bingham, S. 1987. Definitions and intakes of dietary fibre. Am. J. Clin. Nutr. 45: 1226–31.

Briggs, D & Wahlqvist, M. 1984. Food facts. Ringwood: Penguin Books Australia Ltd.

Cashel, K. 1989. Australian nutrient data tables. Food Aust. 41: 1034–5.

Cashel, K, English, R & Lewis, J. 1989. Composition of foods, Australia. Canberra: AGPS.

Commonwealth Department of Community Services & Health. 1989. NUTTAB89. Nutrient data table for use in Australia. Disk format. Canberra: AGPS.

English, R. 1990. Composition of foods, Australia. Food Aust. 42: 8, 55–57.

Englyst, H, Trowell, H, Southgate, DAT & Cummings, J. 1987. Dietary fibre and resistant starch. Am. J. Clin. Nutr. 46: 873–4.

Englyst, H, Wiggins, HS & Cummings, JH. 1982. Determination of the non-starch polysaccharides in plant foods by gas-liquid chromatography of constituent sugars as alditol acetates. Analyst 107: 307–18.

Greenfield, H (ed). 1987. The nutrient composition of Australian meats and poultry. Food Technol. Aust. 39: 181–240.

Greenfield, H, Lee, YH & Wills, RBH. 1981. Composition of foods. 11. Muesli. Food Technol. Aust. 33: 564–8.

Greenfield, H & Southgate, DAT. 1985. A pragmatic approach to the production of good quality food composition data. ASEAN Food J. 1: 47–54.

Greenfield, H & Southgate, DAT. 1990. Guidelines for the production, management and use of food composition data. An INFOODS manual (in press).

Holland, B, Unwin, ID & Buss, DH. 1988. Cereal and cereal products. London: The Royal Society of Chemistry.

Jones, GP, Briggs, DR, Wahlqvist, ML & Flentje, LM. 1985. Dietary fibre content of Australian foods. 1. Potatoes. Food Technol. Aust. 37: 81–3.

Makinson, JH, Greenfield, H, Wong, ML & Wills, RBH. 1987. Fat uptake during deep-fat frying of coated and uncoated foods. J. Food Comp. Anal. 1: 93–101.

Paul, AA & Southgate, DAT. 1978. The composition of foods. London: HMSO.

Prosky, L, Asp, N-G, Furda, I, De Vries, JW, Schweizer, TF & Harland, BF. 1985. Determination of total dietary fibre in foods and food products: collaborative study. J. Assoc. Off. Anal. Chem. 68: 677–9.

Prosky, L, Asp, N-G, Schweizer, TF, De Vries, JW & Furda, I. 1988. Determination of insoluble, soluble and total dietary fibre in foods and food products: Interlaboratory study. J. Assoc. Off. Anal. Chem. 71: 1017–23.

Southgate, DAT & Greenfield, H. 1988. Guidelines for the production, management and use of food composition data: an Infoods project. Food Sci. Nutr. 42F: 15–23.

Thomas, S & Corden, M. 1977. Metric tables of composition of Australia foods. Canberra: AGPS.

US Department of Agriculture. 1963. Composition of foods. Raw, processed, prepared. Handbook 8. Washington DC: USDA.

US Department of Agriculture. 1976–1989. Composition of foods. Raw, processed, prepared. Handbook 8 series. Washington DC: USDA.

Wills, RBH & El-Ghetany, Y. 1986. Composition of Australian foods. 30. Apples and pears. Food Technol. Aust. 38: 77–82.

