

3. AGRIS


The AGRIS database contains 3,000,000 bibliographic records. It is split into two sections: the Current section (records entered from 1996 onwards) and the Archive (1975-1995). The system collects metadata for conventional materials (journal articles, books) and non-conventional materials (sometimes called "grey literature", e.g. theses and reports) that are not available through ordinary commercial channels. One of the main reasons AGRIS exists is to encourage the exchange of information among developing countries, whose literature would not otherwise be covered by other international systems.

3.1. Method

The method was essentially the same as in the first test, except that we used only exact title searches in Google; the free-text option was judged less useful, especially for finding citations. Exact titles were therefore searched for resources in English, French and Spanish only, and generic titles were avoided. A search counted as a hit if the resource appeared in the top 10 results, or if result number one linked to the resource.
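
The search and hit-counting rule described above can be sketched as follows. This is only an illustration: the names `Result`, `exact_title_query` and `is_hit` are hypothetical, and the two boolean flags stand for judgments that an assessor would record by hand while inspecting the result list.

```python
from dataclasses import dataclass


@dataclass
class Result:
    """One entry in a result list (hypothetical shape).

    The flags record an assessor's judgment; nothing here is
    detected automatically.
    """
    rank: int                        # 1-based position in the result list
    is_resource: bool = False        # this result is the resource itself
    links_to_resource: bool = False  # this result links to the resource


def exact_title_query(title: str) -> str:
    """Quote the title so the search engine treats it as an exact phrase."""
    # Drop embedded double quotes (they would end the phrase early)
    # and collapse runs of whitespace.
    cleaned = " ".join(title.replace('"', "").split())
    return f'"{cleaned}"'


def is_hit(results: list) -> bool:
    """Hit rule from the text: resource in the top 10,
    or a link to the resource from result number one."""
    if any(r.is_resource for r in results if r.rank <= 10):
        return True
    return any(r.rank == 1 and r.links_to_resource for r in results)
```

Under this rule a resource at rank 11 is not counted, and a link to the resource counts only when it comes from the first result.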

We took a random sample of AGRIS records by generating 500 random accession numbers. Half of the records were in the Current section and the other half in the Archive. We had no idea how many resources to expect to find on the Web. Placing resources on the Internet did not really take off until 1996, so we hypothesized that more resources would be found in the Current section. Although the Archive contains older materials and probably has fewer resources online, it could still be a rich source of citations leading to related documents.
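
A sample of this shape could be drawn roughly as below. The sketch assumes that the even split between the two sections was by design and that the accession numbers of each section are available as lists; the function name `draw_sample` and its parameters are illustrative, not part of any AGRIS tool.

```python
import random


def draw_sample(current_ids, archive_ids, total=500, seed=None):
    """Draw total/2 accession numbers from each section, without replacement."""
    if total % 2:
        raise ValueError("total must be even to split the sample in half")
    half = total // 2
    rng = random.Random(seed)  # a fixed seed makes the sample repeatable
    return {
        "current": rng.sample(current_ids, half),
        "archive": rng.sample(archive_ids, half),
    }
```

Sampling without replacement ensures that no record is searched twice.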

3.2. AGRIS Record Structure & Practice

AGRIS records (see Figure 4) are created according to the AGRIS Cataloguing Rules [11]. The title used for searching here was the "Original Title", since it is the title that appears on the resource itself. The AGRIS database consists primarily of records of scientific and technical articles, which means that titles are normally unique.

Accession Number: 97-001686

Title: Effect of hot water in the germination of Leucaena leucocephala cv."Cunningham".

Original Title: Efecto del agua caliente en la germinacion de Leucaena leucocephala cv. "Cunningham".

Publication Year: 1995

Subject Category: Seed production and processing;

Author: Gonzalez, Y.;Mendoza, F.

ISSN: ISSN 0864-0394.

Bibliographic Source: 6 tablas; 8 ref. Pastos y Forrajes (Cuba). (1995). v. 18(1) p. 59-65.

Summary lang: EN; ES

AGROVOC keywords:

English: leucaena leucocephala; germination; seed; seed storage; water; temperature;

French: leucaena leucocephala; germination; semence; stockage des semences; eau; temperature;

Spanish: leucaena leucocephala; germinacion; semillas; almacenamiento de semillas; agua; temperatura;

Figure 4. Sample AGRIS Record

The 500 random accession numbers produced the following distribution of languages: 338 records were in English, 155 in Spanish, and only 7 in French[5] (see Figure 5).

Figure 5. Searches by language

3.3. Analysis of results from AGRIS

The results from the AGRIS test were highly positive. Searching exact titles in Google found the resource itself for 137 of the 500 records, a 27.4% success rate. Just as interesting is the success rate for citations: for 222 of the 500 records the search found citations to the resource, a rate of 44.4%.

Figure 6. Trend in successful search results by Year

Figure 6 shows the distribution of records searched and resources found by year. The majority of resources found date from 1991. This indicates that there is an effort to put resources on the Internet retrospectively.

If these rates held for the entire AGRIS database of 3,000,000 records, around 820,000 records would have immediate access to the full text, while about 1,330,000 records would provide citations to later documents, all without any human intervention. These percentages can only grow with time.
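
The projection is plain proportional scaling of the sample rates. The short sketch below reproduces the arithmetic from the sample counts reported in this section; the function name `extrapolate` is illustrative, and the calculation ignores sampling error.

```python
SAMPLE_SIZE = 500          # records searched
FULL_TEXT_FOUND = 137      # records whose resource was found
CITATIONS_FOUND = 222      # records for which citations were found
DATABASE_SIZE = 3_000_000  # records in AGRIS


def extrapolate(found, sample_size, population):
    """Return (sample rate, projected count) by proportional scaling."""
    rate = found / sample_size
    # Integer arithmetic keeps the projected count exact.
    projected = found * population // sample_size
    return rate, projected


full_rate, full_projected = extrapolate(FULL_TEXT_FOUND, SAMPLE_SIZE, DATABASE_SIZE)
cite_rate, cite_projected = extrapolate(CITATIONS_FOUND, SAMPLE_SIZE, DATABASE_SIZE)

print(f"full text: {full_rate:.1%} -> {full_projected:,}")  # full text: 27.4% -> 822,000
print(f"citations: {cite_rate:.1%} -> {cite_projected:,}")  # citations: 44.4% -> 1,332,000
```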

Figure 7. Reduction of noise in search results.

A successful result meant that the resource could be found in the first ten results on Google. Only 18 cases (Figure 7) produced an excessive amount of noise, often because of generic titles, e.g. "Dressing and sauces". When we searched for these resources again, adding just the year of publication and the first author, the noise dropped significantly, in one case from 285,000 results to 1. Some result counts remained above 50, but in those cases the articles were important and had been cited many times.
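
The refinement step can be sketched as a small query builder. How the year and author were actually typed into Google is not recorded here, so the form below (quoted title, then year, then the first author's surname) is an assumption, as are the function names. The author field is assumed to follow the layout shown in Figure 4, with authors separated by semicolons and the surname placed before the comma.

```python
def first_author_surname(author_field: str) -> str:
    """Take the first author from a 'Surname, I.;Surname, I.' field."""
    first_author = author_field.split(";")[0]
    return first_author.split(",")[0].strip()


def refined_query(title: str, year: int, author_field: str) -> str:
    """Exact-phrase title plus publication year and first author's surname."""
    # Drop embedded double quotes and collapse whitespace before quoting.
    cleaned = " ".join(title.replace('"', "").split())
    return f'"{cleaned}" {year} {first_author_surname(author_field)}'
```

For the record in Figure 4, the author field `Gonzalez, Y.;Mendoza, F.` yields the surname `Gonzalez`, so the refined query is the quoted Original Title followed by `1995 Gonzalez`.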


[5] A follow-up project could concentrate only on French records.
