

5 Conclusion


Our evaluation clearly shows that the ability to recognize compound words drastically improves the results. Manual inspection of the pruned ontologies also shows that a generic corpus closely related to the intended target domain, such as AG, leads to a larger upper level of the ontology, i.e. it allows the resulting ontology to be generalized.

The evaluation was based on the largest pruned ontology that was automatically extracted from the source ontology under the parameter variations used.

It would be interesting to see whether the largest pruned ontology actually contains all concepts that would be identified by an exhaustive manual assessment of the input ontology itself. Given the restrictions of time and cost, however, this is unrealistic. A first empirical manual assessment [5] has shown that a generic document set representing the surrounding area of the target domain (here the AG set) succeeds in identifying more of the non-relevant concepts. This higher rate could, on the other hand, only be achieved at the higher total cost of losing a larger set of domain-relevant concepts.

In conclusion, no clear statement can be derived concerning an optimal parameter setting. If the aim is to extract as much of the relevant information as possible from the source ontology, then the best approach is to apply the pruner with the least restrictive parameter setting and then have the result further assessed by subject experts. If, however, subject experts are not available and the goal is rather to retrieve a subset of the source ontology that includes the fewest possible irrelevant concepts, even at the risk of losing valuable concepts, then a more restrictive set of parameters should be chosen.

The experience collected from using different generic corpora shows that a slightly different compilation of the document sets might lead to different results. For our application it might therefore be interesting to identify three different domain document sets, each representing one sub-area of the target application, viz. food safety, animal health and plant health, apply them to the pruner in separate evaluation runs, and later merge the resulting ontologies. In further work, this evaluation should be applied in different domains, in order to see whether the statements and conclusions derived above still hold.
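The proposed per-sub-area procedure can be sketched as follows. This is only a minimal illustration under stated assumptions: the pruner itself is not reproduced here, so `toy_prune` is a hypothetical stand-in that keeps a concept whenever its label occurs in the documents, and merging by plain set union is one possible choice that the text above does not prescribe. The ontology is reduced to a set of concept labels plus a set of (child, parent) edges, and all names and sample data are invented for the example.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

# Simplified ontology: (concept labels, is-a edges as (child, parent) pairs).
Ontology = Tuple[Set[str], Set[Tuple[str, str]]]


def toy_prune(source: Ontology, documents: List[str]) -> Ontology:
    """Hypothetical stand-in for the pruner: keep concepts named in the documents."""
    concepts, edges = source
    words = {w for doc in documents for w in doc.lower().split()}
    kept = concepts & words
    kept_edges = {(c, p) for c, p in edges if c in kept and p in kept}
    return kept, kept_edges


def merge_ontologies(ontologies: Iterable[Ontology]) -> Ontology:
    """Merge pruned ontologies by taking the union of concepts and edges (assumption)."""
    concepts: Set[str] = set()
    edges: Set[Tuple[str, str]] = set()
    for c, e in ontologies:
        concepts |= c
        edges |= e
    return concepts, edges


def prune_per_sub_area(
    source: Ontology,
    corpora: Dict[str, List[str]],
    prune: Callable[[Ontology, List[str]], Ontology],
) -> Ontology:
    """Run the pruner once per sub-area document set, then merge the results."""
    return merge_ontologies(prune(source, docs) for docs in corpora.values())


# Invented sample data, for illustration only.
source = (
    {"entity", "food", "pathogen", "animal", "plant", "vehicle"},
    {("food", "entity"), ("pathogen", "entity"), ("animal", "entity"),
     ("plant", "entity"), ("vehicle", "entity")},
)
corpora = {
    "food safety": ["food pathogen entity"],
    "animal health": ["animal entity"],
    "plant health": ["plant entity"],
}
merged = prune_per_sub_area(source, corpora, toy_prune)
```

In this toy run the concept `vehicle` occurs in none of the three document sets and is therefore absent from the merged ontology, whereas concepts found by only one sub-area run survive the merge. A real merge would also have to decide how to treat concepts on which the separate runs disagree.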

Acknowledgements

We thank the UN FAO, Rome, for funding our work and for its substantial contribution to the requirements analysis and the compilation of test document sets. We also thank Andreas Hotho (AIFB, University of Karlsruhe) and Boris Motik (FZI) for their sound technical guidance throughout this project.

