

5. The Detection and Suggestion Module: An Algorithm for Term Relationship Revision


5.1 Overview of the algorithm

In this module, the system detects incorrect and inconsistently applied relationships and suggests the appropriate relationships for expert confirmation. We propose three techniques to handle this process: semantic relationship rules, noun phrase analysis, and WordNet alignment.

The outline of this algorithm is illustrated in Fig. 5, where T1, T2 and Rel denote, respectively, Term1, Term2, and the AGROVOC relationship between them.

AGROVOC_Cleaning_&_Refinement (T1, T2, Rel)
Input: Term1, Term2, Relationship
Output: New Relationship                                  ; Return new_relationship

1. If (Rel = BT or Rel = NT)
   Then If Agree_Expert_defined_Rules (T1, T2, Rel)
        Then return new_refined_relationship.             ; following the rules
        Else If Headword-Is-Compatible (T1, T2)
             Then return subclass/superclass relationship.
             Else If Is_Wordnet_HypernymPath (T1, T2)
                  Then return subclass/superclass relationship.
                  Else If Agree_Revision_Rules (T1, T2, Rel)
                       Then return new_relationship.      ; following the rules
                       Else return U.                     ; Un-refined

2. Else If (Rel = UF or Rel = USE)
   Then If Is_Wordnet_Synset (T1, T2)
        Then return synonym relationship.
        Else If Agree_Revision_Rules (T1, T2, Rel)
             Then return new_relationship.                ; following the rules
             Else return U.                               ; Un-refined

3. Else If (Rel = RT)
   Then If Agree_Revision_Rules (T1, T2, Rel)
        Then return new_relationship.                     ; following the rules
        Else return U.                                    ; Un-refined

Fig. 5 An Algorithm for Data Cleaning and Relationship Refinement
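The control flow of Fig. 5 can be sketched in Python as follows. This is a minimal illustration, not the system's actual implementation: the function name refine, the keyword arguments, and the convention that a rule returns None when it does not apply are all assumptions. The mapping of BT to subclassOf and NT to superclassOf is inferred from the Cabbage/Vegetable example in Section 5.2, and returning U for a relationship outside the three branches is also an assumption, since the figure does not cover that case.

```python
from typing import Callable, Optional

# "U" marks a pair the system could not refine; it is left for the expert.
UNREFINED = "U"

# Hypothetical signatures: the paper names these checks but does not
# specify how they are implemented.
Check = Callable[[str, str], bool]                 # (T1, T2) -> bool
Rule = Callable[[str, str, str], Optional[str]]    # (T1, T2, Rel) -> new rel or None


def refine(t1: str, t2: str, rel: str, *,
           expert_rules: Rule,
           revision_rules: Rule,
           headword_compatible: Check,
           wordnet_hypernym_path: Check,
           wordnet_synset: Check) -> str:
    """Suggest a refined relationship for (t1, rel, t2), following Fig. 5."""
    if rel in ("BT", "NT"):
        suggestion = expert_rules(t1, t2, rel)
        if suggestion is not None:
            return suggestion
        if headword_compatible(t1, t2) or wordnet_hypernym_path(t1, t2):
            # Assumed reading: "T1 BT T2" means T2 is broader than T1.
            return "subclassOf" if rel == "BT" else "superclassOf"
        return revision_rules(t1, t2, rel) or UNREFINED
    if rel in ("UF", "USE"):
        if wordnet_synset(t1, t2):
            return "synonym"
        return revision_rules(t1, t2, rel) or UNREFINED
    if rel == "RT":
        return revision_rules(t1, t2, rel) or UNREFINED
    # Not covered by Fig. 5; treated as un-refined here (assumption).
    return UNREFINED
```

Passing the checks in as callables keeps the dispatcher independent of how the rules, the head-word test, and the WordNet lookups are realized, so each can be replaced or stubbed separately.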

The relationship revision rules have been discussed in Section 4. Section 5.2 briefly describes the procedures based on noun phrase analysis and WordNet alignment, and Section 5.3 describes the verification tool.

5.2 Noun phrase analysis and WordNet alignment

The noun phrase analysis technique is used to analyze the surface form of a compound term's head word. If the head word of a term has the same surface form as its broader term, the system will apply the 'subclassOf'/'superclassOf' relationship to them. The system analyzes compound nouns using the following rule:

NP -> MOD NCN

MOD -> NCN, NPN, ADJ, ...

where MOD is a modifier, NCN is a common noun, NPN is a proper name, and ADJ is an adjective.

For example,

The compound noun analysis shows that the head word of Cow milk is milk, which has the same surface form as Milk, the broader term of Cow milk, so the system applies the <subclassOf> relationship to Cow milk and Milk. By contrast, the head word of Milk fat is fat, which is not compatible with the broader term, Milk. In this case, other techniques must be used to refine the relationship.
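The head-word comparison can be sketched as below. This is a simplified illustration under stated assumptions: it takes the last whitespace-separated token as the head word (consistent with NP -> MOD NCN), compares surface forms case-insensitively, and does no part-of-speech tagging or plural normalization. The function names head_word and headword_is_compatible are hypothetical.

```python
def head_word(term: str) -> str:
    """Return the last token of a term, taken here as its head word."""
    tokens = term.lower().split()
    return tokens[-1] if tokens else ""


def headword_is_compatible(narrower: str, broader: str) -> bool:
    """True if `narrower` is a compound whose head word has the same
    surface form as `broader` (case-insensitive comparison, an assumption)."""
    is_compound = len(narrower.split()) > 1
    return is_compound and head_word(narrower) == broader.strip().lower()


# The two cases discussed in the text:
print(headword_is_compatible("Cow milk", "Milk"))   # head word "milk" matches
print(headword_is_compatible("Milk fat", "Milk"))   # head word "fat" does not
```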

In the WordNet alignment step, the hypernym/hyponym relationships of WordNet are used to align the BT/NT relationships in AGROVOC, and the synsets of WordNet are used to align the UF/USE relationships. Because the relationships in WordNet are verified by experts and WordNet contains a large number of general-domain terms, including agricultural terms, it is a good resource for aligning some AGROVOC relationships, such as taxonomic and synonym relationships. (Other verified sources could be used as available, individually or in combination.) The step starts with the system retrieving the WordNet synset offset numbers of the two terms in an AGROVOC UF/USE pair. If both terms are found and they have the same synset offset number, the system applies the 'synonym' relationship to them. The system also looks up the AGROVOC broader term and narrower term in WordNet. If the broader term is an ancestor of the narrower term in the WordNet hierarchy, the system applies the 'subclassOf'/'superclassOf' relationship to them. For example,

Cabbage BT Vegetable

Query results for Cabbage and Vegetable in WordNet show that Cabbage is a hyponym of Cruciferous vegetable and Cruciferous vegetable is a hyponym of Vegetable. Fig. 6 shows the relationship of Vegetable and Cabbage in WordNet and AGROVOC.

Since Vegetable is an ancestor of Cabbage, the system will define Vegetable as superclassOf Cabbage. In the case of Milk NT Milk fat, the relationship is not refined by this technique because Milk and fat are in different hypernym paths in WordNet.

Fig. 6 The relationship between Vegetable and Cabbage in WordNet and AGROVOC
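The two alignment checks can be sketched with a small in-memory stand-in for WordNet. Everything in this block is illustrative: the HYPERNYMS table holds only the Cabbage chain described above, the SYNSETS table and its "toy-..." identifiers are placeholders rather than real WordNet offsets, and the function names are hypothetical. A real system would query WordNet itself instead of these dictionaries.

```python
from typing import Dict, List, Set

# Toy data (assumption): term -> direct hypernyms, from the Cabbage example.
HYPERNYMS: Dict[str, List[str]] = {
    "cabbage": ["cruciferous vegetable"],
    "cruciferous vegetable": ["vegetable"],
}

# Toy data (assumption): term -> synset ids. The ids are placeholders.
SYNSETS: Dict[str, Set[str]] = {
    "maize": {"toy-001"},
    "corn": {"toy-001"},
    "cabbage": {"toy-002"},
}


def is_hypernym_path(broader: str, narrower: str,
                     hypernyms: Dict[str, List[str]] = HYPERNYMS) -> bool:
    """True if `broader` is reachable from `narrower` via hypernym links."""
    target = broader.lower()
    seen: Set[str] = set()
    stack = [narrower.lower()]
    while stack:
        node = stack.pop()
        for parent in hypernyms.get(node, []):
            if parent == target:
                return True
            if parent not in seen:      # guard against revisiting nodes
                seen.add(parent)
                stack.append(parent)
    return False


def is_same_synset(t1: str, t2: str,
                   synsets: Dict[str, Set[str]] = SYNSETS) -> bool:
    """True if both terms are found and share at least one synset id."""
    return bool(synsets.get(t1.lower(), set()) & synsets.get(t2.lower(), set()))
```

With this toy table, is_hypernym_path("Vegetable", "Cabbage") holds because Vegetable is two hypernym steps above Cabbage, while a pair that is absent from the table, or lies on a different path, is left for the later steps of the algorithm.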

5.3 The Verification Tool

After the system has suggested new relationships for terms, the expert verifies the semantic relationship refinement results and defines the appropriate relationship for the cases that the system cannot handle. Fig. 7 shows the user interface for verifying the output of the system. The expert can verify the terms and relationships by querying by:

1. Term, to verify each term and its relationships to other terms, e.g., rice

2. Semantic relationship, e.g., <containsSubstance>

3. Rule, e.g., 'If class X is meat#1 and class Y is animal#1, and X RT Y, then X <madeFrom> Y'.

Fig. 7 Verification Tools
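The three query modes can be sketched as filters over the system's suggestions. This is only an illustration of the idea, not the tool's implementation: the Suggestion record, its field names, the query function, and the rule labels are hypothetical, and the Beef/Cattle row is an invented example of the meat/animal rule.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Suggestion:
    term1: str
    term2: str
    old_rel: str    # original AGROVOC relationship (BT, NT, UF, USE, RT)
    new_rel: str    # suggested semantic relationship, or "U" if un-refined
    rule: str       # label of the technique or rule that produced it


def query(suggestions: Iterable[Suggestion], *,
          term: Optional[str] = None,
          relationship: Optional[str] = None,
          rule: Optional[str] = None) -> List[Suggestion]:
    """Return the suggestions matching every criterion that is given."""
    result = []
    for s in suggestions:
        if term is not None and term.lower() not in (s.term1.lower(), s.term2.lower()):
            continue
        if relationship is not None and s.new_rel != relationship:
            continue
        if rule is not None and s.rule != rule:
            continue
        result.append(s)
    return result


# Illustrative records only; the last row is invented for the example.
SUGGESTIONS = [
    Suggestion("Cow milk", "Milk", "BT", "subclassOf", "headword"),
    Suggestion("Cabbage", "Vegetable", "BT", "subclassOf", "wordnet"),
    Suggestion("Beef", "Cattle", "RT", "madeFrom", "meat-animal"),
]
```

For instance, query(SUGGESTIONS, term="milk") returns only the Cow milk record, and query(SUGGESTIONS, relationship="subclassOf") returns the first two.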

