We ran an experiment to test the Training Rules technique, using 100 examples for five semantic relationships. The experiment produced around 10 classification rules. Table 4 shows the experimental results obtained with these rules, together with those for expert-defined rules, noun phrase (NP) analysis, and WordNet Alignment.
Table 4 The experimental results classified by relationship
| Relationship | No. | No. of refinement | Expert-defined rules: No. | Expert-defined rules: PC(%) | NP Analysis: No. | NP Analysis: PC(%) | WordNet Alignment: No. | WordNet Alignment: PC(%) | Training Rules: No. | Training Rules: PC(%) |
|---|---|---|---|---|---|---|---|---|---|---|
| BT/NT | 32176 | 21072 | 16587 | 100% | 2062 | 95% | 2423 | 95% | ** | ** |
| USE/UF | 21605 | 3553 | - | - | - | - | 3553 | 70% | ** | ** |
| RT | 27589 | 1420 | 622* | 100%* | - | - | - | - | 798* | 72%* |
| Total | 81370 | 26045 | 17209 | 100% | 2062 | 95% | 5976 | 80% | 798* | 72%* |
Remarks: - indicates that the technique cannot revise this relationship; * indicates that the experiment was run on only part of the data; ** indicates that the experiment is at an initial stage.
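The Total row of Table 4 can be cross-checked against the per-relationship rows. The sketch below copies the counts from the table and assumes, as an illustration only, that the Total precision for WordNet Alignment is a count-weighted mean of the row precisions; the variable names are hypothetical.

```python
# Counts and precision estimates copied from Table 4.
# "wordnet" holds (number of refinements, estimated precision), or None
# where the technique cannot revise the relationship.
rows = {
    "BT/NT": {"total": 32176, "refined": 21072, "wordnet": (2423, 0.95)},
    "USE/UF": {"total": 21605, "refined": 3553, "wordnet": (3553, 0.70)},
    "RT": {"total": 27589, "refined": 1420, "wordnet": None},
}

total = sum(r["total"] for r in rows.values())
refined = sum(r["refined"] for r in rows.values())

# Assumption: the Total precision is a count-weighted mean of the rows.
wordnet_rows = [r["wordnet"] for r in rows.values() if r["wordnet"] is not None]
wn_count = sum(n for n, _ in wordnet_rows)
wn_precision = sum(n * p for n, p in wordnet_rows) / wn_count

print(total, refined, wn_count, round(wn_precision * 100))
```

Under that assumption the computed values agree with the Total row: 81370 relationships, 26045 refinements, and 5976 WordNet Alignment refinements at roughly 80% precision.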
Based on an expert's review of a small sample of the data, we made initial rough estimates of the precision of each method. Precision was estimated at about 100% for the Expert-defined Rules technique and 95% for NP Analysis. The WordNet Alignment technique was estimated to be lower, at about 94% precision, because some synonym relationships in WordNet should be replaced with the 'abbreviationof' relationship. For example, in AMP <synonym> Adenosine monophosphate, <abbreviationof> should be used instead. The precision of the Training Rules technique was estimated at about 72%. Sources of error include ambiguity in the concept classes used as arguments of a rule, as in 'If class X is food#1 and class Y is food#1, and X RT Y, then X <usedToMake> Y'. Because X and Y belong to the same concept class, the system cannot tell which term plays which role and may generate erroneous relationships, e.g., pork <usedToMake> hams and hams <usedToMake> pork. Such cases can be revised only by the expert.
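The same-class ambiguity can be illustrated with a minimal sketch. It is not the system's actual rule engine: the `Rule` type, the `apply_rule` function, and the class assignments are hypothetical, and it assumes that the symmetric RT link is stored in both directions.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    """Hypothetical rule: if X is class_x, Y is class_y and X source_rel Y,
    then X target_rel Y."""
    class_x: str
    class_y: str
    source_rel: str
    target_rel: str


def apply_rule(rule, triples, concept_class):
    """Return a refined triple for every input triple that matches the rule."""
    refined = []
    for x, rel, y in triples:
        if (rel == rule.source_rel
                and concept_class[x] == rule.class_x
                and concept_class[y] == rule.class_y):
            refined.append((x, rule.target_rel, y))
    return refined


# Assumed class assignments, following the example in the text.
concept_class = {"pork": "food#1", "hams": "food#1"}

# Assumption: RT is symmetric, so both directions are present.
rt_triples = [("pork", "RT", "hams"), ("hams", "RT", "pork")]

rule = Rule("food#1", "food#1", "RT", "usedToMake")
refined = apply_rule(rule, rt_triples, concept_class)
for triple in refined:
    print(triple)
```

Both directions satisfy the rule because its two argument classes are identical, so the sketch emits pork usedToMake hams as well as the erroneous hams usedToMake pork. Nothing in the rule itself can separate the two.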
Around 55325 relationships remain unrevised. We will revise only half of them, because the inverse relationships will be set automatically. We plan to finish the revision in one year with four experts.
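The remaining workload follows from the totals in Table 4. The short calculation below is only a rough sketch: it assumes that exactly half of the remaining relationships need manual revision and that the work is split evenly among the four experts, which the plan does not state.

```python
import math

total_relationships = 81370   # Total "No." in Table 4
refined_so_far = 26045        # Total "No. of refinement" in Table 4

remaining = total_relationships - refined_so_far

# Assumption: inverses are set automatically, so about half need revision.
to_revise = math.ceil(remaining / 2)

# Assumption: an even split among the four experts.
experts = 4
per_expert = math.ceil(to_revise / experts)

print(remaining, to_revise, per_expert)
```

Under these assumptions, each expert would revise roughly 6,900 relationships over the year.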