PDF Ebook Validation of Expert System Performance

Submitted by antoq on Wed, 12/16/2009 - 06:21

Most definitions of an expert system include some reference to the ability of the system to perform at a level close to human expert performance. Yet the validation of expert systems, that is, the testing of systems so as to ascertain that they achieve an acceptable level of performance, has (with a few exceptions) been ad-hoc, informal, and in some cases of dubious value. This paper attempts to establish validation as an important concern in expert systems research and development. The problems in validating an expert system are discussed, and a number of methods for validating expert systems, both qualitative and quantitative, are presented.

Most definitions of an expert system (ES) include some reference to the ability of the system to perform at a level close to human expert performance. Yet the validation of expert systems, that is, the testing of systems so as to ascertain that they achieve an acceptable level of performance, has (with a few exceptions) been ad-hoc, informal, and in some cases of dubious value.

Typically, the performance of an ES has been validated by running a number of test cases through the system, and comparing the "result" (i.e., the classification, final certainty factors, the advice given, or whatever) from the system against either known results or expert opinion. A percentage is calculated for the success rate of the system, and subjective judgement is used both to analyse this and to explain the failure of the ES where its result contradicted the known result or expert opinion. Examples of this approach range from an early validation of MYCIN [1] to a recently reported validation of a chest pain diagnosis system called EMERGE [2]. This simple approach presents a number of problems. The final percentage obtained is a function of the choice of test cases, and its accuracy is a function of the number of test cases. Where the system is compared against the expert on whose knowledge the system is built, as happened with PROSPECTOR [3], the value of the so-called validation is dubious.
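To make the point about the number of test cases concrete, the following is a minimal sketch in Python, not a method taken from the paper. It computes the success rate described above and attaches a Wilson score interval to it, which is one common way (an assumption here) to express how uncertain a percentage is for a given number of cases. The function names `success_rate` and `wilson_interval` and the sample data are hypothetical.

```python
import math


def success_rate(system_results, reference_results):
    """Fraction of test cases where the system agrees with the reference.

    The reference may be known results or expert opinion; both lists are
    hypothetical inputs used only for illustration.
    """
    if not system_results or len(system_results) != len(reference_results):
        raise ValueError("need two non-empty lists of equal length")
    matches = sum(s == r for s, r in zip(system_results, reference_results))
    return matches / len(system_results)


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


if __name__ == "__main__":
    # Same observed success rate (80%), different numbers of test cases.
    for successes, n in [(8, 10), (80, 100)]:
        low, high = wilson_interval(successes, n)
        print(f"{successes}/{n}: interval width {high - low:.3f}")
```

With the same observed rate, the interval for 100 cases is much narrower than the one for 10 cases, which is the sense in which the accuracy of the percentage depends on the number of test cases. The interval says nothing about whether the cases were well chosen, so the dependence on the choice of test cases remains.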

As many developed expert systems have been research prototypes, the purpose of the validation has often been to qualitatively measure system performance (e.g., see Miller et al. [4] on the medical diagnosis system INTERNIST-1), or validation has simply been part of an overall evaluation aimed at assessing the value of an ES to a particular domain (e.g., see Kulikowski and Weiss [5] on the medical diagnosis system CASNET, or Hansen and Messier [6] on the auditing ES EDP-XPERT). However, the performance of expert systems that are to be used on a regular basis, particularly in critical areas, must be validated very carefully. Thus an increase in the use of formal validation methods can be seen in the development of some implemented systems. Both a later validation of MYCIN [7], and a validation of the chemotherapy adviser ONCOCIN [8], like MYCIN developed at Stanford, used formal methods backed up by statistical tests. The performance of the VAX configuration system R1/XCON also underwent some elements of formal validation (see Bachant and McDermott [9], and the discussion in Gaschnig et al. [10]).
