Sunday, November 11, 2007
Artificial intelligence Section C
Expert Systems
Learning Objectives
After reading this unit you should appreciate the following:
Need and Justification of Expert Systems
Knowledge Acquisition
Case Studies
MYCIN
R1
Need and Justification of Expert Systems
This unit describes the basic architecture of knowledge-based systems with emphasis placed on expert systems. Expert systems are a recent product of artificial intelligence. They began to emerge as university research systems during the early 1970s. They have now become one of the most important innovations of AI, since they have been shown to be successful commercial products as well as interesting research tools.
Expert systems have proven effective in a number of problem domains that normally require the kind of intelligence possessed by a human expert. The areas of application are almost endless. Wherever human expertise is needed to solve a problem, expert systems are among the most likely options to be sought. Application domains include law, chemistry, biology, engineering, manufacturing, aerospace, military operations, finance, banking, meteorology, geology, geophysics, and more.
In this chapter we explore expert system architectures and related building tools. We also look at a few of the more important application areas. The material is intended to acquaint the reader with the basic concepts underlying expert systems and to provide enough of the fundamentals needed to build basic systems, pursue further studies, and conduct research in the area.
An expert system is a set of programs that manipulate encoded knowledge to solve problems in a specialized domain that normally requires human expertise. An expert system’s knowledge is obtained from expert sources and coded in a form suitable for the system to use in inference or reasoning processes. The expert knowledge must be obtained from specialists or other sources of expertise, such as texts, journal articles, and databases. This type of knowledge usually requires much training and experience in specialized fields such as medicine, geology, system configuration, or engineering design. Once a sufficient body of expert knowledge has been acquired, it must be encoded in some form and placed in a knowledge base, then tested and refined continually throughout the life of the system.
Characteristic Features of Expert Systems
Expert systems differ from conventional computer systems in several important ways.
1. Expert systems use knowledge rather than data to control the solution process. “In the knowledge lies the power” is a theme repeatedly followed and supported throughout this book. Much of the knowledge used is heuristic in nature rather than algorithmic.
2. The knowledge is encoded and maintained as an entity separate from the control program. As such, it is not compiled together with the control program itself. This permits the incremental addition and modification (refinement) of the knowledge base without recompilation of the control programs. Furthermore, it is possible in some cases to use different knowledge bases with the same control programs to produce different types of expert systems. Such systems are known as expert system shells, as they may be loaded with different knowledge bases.
3. Expert systems are capable of explaining how a particular conclusion was reached, and why requested information is needed during a consultation. This is important as it gives the user a chance to assess and understand the system’s reasoning ability, thereby improving the user’s confidence in the system.
4. Expert systems use symbolic representations for knowledge (rules, networks, or frames) and perform their inference through symbolic computations that closely resemble manipulations of natural language. (An exception to this is the expert system based on neural network architectures.)
5. Expert systems often reason with metaknowledge (knowledge about knowledge); that is, they reason about their own knowledge, its limits, and their capabilities.
MYCIN
The development of MYCIN began at Stanford University. MYCIN is an expert system that diagnoses infectious blood diseases and determines a recommended list of therapies for the patient. As part of the Heuristic Programming Project at Stanford, several projects directly related to MYCIN were also completed, including a knowledge acquisition component called TEIRESIAS, a tutorial component called GUIDON, and a shell component called EMYCIN (for Essential MYCIN). EMYCIN was used to build other diagnostic systems including PUFF, a diagnostic expert for pulmonary diseases. EMYCIN also became the design model for several commercial expert system building tools.
MYCIN’s performance improved significantly over a period of several years as additional knowledge was added. Tests indicate that MYCIN’s performance now equals or exceeds that of experienced physicians. The initial MYCIN knowledge base contained only about 200 rules. This number was gradually increased to more than 600 rules by the early 1980s. The added rules significantly improved MYCIN’s performance, leading to a 65% success record that compared favorably with experienced physicians, who demonstrated only an average 60% success rate.
Subgoaling in MYCIN
MYCIN is a heterogeneous program, consisting of many different modules. Part of MYCIN's control structure performs a quasi-diagnostic function. The goals to be achieved are not physical goals involving the movement of objects in space, but reasoning goals that involve the establishment of diagnostic hypotheses.
This section concentrates upon the diagnostic module of MYCIN, giving a simplified account of its function, structure and runtime behavior.
Treating blood infections
First, we need to give a brief description of MYCIN's domain: the treatment of blood infections. This description presupposes no specialized medical knowledge on the part of the reader. But, as with any expert system, having some understanding of the domain is crucial to understanding what the program does.
An 'anti-microbial agent' is any drug designed to kill bacteria or arrest their growth. Some agents are too toxic for therapeutic purposes, and there is no single agent effective against all bacteria. The selection of therapy for bacterial infection can be viewed as a four-part decision process:
Deciding if the patient has a significant infection;
Determining the (possible) organism(s) involved;
Selecting a set of drugs that might be appropriate;
Choosing the most appropriate drug or combination of drugs.
Samples taken from the site of infection are sent to a microbiology laboratory for culture, that is, an attempt to grow organisms from the sample in a suitable medium.
Early evidence of growth may allow a report of the morphological or staining characteristics of the organism. However, even if an organism is identified, the range of drugs it is sensitive to may be unknown or uncertain.
MYCIN is often described as a diagnostic program, but this is not so. Its purpose is to assist a physician who is not an expert in the field of antibiotics with the treatment of blood infections. In doing so, it develops diagnostic hypotheses and weights them, but it need not necessarily choose between them. Work on MYCIN began in 1972 as a collaboration between the medical and AI communities at Stanford University. The most complete single account of this work is Shortliffe (1976).
There have been a number of extensions, revisions and abstractions of MYCIN since 1976, but the basic version has five components. Figure 8.1 shows these components and the basic pattern of information flow between the modules.
(1) A knowledge base, which contains factual and judgmental knowledge about the domain.
(2) A dynamic patient database containing information about a particular case.
(3) A consultation program, which asks questions, draws conclusions, and gives advice about a particular case based on the patient data and the static knowledge.
(4) An explanation program, which answers questions and justifies its advice, using static knowledge and a trace of the program’s execution.
(5) A knowledge acquisition program for adding new rules and changing existing ones.
The system consisting of components (1)-(3) is the problem-solving part of MYCIN, which generates hypotheses with respect to the offending organisms, and makes therapy recommendations based on these hypotheses.
Figure 8.1: Organization of MYCIN
MYCIN's knowledge base
MYCIN's knowledge base is organized around a set of rules of the general form
if condition_1 and ... and condition_m hold
then draw conclusion_1 and ... and conclusion_n
encoded as data structures of the LISP programming language.
The ORGRULE shown below is the English translation of a typical MYCIN rule for inferring the class of an organism. This translation is provided by the program itself. Such rules are called ORGRULES and they attempt to cover such organisms as streptococcus, pseudomonas, and entero-bacteria.
The rule says that if an isolated organism appears rod-shaped, stains in a certain way, and grows in the presence of oxygen, then it is more likely to be in the class entero-bacteria. The number 0.8 is called the tally of the rule, which says how certain the conclusion is, given that the conditions are satisfied. The use of the tally is explained below. Each rule of this kind can be thought of as encoding a piece of human knowledge whose applicability depends only upon the context established by the conditions of the rule.
The conditions of a rule can also be satisfied with varying degrees of certainty; the import of such rules is roughly as follows:
if condition_1 holds with certainty x_1 ... and condition_m holds with certainty x_m
then draw conclusion_1 with certainty y_1 and ... and conclusion_n with certainty y_n
where the certainty associated with each conclusion is a function of the combined certainties of the conditions and the tally, which is meant to reflect our degree of confidence in the application of the rule.
In summary, a rule is a premise-action pair, and such rules are sometimes called 'productions' for purely historical reasons. Premises are conjunctions of conditions, and their certainty is a function of the certainty of these conditions. Conditions are either propositions, which evaluate to truth or falsehood with some degree of certainty (for example 'the organism is rod-shaped'), or disjunctions of such conditions. Actions are either conclusions to be drawn with some appropriate degree of certainty, for example the identity of some organism, or instructions to be carried out, for example compiling a list of therapies.
We will explore the details of how rules are interpreted and scheduled for application in the following sections, but first we must look at MYCIN's other structures for representing medical knowledge.
IF 1) The stain of the organism is gramneg, and
2) The morphology of the organism is rod, and
3) The aerobicity of the organism is aerobic
THEN There is strongly suggestive evidence (.8) that
the class of the organism is entero-bacteria
A MYCIN ORGRULE for drawing the conclusion Enterobacteriaceae
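To make the idea of a rule held as data more concrete, here is a minimal Python sketch. MYCIN itself encoded its rules as LISP data structures; the `Rule` class, its field names, and the `to_english` helper are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One premise-action pair (hypothetical layout, not MYCIN's LISP form)."""
    premise: tuple      # conjunction of (parameter, value) conditions
    conclusion: tuple   # (parameter, value) to be concluded
    tally: float        # certainty attached to the rule itself


# The ORGRULE shown above, written as data rather than as program code.
ORGRULE = Rule(
    premise=(("stain", "gramneg"), ("morphology", "rod"), ("aerobicity", "aerobic")),
    conclusion=("class", "entero-bacteria"),
    tally=0.8,
)


def to_english(rule):
    """Render a rule roughly in the style of the English translation."""
    conds = " and ".join(f"the {p} of the organism is {v}" for p, v in rule.premise)
    param, value = rule.conclusion
    return f"IF {conds} THEN ({rule.tally}) the {param} of the organism is {value}"


print(to_english(ORGRULE))
```

Because the rule is ordinary data, it can be inspected, translated, or edited without touching the program that interprets it.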
In addition to rules, the knowledge base also stores facts and definitions in various forms:
simple lists, for example the list of all organisms known to the system;
knowledge tables, which contain records of certain clinical parameters and the values they take under various circumstances, for example the morphology (structural shape) of every bacterium known to the system;
a classification system for clinical parameters according to the context in which they apply, for example whether they are attributes of patients or organisms.
Much of the knowledge not contained in the rules resides in the properties associated with the 65 clinical parameters known to MYCIN. For example, shape is an attribute of organisms which can take on various values, such as 'rod' and 'coccus.' Parameters are also assigned properties by the system for its own purposes. The main ones either (i) help to monitor the interaction with the user, or (ii) provide indexes which guide the application of rules.
Patient information is stored in a structure called the context tree, which serves to organize case data. Figure 8.2 shows a context tree representing a particular patient, PATIENT-1, with three associated cultures (samples, such as blood samples, from which organisms may be isolated) and a recent operative procedure that may need to be taken into account (for example, because drugs were involved, or because the procedure involves particular risks of infection). Associated with cultures are organisms that are suggested by laboratory data, and associated with organisms are drugs that are effective against them.
Imagine that we have the following data stored in a record structure associated with the node for ORGANISM-1:
GRAM = (GRAMNEG 1.0)
MORPH = (ROD .8) (COCCUS .2)
AIR = (AEROBIC .6)
with the following meaning:
the Gram stain of ORGANISM-1 is definitely Gram negative;
ORGANISM-1 has a rod morphology with certainty 0.8 and a coccus morphology with certainty 0.2;
ORGANISM-1 is aerobic (grows in air) with certainty 0.6.
Figure 8.2: A typical MYCIN context tree
Suppose now that the rule shown above is applied. We want to compute the certainty that all three conditions of the rule
IF 1) the stain of the organism is gramneg, and
2) the morphology of the organism is rod, and
3) the aerobicity of the organism is aerobic
THEN there is strongly suggestive evidence (0.8) that the class of the organism is entero-bacteria.
are satisfied by the data. The certainty of the individual conditions is 1.0, 0.8 and 0.6 respectively, and the certainty of their conjunction is taken to be the minimum of their individual certainties, hence 0.6.
The idea behind taking the minimum is that we are only confident in a conjunction of conditions to the extent that we are confident in its least inspiring element. This is rather like saying that a chain is only as strong as its weakest link. By an inverse argument, we argue that our confidence in a disjunction of conditions is as strong as the strongest alternative, that is, we take the maximum. This convention forms part of a style of inexact reasoning called fuzzy logic.
In this case, we draw the conclusion that the class of the organism is entero-bacteria with a degree of certainty equal to
0.6 x 0.8 = 0.48
The 0.6 represents our degree of certainty in the conjoined conditions, while the 0.8 stands for our degree of certainty in the rule application. These degrees of certainty are called certainty factors (CFs). Thus, in the general case,
CF(action) = CF(premise) x CF(rule).
This raises the broader question of how to represent uncertainty. It turns out that the CF model is not always in agreement with the theory of probability; in other words, it is not always correct from a mathematical point of view. However, the computation of certainty factors is much more tractable than the computation of the right probabilities, and the deviation does not appear to be very great in the MYCIN application.
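The arithmetic for ORGANISM-1 can be checked with a short Python sketch. The dictionary layout and the function names (`cf_conjunction`, `cf_disjunction`, `cf_action`) are assumptions made for this example; only the min, max and product conventions come from the text.

```python
def cf_conjunction(cfs):
    """Certainty of 'A and B and ...': the weakest link."""
    return min(cfs)


def cf_disjunction(cfs):
    """Certainty of 'A or B or ...': the strongest alternative."""
    return max(cfs)


def cf_action(premise_cf, rule_cf):
    """CF(action) = CF(premise) x CF(rule)."""
    return premise_cf * rule_cf


# Data recorded for ORGANISM-1 (hypothetical dictionary layout).
organism_1 = {
    "GRAM": {"GRAMNEG": 1.0},
    "MORPH": {"ROD": 0.8, "COCCUS": 0.2},
    "AIR": {"AEROBIC": 0.6},
}

conditions = [("GRAM", "GRAMNEG"), ("MORPH", "ROD"), ("AIR", "AEROBIC")]
premise_cf = cf_conjunction([organism_1[p].get(v, 0.0) for p, v in conditions])
conclusion_cf = cf_action(premise_cf, 0.8)

print(premise_cf)               # 0.6
print(round(conclusion_cf, 2))  # 0.48
```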
MYCIN’s control structure
MYCIN has a top-level goal rule that defines the whole task of the consultation system. It is paraphrased below:
IF 1) there is an organism which requires therapy and
2) consideration has been given to any other organisms requiring therapy
THEN compile a list of possible therapies, and determine the best one in this list.
A consultation session follows a simple two-step procedure:
• create the patient context as the top node in the context tree;
• attempt to apply the goal rule to this patient context.
Applying the rule involves evaluating its premise, which involves finding out if there is indeed an organism which requires therapy. In order to find this out, it must first find out if there is indeed an organism present which is associated with a significant disease. This information can either be obtained from the user directly, or via some chain of inference based on symptoms and laboratory data provided by the user.
The consultation is essentially a search through a tree of goals. The top goal at the root of the tree is the action part of the goal rule, that is, the recommendation of a drug therapy. Subgoals further down the tree include determining the organism involved and seeing if it is significant. Many of these subgoals have subgoals of their own, such as finding out the stain properties and morphology of an organism. The leaves of the tree are fact goals, such as laboratory data, which cannot be deduced.
A special kind of structure, called an AND/OR tree, is very useful for representing the way in which goals can be expanded into subgoals by a program. The basic idea is that root node of the tree represents the main goal, terminal nodes represent primitive actions that can be carried out, while non-terminal nodes represent subgoals that are susceptible to further analysis. There is a simple correspondence between this kind of analysis and the analysis of rule sets.
Consider the following set of condition-action rules:
if X has BADGE and X has GUN, then X is POLICE
if X has REVOLVER or X has PISTOL or X has RIFLE, then X has GUN
if X has SHIELD, then X has BADGE
We can represent this rule set in terms of a tree of goals, so long as we maintain the distinction between conjunctions and disjunctions of subgoals. Thus, we draw an arc between the links connecting the nodes BADGE and GUN with the node POLICE, to signify that both subgoals BADGE and GUN must be satisfied in order to satisfy the goal POLICE. However, there is no arc between the links connecting REVOLVER, PISTOL and RIFLE with GUN, because satisfying any one of these will satisfy GUN. A subgoal such as BADGE can have a single child, SHIELD, signifying that a shield counts as a badge.
The AND/OR tree in Figure 8.3 can be thought of as a way of representing the search space for POLICE, by enumerating the ways in which different operators can be applied in order to establish POLICE as true.
Figure 8.3: Representing a rule set as an AND/OR tree
This kind of control structure is called backward chaining, since the program reasons backward from what it wants to prove towards the facts that it needs, rather than reasoning forward from the facts that it possesses. In MYCIN, goals were achieved by breaking them down into subgoals to which operators could be applied. Searching for a solution by backward reasoning is generally more focused than forward chaining, as we saw earlier, since one only considers potentially relevant facts.
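Backward chaining over the POLICE rule set can be sketched in a few lines of Python. The dictionary encoding of the AND/OR tree and the function name `prove` are assumptions for illustration: each goal maps to a list of alternatives (OR), and each alternative is a list of subgoals that must all hold (AND).

```python
# Goal -> list of alternatives (OR); each alternative is a list of subgoals (AND).
RULES = {
    "POLICE": [["BADGE", "GUN"]],
    "GUN": [["REVOLVER"], ["PISTOL"], ["RIFLE"]],
    "BADGE": [["SHIELD"]],
}


def prove(goal, facts):
    """Reason backward from `goal` toward the known `facts`."""
    if goal in facts:
        return True
    # Try each alternative; an alternative succeeds if all its subgoals do.
    return any(
        all(prove(subgoal, facts) for subgoal in alternative)
        for alternative in RULES.get(goal, [])
    )


print(prove("POLICE", {"SHIELD", "PISTOL"}))  # True
print(prove("POLICE", {"SHIELD"}))            # False
```

Note that only facts reachable from the goal are ever examined, which is the sense in which backward reasoning is more focused.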
MYCIN's control structure uses an AND/OR tree, and is quite simple as AI programs go:
(1) Each subgoal set up is always a generalized form of the original goal. So, if the subgoal is to prove the proposition that the identity of the organism is E. Coli, then the subgoal actually set up is to determine the identity of the organism. This initiates an exhaustive search on a given topic, which collects all of the available evidence about organisms.
(2) Every rule relevant to the goal is used, unless one of them succeeds with certainty. If more than one rule suggests a conclusion about a parameter, such as the nature of the organism, then their results are combined. If the evidence about a hypothesis falls between -0.2 and +0.2, it is regarded as inconclusive, and the answer is treated as unknown.
(3) If the current subgoal is a leaf node, then attempt to satisfy the goal by asking the user for data. Else set up the subgoal for further inference, and go to (1).
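The three points above can be put together in a rough Python sketch. Everything here is a simplification under stated assumptions: the rule dictionaries, the `find_out` name, the `ask` callback and the caller-supplied `combine` function are hypothetical, and the example passes in only the positive-evidence case of the combining function described under Evidence Combination.

```python
def find_out(param, rules, ask, combine, known=None):
    """Gather all evidence about `param`, returning {value: certainty}."""
    known = {} if known is None else known
    if param in known:
        return known[param]
    relevant = [r for r in rules if r["concludes"][0] == param]
    if not relevant:
        # Leaf node: nothing can be deduced, so ask the user for data.
        known[param] = ask(param)
        return known[param]
    evidence = {}
    for rule in relevant:
        # Subgoals are generalized: we ask about the whole parameter,
        # not about one particular value of it.
        premise_cf = min(
            find_out(p, rules, ask, combine, known).get(v, 0.0)
            for p, v in rule["premise"]
        )
        value = rule["concludes"][1]
        cf = premise_cf * rule["cf"]
        evidence[value] = combine(evidence[value], cf) if value in evidence else cf
        if evidence[value] == 1.0:
            break  # a rule succeeded with certainty
    # Evidence between -0.2 and +0.2 is inconclusive: treat it as unknown.
    known[param] = {v: cf for v, cf in evidence.items() if abs(cf) > 0.2}
    return known[param]


RULES = [{
    "premise": [("stain", "gramneg"), ("morphology", "rod"), ("aerobicity", "aerobic")],
    "concludes": ("class", "entero-bacteria"),
    "cf": 0.8,
}]
# Stand-in for the user's answers to questions asked at leaf nodes.
ANSWERS = {
    "stain": {"gramneg": 1.0},
    "morphology": {"rod": 0.8, "coccus": 0.2},
    "aerobicity": {"aerobic": 0.6},
}

result = find_out("class", RULES, ANSWERS.__getitem__, lambda x, y: x + y - x * y)
print({value: round(cf, 2) for value, cf in result.items()})
```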
The selection of therapy takes place after this diagnostic process has run its course. It consists of two phases: selecting candidate drugs, and then choosing a preferred drug, or combination of drugs, from this list.
Evidence Combination
In MYCIN, two or more rules might draw conclusions about a parameter with different weights of evidence. Thus one rule might conclude that the organism is E. Coli with a certainty of 0.8, while another might conclude from other data that it is E. Coli with a certainty of 0.5 or -0.8. In the case of a certainty less than zero, the evidence is actually against the hypothesis.
Let X and Y be the weights derived from the application of different rules. MYCIN combines these weights using the following formula to yield a single certainty factor:
CF(X, Y) = X + Y - XY, if X > 0 and Y > 0
CF(X, Y) = X + Y + XY, if X < 0 and Y < 0
CF(X, Y) = (X + Y) / (1 - min(|X|, |Y|)), otherwise
where |X| denotes the absolute value of X.
One can see what is happening on an intuitive basis. If the two pieces of evidence both confirm (or disconfirm) the hypothesis, then confidence in the hypothesis goes up (or down). If the two pieces of evidence are in conflict, then the denominator dampens the effect.
This formula can be applied more than once, if several rules draw conclusions about the same parameter. It is commutative, so it does not matter in what order weights are combined.
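As a sketch, the combining function can be written in Python as follows. The piecewise form used here is the one usually given for MYCIN-style certainty factors; the function name `combine_cf` and the explicit guard for total contradiction are assumptions of this example.

```python
def combine_cf(x, y):
    """Combine two certainty factors in [-1, 1] about the same hypothesis."""
    if x > 0 and y > 0:
        return x + y - x * y          # both confirm: confidence goes up
    if x < 0 and y < 0:
        return x + y + x * y          # both disconfirm: confidence goes down
    denominator = 1 - min(abs(x), abs(y))
    if denominator == 0:
        # +1 against -1: total contradiction, left undefined in this sketch.
        raise ValueError("cannot combine certainty +1 with certainty -1")
    return (x + y) / denominator      # conflicting (or zero) evidence


print(round(combine_cf(0.8, 0.5), 2))    # 0.9
print(round(combine_cf(0.8, -0.8), 2))   # 0.0
print(round(combine_cf(0.8, -0.5), 2))   # 0.6
```

Swapping the arguments gives the same result in each case, which is the commutativity property mentioned above.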
IF the identity of the organism is pseudomonas
THEN I recommend therapy from among the following drugs:
1 COLISTIN (.98)
2 POLYMYXIN (.96)
3 GENTAMICIN (.96)
4 CARBENICILLIN (.65)
5 SULFISOXAZOLE (.64)
A MYCIN therapy rule
The special goal rule at the top of the AND/OR tree does not lead to a conclusion, but instigates actions, assuming that the conditions in the premise are satisfied. At this point, MYCIN's therapy rules for selecting drug treatments come into play; they contain sensitivity information for the various organisms known to the system. A sample therapy rule is given above.
The numbers associated with the drugs are the probabilities that a pseudomonas will be sensitive to the indicated drug according to medical statistics. The preferred drug is selected from the list according to criteria that attempt to screen for contra-indications of the drug and minimize the number of drugs administered, in addition to maximizing sensitivity. The user can go on asking for alternative therapies until MYCIN runs out of options, so the pronouncements of the program are not definitive.
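A much simplified sketch of this selection step is given below in Python. It only ranks drugs by sensitivity and skips contra-indicated ones; MYCIN's actual criteria were richer. The `therapies` generator and the `contraindicated` argument are hypothetical names introduced for this example.

```python
# Sensitivities from the sample therapy rule for pseudomonas.
SENSITIVITY = {
    "COLISTIN": 0.98,
    "POLYMYXIN": 0.96,
    "GENTAMICIN": 0.96,
    "CARBENICILLIN": 0.65,
    "SULFISOXAZOLE": 0.64,
}


def therapies(sensitivity, contraindicated=()):
    """Yield (drug, sensitivity) pairs, best first, skipping screened-out drugs."""
    ranked = sorted(sensitivity.items(), key=lambda item: item[1], reverse=True)
    for drug, probability in ranked:
        if drug not in contraindicated:
            yield drug, probability


# The user can keep asking for the next alternative until none remain.
options = therapies(SENSITIVITY, contraindicated={"COLISTIN"})
print(next(options))  # ('POLYMYXIN', 0.96)
print(next(options))  # ('GENTAMICIN', 0.96)
```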
Applications of Expert Systems
Since the introduction of these early expert systems, the range and depth of applications has broadened dramatically. Applications can now be found in almost all areas of business and government. They include such areas as
Different types of medical diagnoses (internal medicine, pulmonary diseases, infectious blood diseases, and so on)
Diagnosis of complex electronic and electromechanical systems
Diagnosis of diesel electric locomotion systems
Diagnosis of software development projects.
Planning experiments in biology, chemistry, and molecular genetics
Forecasting crop damage
Identification of chemical compound structures and chemical compounds.
Location of faults in computer and communications systems
Scheduling of customer orders, job shop production operations, computer resources for operating system, and various manufacturing tasks.
Evaluation of loan applicants for lending institutions
Assessment of geologic structures from dip meter logs.
Analysis of structural systems for design or as a result of earthquake damage
The optimal configuration of components to meet given specifications for a complex system (like computers or manufacturing facilities)
Estate planning for minimal taxation and other specified goals.
Stock and bond portfolio selection and management
The design of very large scale integration (VLSI) systems
Numerous military applications ranging from battlefield assessment to ocean surveillance.
Numerous applications related to space planning and exploration
Numerous areas of law including civil case evaluation, product liability, assault and battery, and general assistance in locating different law precedents.
Planning curricula for students.
Teaching students specialized tasks (like trouble-shooting equipment faults)
Importance of Expert Systems
The value of expert systems was well established by the early 1980s. A number of successful applications had been completed by then and they proved to be cost effective. An example that illustrates this point well is the diagnostic system developed by the Campbell Soup Company.
Campbell Soup uses large sterilizers or cookers to cook soups and other canned products at eight plants located throughout the country. Some of the larger cookers hold up to 68,000 cans of food for short periods of cooking time. When difficult maintenance problems occur with the cookers, the fault must be found and corrected quickly or the batch of foods being prepared will spoil. Until recently, the company had been depending on a single expert to diagnose and cure the more difficult problems, flying him to the site when necessary. Since this individual will retire in a few years, taking his expertise with him, the company decided to develop an expert system to diagnose these difficult problems.
After some months of development with assistance from Texas Instruments, the company developed an expert system, which ran on a PC. The system has about 150 rules in its knowledge base to diagnose the more complex cooker problems. The system has also been used to provide training to new maintenance personnel. Cloning multiple copies for each of the eight locations cost the company only a few pennies per copy. Furthermore, the system cannot retire, and its performance can continue to be improved with the addition of more rules. It has already proven to be a real asset to the company. Similar cases now abound in many diverse organizations.
Representing and Using Domain Knowledge
Expert systems are complex AI programs. However, the most widely used way of representing domain knowledge in expert systems is as a set of production rules, which are often coupled with a frame system that defines the objects that occur in the rules. Let's look at a few additional examples drawn from some other representative expert systems. All the rules we show are English versions of the actual rules that the systems use. Differences among these rules illustrate some of the important differences in the ways that expert systems operate.
R1
R1 (sometimes also called XCON) is a program that configures DEC VAX systems. Its rules look like this:
If: The most current active context is distributing massbus devices, and
There is a single-port disk drive that has not been assigned to a massbus, and
The number of devices that each massbus should support is known, and
There is a massbus that has been assigned at least one disk drive and that should support additional disk drives, and
The type of cable needed to connect the disk drive to the previous device on the massbus is known
then
Assign the disk drive to the massbus.
Notice that R1's rules, unlike MYCIN's, contain no numeric measures of certainty. In the task domain with which R1 deals, it is possible to state exactly the correct thing to be done in each particular set of circumstances (although it may require a relatively complex set of antecedents to do so). One reason for this is that there exists a good deal of human expertise in this area. Another is that since R1 is doing a design task (in contrast to the diagnosis task performed by MYCIN), it is not necessary to consider all possible alternatives; one good one is enough. As a result, probabilistic information is not necessary in R1.
PROSPECTOR is a program that provides advice on mineral exploration. Its rules look like this:
If: Magnetite or pyrite in disseminated or veinlet form is present
then (2, -4) there is favourable mineralization and texture for the propylitic stage.
In PROSPECTOR, each rule contains two confidence estimates. The first indicates the extent to which the presence of the evidence described in the condition part of the rule suggests the validity of the rule's conclusion. In the PROSPECTOR rule shown above, the number 2 indicates that the presence of the evidence is mildly encouraging. The second-confidence estimate measures the extent to which the evidence is necessary to the validity of the conclusion, or stated another way, the extent to which the lack of the evidence indicates that the conclusion is not valid. In the example rule shown above, the number -4 indicates that the absence of the evidence is strongly discouraging for the conclusion.
DESIGN ADVISOR is a system that critiques chip designs. Its rules look like:
If The sequential level count of ELEMENT is greater than 2, UNLESS the signal of ELEMENT is resetable
then Critique for poor resetability
DEFEAT Poor resetability of ELEMENT
due to Sequential level count of ELEMENT greater than 2
by ELEMENT is directly resetable
The DESIGN ADVISOR gives advice to a chip designer, who can accept or reject the advice. If the advice is rejected, then the system can exploit a justification-based truth maintenance system to revise its model of the circuit. The first rule shown here says that an element should be criticized for poor resetability if its sequential level count is greater than two, unless its signal is currently believed to be resetable. Resetability is a fairly common condition, so it is mentioned explicitly in this first rule. But there is also a much less common condition, called direct resetability. The DESIGN ADVISOR does not even bother to consider that condition unless it gets in trouble with its advice. At that point, it can exploit the second of the rules shown above. Specifically, if the chip designer rejects a critique about resetability and if that critique was based on a high level count, then the system will attempt to discover (possibly by asking the designer) whether the element is directly resetable. If it is, then the original rule is defeated and the conclusion withdrawn.
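The interplay between the critique rule and its defeater can be sketched in Python as below. This is only an illustration of the control flow described above, not of a truth maintenance system; the function names `critique` and `reconsider`, the dictionary fields, and the `ask` callback are all hypothetical.

```python
def critique(element):
    """First rule: criticize a high level count UNLESS the signal is resetable."""
    if element["level_count"] > 2 and not element.get("resetable", False):
        return "poor resetability"
    return None


def reconsider(rejected_critique, ask):
    """Second rule: consulted only after the designer rejects the critique."""
    if rejected_critique == "poor resetability" and ask("Is ELEMENT directly resetable?"):
        return None  # the original rule is defeated and the conclusion withdrawn
    return rejected_critique


element = {"level_count": 3, "resetable": False}
advice = critique(element)
print(advice)                                # poor resetability
# The designer rejects the advice and confirms direct resetability.
print(reconsider(advice, lambda question: True))   # None
```

The less common condition is checked lazily: `ask` is never called unless the first piece of advice has already been rejected.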
Reasoning with the Knowledge
As these example rules have shown, expert systems exploit many of the representation and reasoning mechanisms that we have discussed. Because these programs are usually written primarily as rule-based systems, forward chaining, backward chaining, or some combination of the two is usually used. For example, MYCIN used backward chaining to discover what organisms were present; then it used forward chaining to reason from the organisms to a treatment regime. R1, on the other hand, used forward chaining. As the field of expert systems matures, more systems that exploit other kinds of reasoning mechanisms are being developed. The DESIGN ADVISOR is an example of such a system; in addition to exploiting rules, it makes extensive use of a justification-based truth maintenance system.
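For contrast with backward chaining, forward chaining can be sketched in Python as follows. The engine starts from the known facts and fires every rule whose conditions hold until nothing new can be added. The fact strings are loose paraphrases of the configuration rule shown earlier, and the second rule, like the function name `forward_chain`, is invented for this example.

```python
def forward_chain(facts, rules):
    """Fire rules from known facts until no rule adds anything new."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conclusion not in facts and all(c in facts for c in conditions):
                facts.add(conclusion)
                changed = True
    return facts


RULES = [
    (("disk drive unassigned", "massbus has spare capacity", "cable type known"),
     "disk drive assigned to massbus"),
    # Hypothetical follow-on rule, included only to show one conclusion
    # enabling another.
    (("disk drive assigned to massbus",), "massbus device count updated"),
]

start = {"disk drive unassigned", "massbus has spare capacity", "cable type known"}
derived = forward_chain(start, RULES) - start
print(sorted(derived))
```

No certainty factors appear anywhere: a rule either applies or it does not, which matches the exact, design-oriented character of such rules.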
Expert System Shells
Initially, each expert system that was built was created from scratch, usually in LISP. But, after several systems had been built this way, it became clear that these systems often had a lot in common. In particular, since the systems were constructed as a set of declarative representations (mostly rules) combined with an interpreter for those representations, it was possible to separate the interpreter from the domain-specific knowledge and thus to create a system that could be used to construct new expert systems by adding new knowledge corresponding to the new problem domain. The resulting interpreters are called shells. One influential example of such a shell is EMYCIN (for Empty MYCIN), which was derived from MYCIN.
There are now several commercially available shells that serve as the basis for many of the expert systems currently being built. These shells provide much greater flexibility in representing knowledge and in reasoning with it than MYCIN did. They typically support rules, frames, truth maintenance systems, and a variety of other reasoning mechanisms.
Early expert system shells provided mechanisms for knowledge representation, reasoning, and explanation. Later, tools for knowledge acquisition were added. Expert system shells needed to do something else as well. They needed to make it easy to integrate expert systems with other kinds of programs. Expert systems cannot operate in a vacuum, any more than their human counterparts can. They need access to corporate databases, and access to them needs to be controlled just as it does for other systems. They are often embedded within larger application programs that use primarily conventional programming techniques. So one of the important features that a shell must provide is an easy-to-use interface between an expert system that is written with the shell and a larger, probably more conventional, programming environment.
Student Activity 8.1
Before reading the next section, answer the following questions.
What problems does an expert system solve?
Discuss the characteristics and applications of Expert Systems.
Distinguish between RI and MYCIN.
If your answers are correct, then proceed to the next section.
Explanation
In order for an expert system to be an effective tool, people must be able to interact with it easily. To facilitate this interaction, the expert system must have the following two capabilities in addition to the ability to perform its underlying task:
1. Explain its reasoning. In many of the domains in which expert systems operate, people will not accept results unless they have been convinced of the accuracy of the reasoning process that produced those results. This is particularly true, for example, in medicine, where a doctor must accept ultimate responsibility for a diagnosis, even if that diagnosis was arrived at with considerable help from a program. Thus it is important that the reasoning process used in such programs proceed in understandable steps and that enough meta-knowledge (knowledge about the reasoning process) be available so the explanations of those steps can be generated.
2. Acquire new knowledge and modifications of old knowledge. Since expert systems derive their power from the richness of the knowledge bases they exploit, it is extremely important that those knowledge bases be as complete and as accurate as possible. But often there exists no standard codification of that knowledge; rather it exists only inside the heads of human experts. One way to get this knowledge into a program is through interaction with the human expert. Another way is to have the program learn expert behavior from raw data.
TEIRESIAS was the first program to support explanation and knowledge acquisition. TEIRESIAS served as a front-end for the MYCIN expert system. A fragment of a TEIRESIAS-MYCIN conversation with a user (a doctor) is shown in figure 8.4. The program has asked for a piece of information that it needs in order to continue its reasoning. The doctor wants to know why the program wants the information, and later asks how the program arrived at a conclusion that it claimed it had reached.
Is the patient’s illness with Organism-1 a hospital-acquired infection?
** Why
[i.e., Why is it important to determine whether or not the infection with Organism-1 was acquired while the patient was hospitalized?]
[2.0] This will aid in determining the category of
Organism-1. It has already been established that
[2.1] the gram stain of Organism-1 is gramneg, and
[2.2] the morphology of Organism-1 is rod, and
[2.3] the aerobicity of Organism-1 is facultative
Therefore, if
[2.4] the infection with Organism-1 was not acquired while the patient was hospitalized, then
There is strongly suggestive evidence (.8) that the category of Organism-1 is entero-bacteria.
[Also: there is weakly suggestive evidence (.1) that the identity of Organism-1 is pseudomonas-aeruginosa] [Rule037]
** How [2.3]
[i.e., How was it established that the aerobicity of Organism-1 is facultative?]
The following were used:
[3.1] Rule027 indicated there is strongly suggestive evidence (.8) that the aerobicity of Organism-1 is facultative.
[3.2] Rule033 indicated there is weakly suggestive evidence (.2) that the aerobicity of Organism-1 is not facultative.
Since this gave a cumulative CF of (.6) for facultative, the evidence indicates that the aerobicity of Organism-1 is facultative.
Figure 8.4: A Portion of a Dialogue with TEIRESIAS
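The arithmetic behind the HOW answer in the dialogue can be reproduced with a one-line calculation. The sketch below assumes that the cumulative figure is simply the evidence for a value minus the evidence against it; that assumption matches the (.6) shown in the dialogue, but the function name and the rounding are illustrative choices and the real program's bookkeeping may be more involved.

```python
def cumulative_cf(evidence_for, evidence_against):
    """Combine supporting and opposing evidence into one certainty factor.

    Assumption: CF = (evidence for) - (evidence against). Rounded to two
    places to avoid floating-point noise in the display.
    """
    return round(evidence_for - evidence_against, 2)

# Rule027 contributed .8 for "facultative"; Rule033 contributed .2 against it.
cf_facultative = cumulative_cf(0.8, 0.2)
print(cf_facultative)
```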
An important premise underlying TEIRESIAS's approach to explanation is that the behavior of a program can be explained simply by referring to a trace of the program's execution. There are ways in which this assumption limits the kinds of explanations that can be produced, but it does minimize the overhead involved in generating each explanation. To understand how TEIRESIAS generates explanations of MYCIN's behavior, we need to know how that behavior is structured.
MYCIN attempts to solve its goal of recommending a therapy for a particular patient by first finding the cause of the patient's illness. It uses its production rules to reason backward from goals to clinical observations. To solve the top-level diagnostic goal, it looks for rules whose right sides suggest diseases. It then uses the left sides of those rules (the preconditions) to set up subgoals whose success would enable the rules to be invoked. These subgoals are again matched against rules, and their preconditions are used to set up additional subgoals. Whenever a precondition describes a specific piece of clinical evidence, MYCIN uses that evidence if it already has access to it. Otherwise, it asks the user to provide the information. In order that MYCIN's requests for information will appear coherent to the user, the actual goals that MYCIN sets up are often more general than they need be to satisfy the preconditions of an individual rule. For example, if a precondition specifies that the identity of an organism is X, MYCIN will set up the goal "infer identity." This approach also means that if another rule mentions the organism's identity, no further work will be required, since the identity will be known.
We can now return to the trace of TEIRESIAS-MYCIN's behavior shown in Figure 8.4. The first question that the user asks is a "WHY" question, which is assumed to mean, "Why do you need to know that?" Particularly for clinical tests that are either expensive or dangerous, it is important for the doctor to be convinced that the information is really needed before ordering the test. (Requests for sensitive or confidential information present similar difficulties.) Because MYCIN is reasoning backward, the question can easily be answered by examining the goal tree. Doing so provides two kinds of information:
1. What higher-level question might the system be able to answer if it had the requested piece of information? (In this case, it could help determine the category of ORGANISM-1.)
2. What other information does the system already have that makes it think that the requested piece of knowledge would help? (In this case, facts [2.1] to [2.4].)
When TEIRESIAS provides the answer to the first of these questions, the user may be satisfied or may want to follow the reasoning process back even further. The user can do that by asking additional "WHY" questions.
When TEIRESIAS provides the answer to the second of these questions and tells the user what it already believes, the user may want to know the basis for those beliefs. The user can ask this with a "HOW" question, which TEIRESIAS will interpret as "How did you know that?" This question can also be answered by looking at the goal tree and chaining backward from the stated fact to the evidence that allowed a rule that determined the fact to fire. Thus we see that by reasoning backward from its top-level goal and by keeping track of the entire tree that it traverses in the process, TEIRESIAS-MYCIN can do a fairly good job of justifying its reasoning to a human user.
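The way WHY and HOW answers fall out of a backward-chaining trace can be sketched as follows. This is a simplified illustration, not the actual MYCIN or TEIRESIAS code: the rule names RuleA and RuleB, the parameter names, the Consultation class, and the dictionary of canned answers standing in for the user are all hypothetical, and certainty factors are left out.

```python
# Hypothetical rules; names and contents are illustrative only.
RULES = [
    {"name": "RuleA",
     "if": [("stain", "gramneg"), ("morphology", "rod"),
            ("hospital_acquired", "no")],
     "then": ("category", "enterobacteria")},
    {"name": "RuleB",
     "if": [("lab_stain_report", "pink")],
     "then": ("stain", "gramneg")},
]

class Consultation:
    def __init__(self, rules, answers):
        self.rules = rules
        self.answers = answers   # stands in for the user at the keyboard
        self.facts = {}          # parameter -> value
        self.source = {}         # parameter -> rule name or "user"
        self.stack = []          # rules being tried (path in the goal tree)
        self.why_log = {}        # parameter asked -> reason it was needed

    def find_out(self, param):
        """Backward chain on the general goal 'infer param'."""
        if param in self.facts:
            return self.facts[param]
        for rule in [r for r in self.rules if r["then"][0] == param]:
            self.stack.append(rule)
            ok = all(self.find_out(p) == v for p, v in rule["if"])
            self.stack.pop()
            if ok:
                self.facts[param] = rule["then"][1]
                self.source[param] = rule["name"]
                return self.facts[param]
        return self.ask(param)

    def ask(self, param):
        """No rule settled the goal, so ask the user; note the WHY answer."""
        if self.stack:
            rule = self.stack[-1]
            self.why_log[param] = (f"{rule['name']} needs {param} "
                                   f"to conclude {rule['then'][0]}")
        self.facts[param] = self.answers.get(param)
        self.source[param] = "user"
        return self.facts[param]

    def how(self, param):
        """Answer HOW by looking up what established the fact."""
        src = self.source.get(param)
        if src is None:
            return f"{param} is not known"
        if src == "user":
            return f"{param} was supplied by the user"
        rule = next(r for r in self.rules if r["name"] == src)
        used = ", ".join(p for p, _ in rule["if"])
        return f"{param} was concluded by {src} from {used}"

session = Consultation(RULES, {"lab_stain_report": "pink",
                               "morphology": "rod",
                               "hospital_acquired": "no"})
conclusion = session.find_out("category")
print(conclusion)
print(session.why_log["hospital_acquired"])
print(session.how("stain"))
```

The WHY answer is read off the stack of rules under consideration at the moment a question is asked, and the HOW answer is read off the record of which rule (or which user answer) established each fact, mirroring the two uses of the goal tree described above.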
The production system model is very general, and without some restrictions, it is hard to support all the kinds of explanations that a human might want. If we focus on a particular type of problem solving, we can ask more probing questions. For example, SALT is a knowledge acquisition program used to build expert systems that design artifacts through a propose-and-revise strategy. SALT is capable of answering questions like WHY-NOT ("why didn't you assign value x to this parameter?") and WHAT-IF ("what would happen if you did?"). A human might ask these questions in order to locate incorrect or missing knowledge in the system as a precursor to correcting it. We now turn to ways in which a program such as SALT can support the process of building and refining knowledge.
Student Activity 8.2
Before reading the next section, answer the following questions.
What is the role of Expert System shells?
What are the characteristics of a knowledge acquisition system?
Contrast expert systems and neural networks in terms of knowledge representation and knowledge acquisition. Give one domain in which the expert system approach would be more promising and one domain in which the neural network approach would be more promising.
If your answers are correct, then proceed to the next section.
Knowledge Acquisition
How are expert systems built? Typically, a knowledge engineer interviews a domain expert to elucidate expert knowledge, which is then translated into rules. After the initial system is built, it must be iteratively refined until it approximates expert-level performance. This process is expensive and time-consuming, so it is worthwhile to look for more automatic ways of constructing expert knowledge bases. While no totally automatic knowledge acquisition systems yet exist, there are many programs that interact with domain experts to extract expert knowledge efficiently. These programs provide support for the following activities:
1. Entering knowledge
2. Maintaining knowledge base consistency
3. Ensuring knowledge base completeness
The most useful knowledge acquisition programs are those that are restricted to a particular problem-solving paradigm, e.g., diagnosis or design. It is important to be able to enumerate the roles that knowledge can play in the problem-solving process. For example, if the paradigm is diagnosis, then the program can structure its knowledge base around symptoms, hypotheses, and causes. It can identify symptoms for which the expert has not yet provided causes. Since one symptom may have multiple causes, the program can ask for knowledge about how to decide when one hypothesis is better than another. If we move to another type of problem solving, say designing artifacts, then these acquisition strategies no longer apply, and we must look for other ways of profitably interacting with an expert. We now examine two knowledge acquisition systems in detail.
MOLE is a knowledge acquisition system for heuristic classification problems, such as diagnosing diseases. In particular, it is used in conjunction with the cover-and-differentiate problem-solving method. An expert system produced by MOLE accepts input data, comes up with a set of candidate explanations or classifications that cover (or explain) the data, then uses differentiating knowledge to determine which one is best. The process is iterative, since explanations must themselves be justified, until ultimate causes are ascertained.
MOLE interacts with a domain expert to produce a knowledge base that a system called MOLE-p (for MOLE-performance) uses to solve problems. The acquisition proceeds through several steps:
1. Initial knowledge base construction. MOLE asks the expert to list common symptoms or complaints that might require diagnosis. For each symptom, MOLE prompts for a list of possible explanations. MOLE then iteratively seeks out higher-level explanations until it comes up with a set of ultimate causes. Whenever an event has multiple explanations, MOLE tries to determine the conditions under which one explanation is correct. The expert provides covering knowledge, that is, the knowledge that a hypothesized event might be the cause of a certain symptom. MOLE then tries to infer anticipatory knowledge, which says that if the hypothesized event does occur, then the symptom will definitely appear. This knowledge allows the system to rule out certain hypotheses on the basis that specific symptoms are absent.
2. Refinement of the knowledge base. MOLE now tries to identify the weaknesses of the knowledge base. One approach is to find holes and prompt the expert to fill them. It is difficult, in general, to know whether a knowledge base is complete, so instead MOLE lets the expert watch MOLE-p solving sample problems. Whenever MOLE-p makes an incorrect diagnosis, the expert adds new knowledge. There are several ways in which MOLE-p can reach the wrong conclusion. It may incorrectly reject a hypothesis because it does not feel that the hypothesis is needed to explain any symptom. It may advance a hypothesis because it is needed to explain some otherwise inexplicable hypothesis. Or it may lack differentiating knowledge for choosing between alternative hypotheses.
For example, suppose we have a patient with symptoms A and B. Further suppose that symptom A could be caused by events X and Y, and that symptom B can be caused by Y and Z. MOLE-p might conclude Y, since it explains both A and B. If the expert indicates that this decision was incorrect, then MOLE will ask what evidence should be used to prefer X and/or Z over Y.
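The example with symptoms A and B can be traced with a small sketch of covering and differentiating knowledge. The covering table comes from the example itself; the function names, the "most symptoms explained" preference, the format of the differentiating knowledge, and the evidence item "test-1-positive" are assumptions made for illustration and are not MOLE's actual representation.

```python
# Covering knowledge from the example: symptom -> events that may cause it.
COVERS = {"A": {"X", "Y"}, "B": {"Y", "Z"}}

def candidate_explanations(symptoms, covers):
    """Return every event that covers at least one observed symptom."""
    events = set()
    for symptom in symptoms:
        events |= covers[symptom]
    return events

def best_single_cover(symptoms, covers):
    """Prefer the event that explains the most symptoms (assumed default)."""
    events = candidate_explanations(symptoms, covers)

    def score(event):
        return sum(event in covers[s] for s in symptoms)

    return max(sorted(events), key=score)

def differentiate(candidates, preferences, evidence):
    """Apply differentiating knowledge supplied by the expert.

    preferences: list of (evidence_item, preferred_event, rejected_event).
    """
    remaining = set(candidates)
    for item, preferred, rejected in preferences:
        if item in evidence and preferred in remaining:
            remaining.discard(rejected)
    return remaining

first_guess = best_single_cover(["A", "B"], COVERS)

# After the expert rejects Y, new (hypothetical) differentiating knowledge:
preferences = [("test-1-positive", "X", "Y"), ("test-1-positive", "Z", "Y")]
revised = differentiate(candidate_explanations(["A", "B"], COVERS),
                        preferences, evidence={"test-1-positive"})
print(first_guess, sorted(revised))
```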
MOLE has been used to build systems that diagnose problems with car engines, problems in steel-rolling mills, and inefficiencies in coal-burning power plants. For MOLE to be applicable, however, it must be possible to pre-enumerate solutions or classifications. It must also be practical to encode the knowledge in terms of covering and differentiating.
But suppose our task is to design an artifact, for example, an elevator system. It is no longer possible to pre-enumerate all solutions. Instead, we must assign values to a large number of parameters, such as the width of the platform, the type of door, the cable weight, and the cable strength. These parameters must be consistent with each other, and they must result in a design that satisfies external constraints imposed by cost factors, the type of building involved, and expected payloads.
One problem-solving method useful for design tasks is called propose-and-revise. Propose-and-revise systems build up solutions incrementally. First, the system proposes an extension to the current design. Then it checks whether the extension violates any global or local constraints. Constraint violations are then fixed, and the process repeats. It turns out that domain experts are good at listing overall design constraints and at providing local constraints on individual parameters, but not so good at explaining how to arrive at global solutions. The SALT program provides mechanisms for elucidating this knowledge from the expert.
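The propose, check, and fix cycle can be sketched as a short loop. Everything specific here is hypothetical: the parameter names, the numbers, the safety-factor constraint, and the fix of adding 500 to the cable strength are invented for illustration and do not come from SALT or from any real elevator design system.

```python
def propose_and_revise(steps, constraints, max_rounds=10):
    """Build a design incrementally.

    steps: list of (parameter, propose_fn) applied in order.
    constraints: list of (name, check_fn, fix_fn).
    """
    design = {}
    for param, propose in steps:
        design[param] = propose(design)          # propose an extension
        for _ in range(max_rounds):              # then check and revise
            violated = [(name, fix) for name, check, fix in constraints
                        if not check(design)]
            if not violated:
                break
            _, fix = violated[0]
            fix(design)                          # apply a suggested revision
        else:
            raise RuntimeError(f"could not repair design after {param}")
    return design

# Hypothetical design steps for an elevator.
steps = [
    ("payload_kg", lambda d: 1000),
    ("platform_kg", lambda d: 0.3 * d["payload_kg"]),
    ("cable_strength_kg", lambda d: 1000),
]

def cable_ok(d):
    """Hypothetical constraint: cable must hold twice the moving load."""
    if "cable_strength_kg" not in d:
        return True                              # not applicable yet
    return d["cable_strength_kg"] >= 2 * (d["payload_kg"] + d["platform_kg"])

def upgrade_cable(d):
    """Hypothetical revision: move to the next stronger cable."""
    d["cable_strength_kg"] += 500

design = propose_and_revise(
    steps, [("cable safety factor", cable_ok, upgrade_cable)])
print(design)
```

The local knowledge (how to propose each value, how to fix each violation) is supplied piece by piece, while the loop itself supplies the global strategy that experts find hard to articulate.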
Like MOLE, SALT builds a dependency network as it converses with the expert. Each node stands for a value of a parameter that must be acquired or generated. There are three kinds of links: contributes-to, constrains, and suggests-revision-of. Associated with the first type of link are procedures that allow SALT to generate a value for one parameter based on the value of another. The second type of link, constrains, rules out certain parameter values. The third link, suggests-revision-of, points to ways in which a constraint violation can be fixed. SALT uses the following heuristics to guide the acquisition process:
1. Every noninput node in the network needs at least one contributes-to link coming into it. If links are missing, the expert is prompted to fill them in.
2. No contributes-to loops are allowed in the network. Without a value for at least one parameter in the loop, it is impossible to compute values for any parameter in that loop. If a loop exists, SALT tries to transform one of the contributes-to links into a constraint link.
3. Constraining links should have suggests-revision-of links associated with them. These include constrains links that are created when dependency loops are broken.
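The three heuristics can be read as checks over the dependency network. The sketch below is an assumption-laden illustration: the link encoding as (kind, source, target) triples, the convention that a suggests-revision-of link starts at the constrained parameter, the function name check_network, and the sample elevator parameters are all hypothetical rather than SALT's own data structures.

```python
def check_network(nodes, inputs, links):
    """Return the gaps an expert would be prompted to fill.

    links: list of (kind, source, target) with kind one of
    "contributes-to", "constrains", "suggests-revision-of".
    """
    problems = []
    contributes = [(s, t) for k, s, t in links if k == "contributes-to"]

    # Heuristic 1: every non-input node needs an incoming contributes-to link.
    targets = {t for _, t in contributes}
    for node in sorted(nodes - inputs - targets):
        problems.append(f"no contributes-to link into {node}")

    # Heuristic 2: no contributes-to loops (search for a path back to start).
    graph = {n: [t for s, t in contributes if s == n] for n in nodes}

    def on_cycle(start):
        stack, seen = list(graph.get(start, [])), set()
        while stack:
            node = stack.pop()
            if node == start:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(graph.get(node, []))
        return False

    for node in sorted(nodes):
        if on_cycle(node):
            problems.append(f"contributes-to loop through {node}")

    # Heuristic 3: each constrained parameter should have a suggested revision.
    revisable = {s for k, s, _ in links if k == "suggests-revision-of"}
    for kind, _, target in links:
        if kind == "constrains" and target not in revisable:
            problems.append(
                f"constraint on {target} has no suggests-revision-of link")
    return problems

nodes = {"payload", "platform_width", "cable_weight", "cable_strength"}
links = [
    ("contributes-to", "payload", "platform_width"),
    ("contributes-to", "platform_width", "cable_weight"),
    ("constrains", "cable_weight", "cable_strength"),
]
problems = check_network(nodes, inputs={"payload"}, links=links)
for problem in problems:
    print(problem)
```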
Control knowledge is also important. It is critical that the system propose extensions and revisions that lead toward a design solution. SALT allows the expert to rate revisions in terms of how much trouble they tend to produce.
SALT compiles its dependency network into a set of production rules. As with MOLE, an expert can watch the production system solve problems and can override the system's decision. At that point, the knowledge base can be changed or the override can be logged for future inspection.
The process of interviewing a human expert to extract expertise presents a number of difficulties, regardless of whether the interview is conducted by a human or by a machine. Experts are surprisingly inarticulate when it comes to how they solve problems. They do not seem to have access to the low-level details of what they do and are especially inadequate suppliers of any type of statistical information. There is, therefore, a great deal of interest in building systems that automatically induce their own rules by looking at sample problems and solutions. With inductive techniques, an expert needs only to provide the conceptual framework for a problem and a set of useful examples.
For example, consider a bank's problem in deciding whether to approve a loan. One approach to automating this task is to interview loan officers in an attempt to extract their domain knowledge. Another approach is to inspect the record of loans the bank has made in the past and then try to generate automatically rules that will maximize the number of good loans and minimize the number of bad ones in the future.
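The second approach, generating a rule from past loan records, can be illustrated with a deliberately tiny sketch. The records below are made up purely for illustration, and the technique shown (choosing the best threshold on a single attribute) is a stand-in for real induction methods, chosen only because it fits in a few lines; the field names income and repaid are hypothetical.

```python
def induce_threshold_rule(records, attribute):
    """Find the threshold t such that 'approve if attribute >= t'
    agrees with the most past outcomes. Returns (threshold, n_correct)."""
    best = None
    for t in sorted({r[attribute] for r in records}):
        correct = sum((r[attribute] >= t) == r["repaid"] for r in records)
        if best is None or correct > best[1]:
            best = (t, correct)
    return best

# Made-up loan history (income in thousands; repaid is the outcome).
history = [
    {"income": 20, "repaid": False},
    {"income": 25, "repaid": False},
    {"income": 30, "repaid": True},
    {"income": 35, "repaid": False},
    {"income": 40, "repaid": True},
    {"income": 50, "repaid": True},
]

threshold, n_correct = induce_threshold_rule(history, "income")
print(f"approve if income >= {threshold} ({n_correct} of {len(history)} correct)")
```

Even this toy rule misclassifies one past loan, which hints at why practical systems induce richer structures over several attributes.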
META-DENDRAL was the first program to use learning techniques to construct rules for an expert system automatically. It built rules to be used by DENDRAL, whose job was to determine the structure of complex chemical compounds. META-DENDRAL was able to induce its rules based on a set of mass spectrometry data; it was then able to identify molecular structures with very high accuracy. META-DENDRAL used the version space learning algorithm. Another popular method for automatically constructing expert systems is the induction of decision trees. Decision tree expert systems have been built for assessing consumer credit applications, analyzing hypothyroid conditions, and diagnosing soybean diseases, among many other applications.
Statistical techniques, such as multivariate analysis, provide an alternative approach to building expert-level systems. Unfortunately, statistical methods do not produce concise rules that humans can understand. Therefore it is difficult for them to explain their decisions.
For highly structured problems that require deep causal chains of reasoning, learning techniques are presently inadequate. There is, however, a great deal of research activity in this area.
Summary
Expert systems use symbolic representations for knowledge (rules, networks, or frames) and perform their inference through symbolic computations that closely resemble manipulations of natural language. An expert system is usually built with the aid of one or more experts, who must be willing to spend a great deal of effort transferring their expertise to the system.
Expert systems are complex AI programs. However, the most widely used way of representing domain knowledge in expert systems is as a set of production rules, which are often coupled with a frame system that defines the objects that occur in the rules.
The most useful knowledge acquisition programs are those that are restricted to a particular problem-solving paradigm, e.g., diagnosis or design.
Transfer of knowledge takes place gradually through many interactions between the expert and the system. The expert will never get the knowledge right or complete the first time.
The amount of knowledge that is required depends on the task. It may range from forty rules to thousands.
The choice of control structure for a particular system depends on specific characteristics of the system.
It is possible to extract the nondomain-specific parts from existing expert systems and use them as tools for building new systems in new domains.
MYCIN is an expert system that diagnoses infectious blood diseases and determines a recommended list of therapies for the patient.
RI (sometimes also called XCON) is a program that configures DEC VAX systems.
Characteristic Features of Expert Systems
Expert systems differ from conventional computer systems in several important ways.
1. Expert systems use knowledge rather than data to control the solution process. “In the knowledge lies the power” is a theme repeatedly followed and supported through this book. Much of the knowledge used is heuristic in nature rather than algorithmic.
2. The knowledge is encoded and maintained as an entity separate from the control program. As such, it is not compiled together with the control program itself. This permits the incremental addition and modification (refinement) of the knowledge base without recompilation of the control programs. Furthermore, it is possible in some cases to use different knowledge bases with the same control programs to produce different types of expert systems. Such systems are known as expert system shells, as they may be loaded with different knowledge bases.
3. Expert systems are capable of explaining how a particular conclusion was reached, and why requested information is needed during a consultation. This is important as it gives the user a chance to assess and understand the system’s reasoning ability, thereby improving the user’s confidence in the system.
4. Expert systems use symbolic representations for knowledge (rules, networks, or frames) and perform their inference through symbolic computations that closely resemble manipulations of natural language. (An exception to this is the expert system based on neural network architectures.)
5. Expert systems often reason with metaknowledge (knowledge about knowledge); that is, they reason with knowledge about themselves and about the limits of their own knowledge and capabilities.
MYCIN
The development of MYCIN began at Stanford University. MYCIN is an expert system that diagnoses infectious blood diseases and determines a recommended list of therapies for the patient. As part of the Heuristic Programming Project at Stanford, several projects directly related to MYCIN were also completed, including a knowledge acquisition component called TEIRESIAS, a tutorial component called GUIDON, and a shell component called EMYCIN (for Essential MYCIN). EMYCIN was used to build other diagnostic systems including PUFF, a diagnostic expert for pulmonary diseases. EMYCIN also became the design model for several commercial expert system building tools.
MYCIN’s performance improved significantly over a period of several years as additional knowledge was added. Tests indicate that MYCIN’s performance now equals or exceeds that of experienced physicians. The initial MYCIN knowledge base contained only about 200 rules. This number was gradually increased to more than 600 rules by the early 1980s. The added rules significantly improved MYCIN’s performance, leading to a 65% success record that compared favorably with experienced physicians, who demonstrated only an average 60% success rate.
Subgoaling in MYCIN
MYCIN is a heterogeneous program, consisting of many different modules. There is a part of MYCIN's control structure that performs a quasi-diagnostic function. But the goals to be achieved are not physical goals, involving the movement of objects in space, but reasoning goals that involve the establishment of diagnostic hypotheses.
This section concentrates upon the diagnostic module of MYCIN, giving a simplified account of its function, structure and runtime behavior.
Treating blood infections
First, we need to give a brief description of MYCIN's domain: treatment of blood infections. This description presupposes no specialized medical knowledge on the part of the reader. But, as with any expert system, having some understanding of the domain is crucial to understanding what the program does.
An 'anti-microbial agent' is any drug designed to kill bacteria or arrest their growth. Some agents are too toxic for therapeutic purposes, and there is no single agent effective against all bacteria. The selection of therapy for bacterial infection can be viewed as a four-part decision process:
Deciding if the patient has a significant infection;
Determining the (possible) organism(s) involved;
Selecting a set of drugs that might be appropriate;
Choosing the most appropriate drug or combination of drugs.
Samples taken from the site of infection are sent to a microbiology laboratory for culture, that is, an attempt to grow organisms from the sample in a suitable medium.
Early evidence of growth may allow a report of the morphological or staining characteristics of the organism. However, even if an organism is identified, the range of drugs it is sensitive to may be unknown or uncertain.
MYCIN is often described as a diagnostic program, but this is not so. Its purpose is to assist a physician who is not an expert in the field of antibiotics with the treatment of blood infections. In doing so, it develops diagnostic hypotheses and weights them, but it need not necessarily choose between them. Work on MYCIN began in 1972 as a collaboration between the medical and AI communities at Stanford University. The most complete single account of this work is Shortliffe (1976).
There have been a number of extensions, revisions and abstractions of MYCIN since 1976, but the basic version has five components. Figure 8.1 shows the basic pattern of information flow between the modules.
(1) A knowledge base which contains factual and judgmental knowledge about the domain.
(2) A dynamic patient database containing information about a particular case.
(3) A consultation program, which asks questions, draws conclusions, and gives advice about a particular case based on the patient data and the static knowledge.
(4) An explanation program, which answers questions and justifies its advice, using static knowledge and a trace of the program’s execution.
(5) A knowledge acquisition program for adding new rules and changing existing ones.
The system consisting of components (1)-(3) is the problem-solving part of MYCIN, which generates hypotheses with respect to the offending organisms, and makes therapy recommendations based on these hypotheses.
Figure 8.1: Organization of MYCIN
MYCIN's knowledge base
MYCIN's knowledge base is organized around a set of rules of the general form
if condition_1 and ... and condition_m hold
then draw conclusion_1 and ... and conclusion_n
encoded as data structures of the LISP programming language.
Figure 8.2 shows the English translation of a typical MYCIN rule for inferring the class of an organism. This translation is provided by the program itself. Such rules are called ORGRULES and they attempt to cover such organisms as streptococcus, pseudomonas, and entero-bacteria.
The rule says that if an isolated organism appears rod-shaped, stains in a certain way, and grows in the presence of oxygen, then it is more likely to be in the class entero-bacteria. The number 0.8 is called the tally of the rule, which says how certain the conclusion is, given that the conditions are satisfied. The use of the tally is explained below. Each rule of this kind can be thought of as encoding a piece of human knowledge whose applicability depends only upon the context established by the conditions of the rule.
The conditions of a rule can also be satisfied with varying degrees of certainty. The import of such rules is roughly as follows:
if condition_1 holds with certainty x_1 ... and condition_m holds with certainty x_m
then draw conclusion_1 with certainty y_1 and ... and conclusion_n with certainty y_n
where the certainty associated with each conclusion is a function of the combined certainties of the conditions and the tally, which is meant to reflect our degree of confidence in the application of the rule.
In summary, a rule is a premise-action pair, and such rules are sometimes called 'productions' for purely historical reasons. Premises are conjunctions of conditions, and their certainty is a function of the certainty of these conditions. Conditions are either propositions, which evaluate to truth or falsehood with some degree of certainty (for example, 'the organism is rod-shaped'), or disjunctions of such conditions. Actions are either conclusions to be drawn with some appropriate degree of certainty, for example the identity of some organism, or instructions to be carried out, for example compiling a list of therapies.
We will explore the details of how rules are interpreted and scheduled for application in the following sections, but first we must look at MYCIN's other structures for representing medical knowledge.
IF 1) The stain of the organism is gramneg, and
2) The morphology of the organism is rod, and
3) The aerobicity of the organism is aerobic
THEN There is strongly suggestive evidence (.8) that
the class of the organism is entero-bacteria
A MYCIN ORGRULE for drawing the conclusion Enterobacteriaceae
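A rule of this kind can be held as a plain data structure and translated into English on demand. The sketch below uses Python rather than LISP, and the dictionary layout, the key names, and the translate function are assumptions made for illustration; only the rule's content (three conditions, one conclusion, tally 0.8) is taken from the ORGRULE above.

```python
# The ORGRULE above as data. The layout is an illustrative assumption.
ORGRULE = {
    "premise": [("stain", "gramneg"),
                ("morphology", "rod"),
                ("aerobicity", "aerobic")],
    "action": ("class", "entero-bacteria"),
    "tally": 0.8,
}

def translate(rule):
    """Render a rule as numbered English conditions and a conclusion."""
    lines = []
    for i, (param, value) in enumerate(rule["premise"], start=1):
        prefix = "IF " if i == 1 else "   "
        lines.append(f"{prefix}{i}) The {param} of the organism is {value}, and")
    lines[-1] = lines[-1][:-len(", and")]      # last condition ends the list
    param, value = rule["action"]
    lines.append(f"THEN There is evidence ({rule['tally']}) that "
                 f"the {param} of the organism is {value}")
    return "\n".join(lines)

print(translate(ORGRULE))
```

Because the rule is data rather than code, the same structure can be inspected by the interpreter, by an explanation routine, and by a knowledge acquisition tool.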
In addition to rules, the knowledge base also stores facts and definitions in various forms:
simple lists, for example the list of all organisms known to the system;
knowledge tables, which contain records of certain clinical parameters and the values they take under various circumstances, for example the morphology (structural shape) of every bacterium known to the system;
a classification system for clinical parameters according to the context in which they apply, for example whether they are attributes of patients or organisms.
Much of the knowledge not contained in the rules resides in the properties associated with the 65 clinical parameters known to MYCIN. For example, shape is an attribute of organisms which can take on various values, such as 'rod' and 'coccus.' Parameters are also assigned properties by the system for its own purposes. The main ones either (i) help to monitor the interaction with the user, or (ii) provide indexes that guide the application of rules.
Patient information is stored in a structure called the context tree, which serves to organize case data. The figure below shows a context tree representing a particular patient, PATIENT-1, with three associated cultures (samples, such as blood samples, from which organisms may be isolated) and a recent operative procedure that may need to be taken into account (for example, because drugs were involved, or because the procedure involves particular risks of infection). Associated with cultures are organisms that are suggested by laboratory data, and associated with organisms are drugs that are effective against them.
Imagine that we have the following data stored in a record structure associated with the node for ORGANISM-1:
GRAM = (GRAMNEG 1.0)
MORPH = (ROD .8) (COCCUS .2)
AIR = (AEROBIC .6)
with the following meaning:
the Gram stain of ORGANISM-1 is definitely Gram negative;
ORGANISM-1 has a rod morphology with certainty 0.8 and a coccus morphology with certainty 0.2;
ORGANISM-1 is aerobic (grows in air) with certainty 0.6.
Figure 8.2: A typical MYCIN context tree
Suppose now that the rule given above is applied. We want to compute the certainty that all three conditions of the rule
IF 1) the stain of the organism is gramneg, and
2) the morphology of the organism is rod, and
3) the aerobicity of the organism is aerobic
THEN there is strongly suggestive evidence (0.8) that the class of the organism is entero-bacteria.
are satisfied by the data. The certainty of the individual conditions is 1.0, 0.8 and 0.6 respectively, and the certainty of their conjunction is taken to be the minimum of their individual certainties, hence 0.6.
The idea behind taking the minimum is that we are only confident in a conjunction of conditions to the extent that we are confident in its least inspiring element. This is rather like saying that a chain is only as strong as its weakest link. By an inverse argument, we argue that our confidence in a disjunction of conditions is as strong as the strongest alternative, that is, we take the maximum. This convention forms part of a style of inexact reasoning called fuzzy logic.
In this case, we draw the conclusion that the class of the organism is entero-bacteria with a degree of certainty equal to
0.6 x 0.8 = 0.48
The 0.6 represents our degree of certainty in the conjoined conditions, while the 0.8 stands for our degree of certainty in the rule application. These degrees of certainty are called certainty factors (CFs). Thus, in the general case,
CF(action) = CF(premise) x CF(rule).
This raises the whole topic of how to represent uncertainty. It turns out that the CF model is not always in agreement with the theory of probability; in other words, it is not always correct from a mathematical point of view. However, the computation of certainty factors is much more tractable than the computation of the right probabilities, and the deviation does not appear to be very great in the MYCIN application.
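The arithmetic above can be traced in a few lines of Python. This is only an illustrative sketch: the dictionary layout and the function names are assumptions made for the example, not MYCIN's actual data structures.

```python
# Record for ORGANISM-1, mirroring the GRAM / MORPH / AIR data given above.
# The dictionary layout is an assumption made for illustration.
organism_1 = {
    "GRAM": {"GRAMNEG": 1.0},
    "MORPH": {"ROD": 0.8, "COCCUS": 0.2},
    "AIR": {"AEROBIC": 0.6},
}


def conjunction_cf(cfs):
    """Certainty of a conjunction: the minimum of the individual certainties."""
    return min(cfs)


def disjunction_cf(cfs):
    """Certainty of a disjunction: the maximum of the individual certainties."""
    return max(cfs)


def conclusion_cf(premise_cf, rule_cf):
    """CF(action) = CF(premise) x CF(rule)."""
    return premise_cf * rule_cf


premise = conjunction_cf([
    organism_1["GRAM"]["GRAMNEG"],   # 1.0
    organism_1["MORPH"]["ROD"],      # 0.8
    organism_1["AIR"]["AEROBIC"],    # 0.6
])
result = conclusion_cf(premise, 0.8)  # 0.8 is the CF attached to the rule
print(round(result, 2))
```

The premise evaluates to 0.6, the weakest of the three conditions, and multiplying by the rule's own 0.8 gives the 0.48 quoted in the text.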
MYCIN’s control structure
MYCIN has a top-level goal rule that defines the whole task of the consultation system. It is paraphrased below:
IF 1) there is an organism which requires therapy and
2) consideration has been given to any other organisms requiring therapy
THEN compile a list of possible therapies, and determine the best one in this list.
A consultation session follows a simple two-step procedure:
• create the patient context as the top node in the context tree;
• attempt to apply the goal rule to this patient context.
Applying the rule involves evaluating its premise, which involves finding out if there is indeed an organism which requires therapy. In order to find this out, it must first find out if there is indeed an organism present which is associated with a significant disease. This information can either be obtained from the user directly, or via some chain of inference based on symptoms and laboratory data provided by the user.
The consultation is essentially a search through a tree of goals. The top goal at the root of the tree is the action part of the goal rule, that is, the recommendation of a drug therapy. Subgoals further down the tree include determining the organism involved and seeing if it is significant. Many of these subgoals have subgoals of their own, such as finding out the stain properties and morphology of an organism. The leaves of the tree are fact goals, such as laboratory data, which cannot be deduced.
A special kind of structure, called an AND/OR tree, is very useful for representing the way in which goals can be expanded into subgoals by a program. The basic idea is that root node of the tree represents the main goal, terminal nodes represent primitive actions that can be carried out, while non-terminal nodes represent subgoals that are susceptible to further analysis. There is a simple correspondence between this kind of analysis and the analysis of rule sets.
Consider the following set of condition-action rules:
if X has BADGE and X has GUN, then X is POLICE
if X has REVOLVER or X has PISTOL or X has RIFLE, then X has GUN
if X has SHIELD, then X has BADGE
We can represent this rule set in terms of a tree of goals, so long as we maintain the distinction between conjunctions and disjunctions of subgoals. Thus, we draw an arc between the links connecting the nodes BADGE and GUN with the node POLICE, to signify that both subgoals BADGE and GUN must be satisfied in order to satisfy the goal POLICE. However, there is no arc between the links connecting REVOLVER, PISTOL and RIFLE with GUN, because satisfying any one of these will satisfy GUN. Subgoals such as BADGE can have a single child, SHIELD, signifying that a shield counts as a badge.
The AND/OR tree in Figure 8.3 can be thought of as a way of representing the search space for POLICE, by enumerating the ways in which different operators can be applied in order to establish POLICE as true.
Figure 8.3: Representing a rule set as an AND/OR tree
This kind of control structure is called backward chaining, since the program reasons backward from what it wants to prove towards the facts that it needs, rather than reasoning forward from the facts that it possesses. In MYCIN, goals were achieved by breaking them down into subgoals to which operators could be applied. Searching for a solution by backward reasoning is generally more focused than forward chaining, as we saw earlier, since one only considers potentially relevant facts.
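Backward chaining over the POLICE rule set can be sketched as a short recursive procedure. The encoding below is an assumption chosen for the example: each goal maps to a list of alternatives (the OR branches), and each alternative is a list of subgoals that must all hold (the AND branches).

```python
# AND/OR encoding of the three condition-action rules above.
# Outer list: alternative ways to establish the goal (OR).
# Inner list: subgoals that must all be satisfied (AND).
RULES = {
    "POLICE": [["BADGE", "GUN"]],
    "GUN": [["REVOLVER"], ["PISTOL"], ["RIFLE"]],
    "BADGE": [["SHIELD"]],
}


def prove(goal, facts):
    """Reason backward from the goal toward the known facts."""
    if goal in facts:
        return True  # leaf: the fact was supplied directly
    for alternative in RULES.get(goal, []):
        if all(prove(subgoal, facts) for subgoal in alternative):
            return True  # one OR branch succeeded
    return False


print(prove("POLICE", {"SHIELD", "RIFLE"}))
print(prove("POLICE", {"SHIELD"}))
```

With a shield and a rifle the goal POLICE succeeds; with only a shield it fails, because no alternative for GUN can be satisfied. Note that the procedure only ever looks at facts that bear on the goal, which is the sense in which backward reasoning is focused.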
MYCIN's control structure uses an AND/OR tree, and is quite simple as AI programs go:
(1) Each subgoal set up is always a generalized form of the original goal. So, if the subgoal is to prove the proposition that the identity of the organism is E. Coli, then the subgoal actually set up is to determine the identity of the organism. This initiates an exhaustive search on a given topic, which collects all of the available evidence about organisms.
(2) Every rule relevant to the goal is used, unless one of them succeeds with certainty. If more than one rule suggests a conclusion about a parameter, such as the nature of the organism, then their results are combined. If the evidence about a hypothesis falls between -0.2 and +0.2, it is regarded as inconclusive, and the answer is treated as unknown.
(3) If the current subgoal is a leaf node, then attempt to satisfy the goal by asking the user for data. Else set up the subgoal for further inference, and go to (1).
The selection of therapy takes place after this diagnostic process has run its course. It consists of two phases: selecting candidate drugs, and then choosing a preferred drug, or combination of drugs, from this list.
Evidence Combination
In MYCIN, two or more rules might draw conclusions about a parameter with different weights of evidence. Thus one rule might conclude that the organism is E. Coli with a certainty of 0.8, while another might conclude from other data that it is E. Coli with a certainty of 0.5 or -0.8. In the case of a certainty less than zero, the evidence is actually against the hypothesis.
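Let X and Y be the weights derived from the application of different rules. MYCIN combines these weights using the following formula to yield a single certainty factor:
CF(X, Y) = X + Y - XY, if X > 0 and Y > 0
CF(X, Y) = X + Y + XY, if X < 0 and Y < 0
CF(X, Y) = (X + Y) / (1 - min(|X|, |Y|)), otherwise
where |X| denotes the absolute value of X.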
One can see what is happening on an intuitive basis. If the two pieces of evidence both confirm (or disconfirm) the hypothesis, then confidence in the hypothesis goes up (or down). If the two pieces of evidence are in conflict, then the denominator dampens the effect.
This formula can be applied more than once, if several rules draw conclusions about the same parameter. It is commutative, so it does not matter in what order weights are combined.
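The combination step can be written as a small function. The sketch below assumes the three-case combining function usually quoted for MYCIN-style certainty factors: X + Y - XY when both weights are positive, X + Y + XY when both are negative, and (X + Y) / (1 - min(|X|, |Y|)) when they conflict. The function name is hypothetical.

```python
def combine(x, y):
    """Combine two certainty factors, each in the range -1.0 to 1.0.

    Assumed formula (see lead-in). The case x = 1, y = -1 (or the reverse)
    is undefined and raises ZeroDivisionError.
    """
    if x > 0 and y > 0:
        return x + y - x * y          # both confirm: confidence rises
    if x < 0 and y < 0:
        return x + y + x * y          # both disconfirm: confidence falls
    return (x + y) / (1 - min(abs(x), abs(y)))  # conflict: dampened


both_confirm = combine(0.8, 0.5)    # two rules in favour of E. Coli
in_conflict = combine(0.8, -0.8)    # equal evidence for and against
print(round(both_confirm, 2), round(in_conflict, 2))
```

Two confirming rules with weights 0.8 and 0.5 combine to 0.9, more than either alone but still below 1. Weights of 0.8 and -0.8 cancel to 0, which would fall inside the -0.2 to +0.2 band that MYCIN treats as inconclusive.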
IF the identity of the organism is pseudomonas
THEN I recommend therapy from among the following drugs:
1 COLISTIN (.98)
2 POLYMYXIN (.96)
3 GENTAMICIN (.96)
4 CARBENICILLIN (.65)
5 SULFISOXAZOLE (.64)
A MYCIN therapy rule
The special goal rule at the top of the AND/OR tree does not lead to a conclusion, but instigates actions, assuming that the conditions in the premise are satisfied. At this point, MYCIN's therapy rules for selecting drug treatments come into play; they contain sensitivity information for the various organisms known to the system. A sample therapy rule is given above.
The numbers associated with each drug are the probabilities that a pseudomonas will be sensitive to the indicated drug according to medical statistics. The preferred drug is selected from the list according to criteria that attempt to screen for contra-indications of the drug and minimize the number of drugs administered, in addition to maximizing sensitivity. The user can go on asking for alternative therapies until MYCIN runs out of options, so the pronouncements of the program are not definitive.
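The selection step can be illustrated with a deliberately simplified sketch. It covers only two of the criteria mentioned (screening contra-indications and maximizing sensitivity) and ignores drug combinations. The function name and the set of contra-indicated drugs are hypothetical, and this is not a description of MYCIN's actual selection algorithm.

```python
# Sensitivity figures from the sample therapy rule for pseudomonas.
SENSITIVITY = [
    ("COLISTIN", 0.98),
    ("POLYMYXIN", 0.96),
    ("GENTAMICIN", 0.96),
    ("CARBENICILLIN", 0.65),
    ("SULFISOXAZOLE", 0.64),
]


def therapy_options(sensitivities, contraindicated):
    """Yield acceptable drugs one at a time, most sensitive first."""
    ranked = sorted(sensitivities, key=lambda pair: pair[1], reverse=True)
    for drug, _score in ranked:
        if drug not in contraindicated:
            yield drug


# Hypothetical patient for whom colistin is contra-indicated.
options = list(therapy_options(SENSITIVITY, {"COLISTIN"}))
print(options)
```

Because the function is a generator, a caller can keep asking for the next alternative until the list is exhausted, which mirrors the way the user can go on requesting alternative therapies.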
Applications of Expert Systems
Since the introduction of these early expert systems, the range and depth of applications has broadened dramatically. Applications can now be found in almost all areas of business and government. They include such areas as:
Different types of medical diagnoses (internal medicine, pulmonary diseases, infectious, blood diseases, and so on)
Diagnosis of complex electronic and electromechanical systems
Diagnosis of diesel electric locomotion systems
Diagnosis of software development projects.
Planning experiments in biology, chemistry, and molecular genetics
Forecasting crop damage
Identification of chemical compound structures and chemical compounds.
Location of faults in computer and communications systems
Scheduling of customer orders, job shop production operations, computer resources for operating systems, and various manufacturing tasks.
Evaluation of loan applicants for lending institutions
Assessment of geologic structures from dip meter logs.
Analysis of structural systems for design or as a result of earthquake damage
The optimal configuration of components to meet given specifications for a complex system (like computers or manufacturing facilities)
Estate planning for minimal taxation and other specified goals.
Stock and bond portfolio selection and management
The design of very large scale integration (VLSI) systems
Numerous military applications ranging from battlefield assessment to ocean surveillance.
Numerous applications related to space planning and exploration
Numerous areas of law including civil case evaluation, product liability, assault and battery, and general assistance in locating different law precedents.
Planning curricula for students.
Teaching students specialized tasks (like trouble-shooting equipment faults)
Importance of Expert Systems
The value of expert systems was well established by the early 1980s. A number of successful applications had been completed by then and they proved to be cost effective. An example that illustrates this point well is the diagnostic system developed by the Campbell Soup Company.
Campbell Soup uses large sterilizers or cookers to cook soups and other canned products at eight plants located throughout the country. Some of the larger cookers hold up to 68,000 cans of food for short periods of cooking time. When difficult maintenance problems occur with the cookers, the fault must be found and corrected quickly or the batch of foods being prepared will spoil. Until recently, the company had been depending on a single expert to diagnose and cure the more difficult problems, flying him to the site when necessary. Since this individual was due to retire within a few years, taking his expertise with him, the company decided to develop an expert system to diagnose these difficult problems.
After some months of development with assistance from Texas Instruments, the company developed an expert system, which ran on a PC. The system has about 150 rules in its knowledge base to diagnose the more complex cooker problems. The system has also been used to provide training to new maintenance personnel. Cloning multiple copies for each of the eight locations cost the company only a few pennies per copy. Furthermore, the system cannot retire, and its performance can continue to be improved with the addition of more rules. It has already proven to be a real asset to the company. Similar cases now abound in many diverse organizations.
Representing and Using Domain Knowledge
Expert systems are complex AI programs. However, the most widely used way of representing domain knowledge in expert systems is as a set of production rules, which are often coupled with a frame system that defines the objects that occur in the rules. Let's look at a few additional examples drawn from some other representative expert systems. All the rules we show are English versions of the actual rules that the systems use. Differences among these rules illustrate some of the important differences in the ways that expert systems operate.
R1
R1 (sometimes also called XCON) is a program that configures DEC VAX systems. Its rules look like this:
If: The most current active context is distributing massbus devices, and
There is a single-port disk drive that has not been assigned to a massbus, and
The number of devices that each massbus should support is known, and
There is a massbus that has been assigned at least one disk drive and that should support additional disk drives, and
The type of cable needed to connect the disk drive to the previous device on the massbus is known
then
Assign the disk drive to the massbus.
Notice that R1's rules, unlike MYCIN's, contain no numeric measures of certainty. In the task domain with which R1 deals, it is possible to state exactly the correct thing to be done in each particular set of circumstances (although it may require a relatively complex set of antecedents to do so). One reason for this is that there exists a good deal of human expertise in this area. Another is that since R1 is doing a design task (in contrast to the diagnosis task performed by MYCIN), it is not necessary to consider all possible alternatives; one good one is enough. As a result, probabilistic information is not necessary in R1.
PROSPECTOR is a program that provides advice on mineral exploration. Its rules look like this:
If: Magnetite or pyrite in disseminated or veinlet form is present
then (2, -4) there is favourable mineralization and texture for the propylitic stage.
In PROSPECTOR, each rule contains two confidence estimates. The first indicates the extent to which the presence of the evidence described in the condition part of the rule suggests the validity of the rule's conclusion. In the PROSPECTOR rule shown above, the number 2 indicates that the presence of the evidence is mildly encouraging. The second confidence estimate measures the extent to which the evidence is necessary to the validity of the conclusion, or stated another way, the extent to which the lack of the evidence indicates that the conclusion is not valid. In the example rule shown above, the number -4 indicates that the absence of the evidence is strongly discouraging for the conclusion.
DESIGN ADVISOR is a system that critiques chip designs. Its rules look like:
If The sequential level count of ELEMENT is greater than 2, UNLESS the signal of ELEMENT is resetable
then Critique for poor resetability
DEFEAT Poor resetability of ELEMENT
due to Sequential level count of ELEMENT greater than 2
by ELEMENT is directly resetable
The DESIGN ADVISOR gives advice to a chip designer, who can accept or reject the advice. If the advice is rejected, then the system can exploit a justification-based truth maintenance system to revise its model of the circuit. The first rule shown here says that an element should be criticized for poor resetability if its sequential level count is greater than two, unless its signal is currently believed to be resetable. Resetability is a fairly common condition, so it is mentioned explicitly in this first rule. But there is also a much less common condition, called direct resetability. The DESIGN ADVISOR does not even bother to consider that condition unless it gets in trouble with its advice. At that point, it can exploit the second of the rules shown above. Specifically, if the chip designer rejects a critique about resetability and if that critique was based on a high level count, then the system will attempt to discover (possibly by asking the designer) whether the element is directly resetable. If it is, then the original rule is defeated and the conclusion withdrawn.
Reasoning with the Knowledge
As these example rules have shown, expert systems exploit many of the representation and reasoning mechanisms that we have discussed. Because these programs are usually written primarily as rule-based systems, forward chaining, backward chaining, or some combination of the two is usually used. For example, MYCIN used backward chaining to discover what organisms were present; then it used forward chaining to reason from the organisms to a treatment regime. R1, on the other hand, used forward chaining. As the field of expert systems matures, more systems that exploit other kinds of reasoning mechanisms are being developed. The DESIGN ADVISOR is an example of such a system; in addition to exploiting rules, it makes extensive use of a justification-based truth maintenance system.
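For contrast with backward chaining, forward chaining can be sketched as a loop that keeps firing rules whose conditions are already satisfied until nothing new can be added. The rule format (a list of conditions paired with one conclusion) and the function name are assumptions made for this example; production systems of the kind described here are considerably more elaborate.

```python
def forward_chain(rules, facts):
    """Fire rules repeatedly, starting from the known facts."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conclusion not in known and all(c in known for c in conditions):
                known.add(conclusion)  # the rule fires
                changed = True
    return known


# A small illustrative rule set; each entry is (conditions, conclusion).
rules = [
    (["BADGE", "GUN"], "POLICE"),
    (["SHIELD"], "BADGE"),
    (["RIFLE"], "GUN"),
]
derived = forward_chain(rules, {"SHIELD", "RIFLE"})
print(sorted(derived))
```

Note that the rules are plain data handed to a general interpreter. Nothing in `forward_chain` is specific to the domain, so the same function could be reused with a different rule set.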
Expert System Shells
Initially, each expert system that was built was created from scratch, usually in LISP. But, after several systems had been built this way, it became clear that these systems often had a lot in common. In particular, since the systems were constructed as a set of declarative representations (mostly rules) combined with an interpreter for those representations, it was possible to separate the interpreter from the domain-specific knowledge and thus to create a system that could be used to construct new expert systems by adding new knowledge corresponding to the new problem domain. The resulting interpreters are called shells. One influential example of such a shell is EMYCIN (for Empty MYCIN), which was derived from MYCIN.
There are now several commercially available shells that serve as the basis for many of the expert systems currently being built. These shells provide much greater flexibility in representing knowledge and in reasoning with it than MYCIN did. They typically support rules, frames, truth maintenance systems, and a variety of other reasoning mechanisms.
Early expert system shells provided mechanisms for knowledge representation, reasoning, and explanation. Later, tools for knowledge acquisition were added. Expert system shells needed to do something else as well. They needed to make it easy to integrate expert systems with other kinds of programs. Expert systems cannot operate in a vacuum, any more than their human counterparts can. They need access to corporate databases, and access to them needs to be controlled just as it does for other systems. They are often embedded within larger application programs that use primarily conventional programming techniques. So one of the important features that a shell must provide is an easy-to-use interface between an expert system that is written with the shell and a larger, probably more conventional, programming environment.
Student Activity 8.1
Before reading the next section, answer the following questions.
What problems does an expert system solve?
Discuss the characteristics and applications of Expert Systems.
Distinguish between R1 and MYCIN.
If your answers are correct, then proceed to the next section.
Explanation
In order for an expert system to be an effective tool, people must be able to interact with it easily. To facilitate this interaction, the expert system must have the following two capabilities in addition to the ability to perform its underlying task:
1. Explain its reasoning. In many of the domains in which expert systems operate, people will not accept results unless they have been convinced of the accuracy of the reasoning process that produced those results. This is particularly true, for example, in medicine, where a doctor must accept ultimate responsibility for a diagnosis, even if that diagnosis was arrived at with considerable help from a program. Thus it is important that the reasoning process used in such programs proceed in understandable steps and that enough meta-knowledge (knowledge about the reasoning process) be available so the explanations of those steps can be generated.
2. Acquire new knowledge and modifications of old knowledge. Since expert systems derive their power from the richness of the knowledge bases they exploit, it is extremely important that those knowledge bases be as complete and as accurate as possible. But often there exists no standard codification of that knowledge; rather it exists only inside the heads of human experts. One way to get this knowledge into a program is through interaction with the human expert. Another way is to have the program learn expert behavior from raw data.
TEIRESIAS was the first program to support explanation and knowledge acquisition. TEIRESIAS served as a front-end for the MYCIN expert system. A fragment of a TEIRESIAS-MYCIN conversation with a user (a doctor) is shown in Figure 8.4. The program has asked for a piece of information that it needs in order to continue its reasoning. The doctor wants to know why the program wants the information, and later asks how the program arrived at a conclusion that it claimed it had reached.
Is the patient’s illness with Organism-1 a hospital-acquired infection?
**Why
[i.e., Why is it important to determine whether or not the infection with Organism-1 was acquired while the patient was hospitalized?]
[2.0] This will aid in determining the category of Organism-1.
It has already been established that
[2.1] the gram stain of Organism-1 is gramneg, and
[2.2] the morphology of Organism-1 is rod, and
[2.3] the aerobicity of Organism-1 is facultative
Therefore, if
[2.4] the infection with Organism-1 was not acquired while the patient was hospitalized, then
there is strongly suggestive evidence (.8) that the category of Organism-1 is entero-bacteria.
[Also: there is weakly suggestive evidence (.1) that the identity of Organism-1 is pseudomonas-aeruginosa] [Rule037]
** How [2.3]
[i.e., How was it established that the aerobicity of Organism-1 is facultative?]
The following were used:
[3.1] Rule027 indicated there is strongly suggestive evidence (.8) that the aerobicity of Organism-1 is facultative.
[3.2] Rule033 indicated there is weakly suggestive evidence (.2) that the aerobicity of Organism-1 is not facultative.
Since this gave a cumulative CF of (.6) for facultative, the evidence indicates that the aerobicity of Organism-1 is facultative.
Figure 8.4: A Portion of a Dialogue with TEIRESIAS
An important premise underlying TEIRESIAS's approach to explanation is that the behavior of a program can be explained simply by referring to a trace of the program's execution. There are ways in which this assumption limits the kinds of explanations that can be produced, but it does minimize the overhead involved in generating each explanation. To understand how TEIRESIAS generates explanations of MYCIN's behavior, we need to know how that behavior is structured.
MYCIN attempts to solve its goal of recommending a therapy for a particular patient by first finding the cause of the patient's illness. It uses its production rules to reason backward from goals to clinical observations. To solve the top-level diagnostic goal, it looks for rules whose right sides suggest diseases. It then uses the left sides of those rules (the preconditions) to set up subgoals whose success would enable the rules to be invoked. These subgoals are again matched against rules, and their preconditions are used to set up additional subgoals. Whenever a precondition describes a specific piece of clinical evidence, MYCIN uses that evidence if it already has access to it. Otherwise, it asks the user to provide the information. In order that MYCIN's requests for information will appear coherent to the user, the actual goals that MYCIN sets up are often more general than they need be to satisfy the preconditions of an individual rule. For example, if a precondition specifies that the identity of an organism is X, MYCIN will set up the goal "infer identity." This approach also means that if another rule mentions the organism's identity, no further work will be required, since the identity will be known.
We can now return to the trace of TEIRESIAS-MYCIN's behavior shown in Figure 8.4. The first question that the user asks is a "WHY" question, which is assumed to mean, "Why do you need to know that?" Particularly for clinical tests that are either expensive or dangerous, it is important for the doctor to be convinced that the information is really needed before ordering the test. (Requests for sensitive or confidential information present similar difficulties.) Because MYCIN is reasoning backward, the question can easily be answered by examining the goal tree. Doing so provides two kinds of information:
1. What higher-level question might the system be able to answer if it had the requested piece of information? (In this case, it could help determine the category of ORGANISM-1.)
2. What other information does the system already have that makes it think that the requested piece of knowledge would help? (In this case, facts [2.1] to [2.4].)
When TEIRESIAS provides the answer to the first of these questions, the user may be satisfied or may want to follow the reasoning process back even further. The user can do that by asking additional "WHY" questions.
When TEIRESIAS provides the answer to the second of these questions and tells the user what it already believes, the user may want to know the basis for those beliefs. The user can ask this with a "HOW" question, which TEIRESIAS will interpret as "How did you know that?" This question can also be answered by looking at the goal tree and chaining backward from the stated fact to the evidence that allowed a rule that determined the fact to fire. Thus we see that by reasoning backward from its top-level goal and by keeping track of the entire tree that it traverses in the process, TEIRESIAS-MYCIN can do a fairly good job of justifying its reasoning to a human user.
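The idea of answering WHY and HOW from a recorded goal tree can be sketched as follows. The `GoalNode` class and its fields are hypothetical, chosen only to illustrate the principle; they are not TEIRESIAS's actual representation. The rule names and certainty values are taken from the dialogue in Figure 8.4, with evidence against a conclusion written as a negative number.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class GoalNode:
    goal: str
    rule: Optional[str] = None            # rule whose premise needs this goal
    parent: Optional["GoalNode"] = None   # the goal that rule would establish
    evidence: List[Tuple[str, float]] = field(default_factory=list)


def why(node):
    """WHY: walk up the tree to the goals this question serves."""
    chain = []
    while node.parent is not None:
        chain.append((node.parent.goal, node.rule))
        node = node.parent
    return chain


def how(node):
    """HOW: list the rules, with their CFs, that established this goal."""
    return list(node.evidence)


category = GoalNode("category of Organism-1")
hospital = GoalNode("infection was hospital-acquired",
                    rule="Rule037", parent=category)
aerobicity = GoalNode("aerobicity of Organism-1 is facultative",
                      rule="Rule037", parent=category,
                      evidence=[("Rule027", 0.8), ("Rule033", -0.2)])

print(why(hospital))
print(how(aerobicity))
```

Both answers are read directly off the stored tree, with no further reasoning. In a deeper tree, each additional WHY from the user would correspond to one more step up the chain that `why` returns.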
The production system model is very general, and without some restrictions, it is hard to support all the kinds of explanations that a human might want. If we focus on a particular type of problem solving, we can ask more probing questions. For example, SALT is a knowledge acquisition program used to build expert systems that design artifacts through a propose-and-revise strategy. SALT is capable of answering questions like WHY-NOT ("why didn't you assign value x to this parameter?") and WHAT-IF ("what would happen if you did?"). A human might ask these questions in order to locate incorrect or missing knowledge in the system as a precursor to correcting it. We now turn to ways in which a program such as SALT can support the process of building and refining knowledge.
Student Activity 8.2
Before reading the next section, answer the following questions.
What is the role of Expert System shells?
What are the characteristics of a knowledge acquisition system?
Contrast expert system and neural networks in terms of knowledge representation and knowledge acquisition. Give one domain in which the expert system approach would be more promising and one domain in which the neural network approach is more promising.
If your answers are correct, then proceed to the next section.
Knowledge Acquisition
How are expert systems built? Typically, a knowledge engineer interviews a domain expert to elucidate expert knowledge, which is then translated into rules. After the initial system is built, it must be iteratively refined until it approximates expert-level performance. This process is expensive and time-consuming, so it is worthwhile to look for more automatic ways of constructing expert knowledge bases. While no totally automatic knowledge acquisition systems yet exist, there are many programs that interact with domain experts to extract expert knowledge efficiently. These programs provide support for the following activities:
1. Entering knowledge
2. Maintaining knowledge base consistency
3. Ensuring knowledge base completeness
The most useful knowledge acquisition programs are those that are restricted to a particular problem-solving paradigm, e.g., diagnosis or design. It is important to be able to enumerate the roles that knowledge can play in the problem-solving process. For example, if the paradigm is diagnosis, then the program can structure its knowledge base around symptoms, hypotheses, and causes. It can identify symptoms for which the expert has not yet provided causes. Since one symptom may have multiple causes, the program can ask for knowledge about how to decide when one hypothesis is better than another. If we move to another type of problem solving, say designing artifacts, then these acquisition strategies no longer apply, and we must look for other ways of profitably interacting with an expert. We now examine two knowledge acquisition systems in detail.
MOLE is a knowledge acquisition system for heuristic classification problems, such as diagnosing diseases. In particular, it is used in conjunction with the cover-and-differentiate problem-solving method. An expert system produced by MOLE accepts input data, comes up with a set of candidate explanations or classifications that cover (or explain) the data, then uses differentiating knowledge to determine which one is best. The process is iterative, since explanations must themselves be justified, until ultimate causes are ascertained.
MOLE interacts with a domain expert to produce a knowledge base that a system called MOLE-p (for MOLE-performance) uses to solve problems. The acquisition proceeds through several steps:
1. Initial knowledge base construction. MOLE asks the expert to list common symptoms or complaints that might require diagnosis. For each symptom, MOLE prompts for a list of possible explanations. MOLE then iteratively seeks out higher-level explanations until it comes up with a set of ultimate causes. Whenever an event has multiple explanations, MOLE tries to determine the conditions under which one explanation is correct. The expert provides covering knowledge, that is, the knowledge that a hypothesized event might be the cause of a certain symptom. MOLE then tries to infer anticipatory knowledge, which says that if the hypothesized event does occur, then the symptom will definitely appear. This knowledge allows the system to rule out certain hypotheses on the basis that specific symptoms are absent.
2. Refinement of the knowledge base. MOLE now tries to identify the weaknesses of the knowledge base. One approach is to find holes and prompt the expert to fill them. It is difficult, in general, to know whether a knowledge base is complete, so instead MOLE lets the expert watch MOLE-p solving sample problems. Whenever MOLE-p makes an incorrect diagnosis, the expert adds new knowledge. There are several ways in which MOLE-p can reach the wrong conclusion. It may incorrectly reject a hypothesis because it does not feel that the hypothesis is needed to explain any symptom. It may advance a hypothesis because it is needed to explain some otherwise inexplicable hypothesis. Or it may lack differentiating knowledge for choosing between alternative hypotheses.
For example, suppose we have a patient with symptoms A and B. Further suppose that symptom A could be caused by events X and Y, and that symptom B can be caused by Y and Z. MOLE-p might conclude Y, since it explains both A and B. If the expert indicates that this decision was incorrect, then MOLE will ask what evidence should be used to prefer X and/or Z over Y.
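The covering step in this example can be sketched in a few lines. The table and function names below are hypothetical, and the preference for a single hypothesis that covers every symptom is a simplifying assumption standing in for MOLE-p's differentiating knowledge.

```python
# Covering knowledge from the example: which symptoms each event can cause.
COVERS = {
    "X": {"A"},
    "Y": {"A", "B"},
    "Z": {"B"},
}


def candidates(symptoms):
    """Every hypothesis that explains at least one observed symptom."""
    return {h for h, explained in COVERS.items() if explained & symptoms}


def single_covers(symptoms):
    """Hypotheses that explain all observed symptoms on their own."""
    return sorted(h for h, explained in COVERS.items() if symptoms <= explained)


observed = {"A", "B"}
print(sorted(candidates(observed)))
print(single_covers(observed))
```

All three events are candidates, but only Y covers both symptoms by itself, which is why Y is the natural first conclusion. If the expert rejects it, the system needs extra differentiating knowledge to prefer X and/or Z, which is exactly what MOLE asks for.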
MOLE has been used to build systems that diagnose problems with car engines, problems in steel-rolling mills, and inefficiencies in coal-burning power plants. For MOLE to be applicable, however, it must be possible to pre-enumerate solutions or classifications. It must also be practical to encode the knowledge in terms of covering and differentiating knowledge.
But suppose our task is to design an artifact, for example, an elevator system. It is no longer possible to pre-enumerate all solutions. Instead, we must assign values to a large number of parameters, such as the width of the platform, the type of door, the cable weight, and the cable strength. These parameters must be consistent with each other, and they must result in a design that satisfies external constraints imposed by cost factors, the type of building involved, and expected payloads.
One problem-solving method useful for design tasks is called propose-and-revise. Propose-and-revise systems build up solutions incrementally. First, the system proposes an extension to the current design. Then it checks whether the extension violates any global or local constraints. Constraint violations are then fixed, and the process repeats. It turns out that domain experts are good at listing overall design constraints and at providing local constraints on individual parameters, but not so good at explaining how to arrive at global solutions. The SALT program provides mechanisms for eliciting this knowledge from the expert.
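The propose-and-revise loop described above might be sketched as follows. This is a simplified illustration under stated assumptions, not SALT's implementation: the parameter names, the numeric values, and the single cable constraint are all hypothetical, and the first suggested fix is always the one applied.

```python
def propose_and_revise(proposers, constraints, max_fixes=100):
    """Build a design one parameter at a time, repairing violations as they appear.

    proposers:   list of (parameter_name, propose(design) -> value), in proposal order
    constraints: list of (check(design) -> bool, fix(design) -> None)
    """
    design = {}
    for param, propose in proposers:
        design[param] = propose(design)              # propose an extension
        for _ in range(max_fixes):                   # revise until consistent
            fixes = [fix for check, fix in constraints if not check(design)]
            if not fixes:
                break
            fixes[0](design)                         # apply the first suggested fix
        else:
            raise RuntimeError(f"could not repair the design after adding {param}")
    return design


# Hypothetical elevator fragment: the cable must carry payload plus platform.
def cable_ok(d):
    if "cable_strength_kg" not in d:                 # nothing to check yet
        return True
    return d["cable_strength_kg"] >= d["payload_kg"] + d["platform_kg"]


def upgrade_cable(d):
    d["cable_strength_kg"] += 100                    # step up to the next cable grade


proposers = [
    ("payload_kg", lambda d: 1000),
    ("platform_kg", lambda d: 400),
    ("cable_strength_kg", lambda d: 1200),
]
design = propose_and_revise(proposers, [(cable_ok, upgrade_cable)])
print(design)
```

The initial cable proposal is too weak, so the fix is applied twice before the constraint holds. A real system would also need the control knowledge discussed in the text to choose among competing fixes.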
Like MOLE, SALT builds a dependency network as it converses with the expert. Each node stands for a value of a parameter that must be acquired or generated. There are three kinds of links: contributes-to, constrains, and suggests-revision-of. Associated with the first type of link are procedures that allow SALT to generate a value for one parameter based on the value of another. The second type of link, constrains, rules out certain parameter values. The third link, suggests-revision-of, points to ways in which a constraint violation can be fixed. SALT uses the following heuristics to guide the acquisition process:
1. Every noninput node in the network needs at least one contributes-to link coming into it. If links are missing, the expert is prompted to fill them in.
2. No contributes-to loops are allowed in the network. Without a value for at least one parameter in the loop, it is impossible to compute values for any parameter in that loop. If a loop exists, SALT tries to transform one of the contributes-to links into a constraint link.
3. Constraining links should have suggests-revision-of links associated with them. These include constrains links that are created when dependency loops are broken.
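The three heuristics can be read as consistency checks on the dependency network. The sketch below is an assumption-laden illustration rather than SALT's own code: links are plain `(kind, source, target)` tuples, a constrains link counts as "covered" when its source node also has an outgoing suggests-revision-of link, and the elevator node names are hypothetical.

```python
def check_network(nodes, inputs, links):
    """Return a list of problems the expert would be prompted to resolve."""
    problems = []
    contributes = {}
    for kind, src, dst in links:
        if kind == "contributes-to":
            contributes.setdefault(src, []).append(dst)

    # Heuristic 1: every non-input node needs an incoming contributes-to link.
    fed = {dst for targets in contributes.values() for dst in targets}
    for node in nodes:
        if node not in inputs and node not in fed:
            problems.append(f"no contributes-to link into {node}")

    # Heuristic 2: no contributes-to loops (depth-first search for a back edge).
    state = {}

    def has_loop(node):
        state[node] = "visiting"
        for nxt in contributes.get(node, []):
            if state.get(nxt) == "visiting" or (nxt not in state and has_loop(nxt)):
                return True
        state[node] = "done"
        return False

    if any(has_loop(n) for n in nodes if n not in state):
        problems.append("contributes-to loop found")

    # Heuristic 3: each constrains link should have a suggested revision.
    revisers = {src for kind, src, _ in links if kind == "suggests-revision-of"}
    for kind, src, dst in links:
        if kind == "constrains" and src not in revisers:
            problems.append(f"constraint {src} -> {dst} has no suggests-revision-of link")
    return problems


nodes = ["payload", "platform_width", "cable_weight", "cable_strength"]
links = [
    ("contributes-to", "payload", "cable_strength"),
    ("contributes-to", "cable_strength", "cable_weight"),
    ("contributes-to", "cable_weight", "cable_strength"),   # creates a loop
    ("constrains", "payload", "platform_width"),            # no revision suggested
]
for problem in check_network(nodes, {"payload"}, links):
    print(problem)
```

Each reported problem corresponds to a point at which SALT, as described above, would prompt the expert or try to turn a contributes-to link into a constraint.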
Control knowledge is also important. It is critical that the system propose extensions and revisions that lead toward a design solution. SALT allows the expert to rate revisions in terms of how much trouble they tend to produce.
SALT compiles its dependency network into a set of production rules. As with MOLE, an expert can watch the production system solve problems and can override the system's decisions. At that point, the knowledge base can be changed or the override can be logged for future inspection.
The process of interviewing a human expert to extract expertise presents a number of difficulties, regardless of whether the interview is conducted by a human or by a machine. Experts are surprisingly inarticulate when it comes to how they solve problems. They do not seem to have access to the low-level details of what they do and are especially inadequate suppliers of any type of statistical information. There is, therefore, a great deal of interest in building systems that automatically induce their own rules by looking at sample problems and solutions. With inductive techniques, an expert needs only to provide the conceptual framework for a problem and a set of useful examples.
For example, consider a bank's problem in deciding whether to approve a loan. One approach to automating this task is to interview loan officers in an attempt to extract their domain knowledge. Another approach is to inspect the record of loans the bank has made in the past and then try to automatically generate rules that will maximize the number of good loans and minimize the number of bad ones in the future.
META-DENDRAL was the first program to use learning techniques to construct rules for an expert system automatically. It built rules to be used by DENDRAL, whose job was to determine the structure of complex chemical compounds. META-DENDRAL was able to induce its rules based on a set of mass spectrometry data; it was then able to identify molecular structures with very high accuracy. META-DENDRAL used the version space learning algorithm. Another popular method for automatically constructing expert systems is the induction of decision trees. Decision tree expert systems have been built for assessing consumer credit applications, analyzing hypothyroid conditions, and diagnosing soybean diseases, among many other applications.
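To make decision-tree induction concrete, here is a small ID3-style sketch applied to the loan example. It is an illustration only: the loan records, attribute names, and outcomes are made up, and the splitting criterion (lowest remaining entropy) is one common choice rather than the method of any particular system named above.

```python
import math
from collections import Counter


def entropy(rows, target):
    """Shannon entropy of the target labels in rows."""
    counts = Counter(r[target] for r in rows)
    return -sum(c / len(rows) * math.log2(c / len(rows)) for c in counts.values())


def induce_tree(rows, attributes, target):
    """Return a label, or a nested dict {attribute: {value: subtree}}."""
    labels = Counter(r[target] for r in rows)
    if len(labels) == 1 or not attributes:
        return labels.most_common(1)[0][0]           # pure node or majority label

    def remainder(attr):
        # Weighted entropy left after splitting on attr (lower is better).
        groups = Counter(r[attr] for r in rows)
        return sum(
            n / len(rows) * entropy([r for r in rows if r[attr] == v], target)
            for v, n in groups.items()
        )

    best = min(attributes, key=remainder)
    rest = [a for a in attributes if a != best]
    return {
        best: {
            v: induce_tree([r for r in rows if r[best] == v], rest, target)
            for v in sorted({r[best] for r in rows})
        }
    }


# Hypothetical past loans, invented for illustration.
LOANS = [
    {"income": "high", "history": "good", "outcome": "repaid"},
    {"income": "low", "history": "good", "outcome": "repaid"},
    {"income": "low", "history": "good", "outcome": "repaid"},
    {"income": "high", "history": "bad", "outcome": "repaid"},
    {"income": "low", "history": "bad", "outcome": "defaulted"},
    {"income": "low", "history": "bad", "outcome": "defaulted"},
]
tree = induce_tree(LOANS, ["income", "history"], "outcome")
print(tree)
```

On this toy data the tree splits first on credit history and only consults income for applicants with a bad history. Each root-to-leaf path can be read as a rule, which is what makes induced trees usable as the rule base of an expert system.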
Statistical techniques, such as multivariate analysis, provide an alternative approach to building expert-level systems. Unfortunately, statistical methods do not produce concise rules that humans can understand. It is therefore difficult for systems built this way to explain their decisions.
For highly structured problems that require deep causal chains of reasoning, learning techniques are presently inadequate. There is, however, a great deal of research activity in this area.
Summary
• Expert systems use symbolic representations for knowledge (rules, networks, or frames) and perform their inference through symbolic computations that closely resemble manipulations of natural language. An expert system is usually built with the aid of one or more experts, who must be willing to spend a great deal of effort transferring their expertise to the system.
• Expert systems are complex AI programs. However, the most widely used way of representing domain knowledge in expert systems is as a set of production rules, which are often coupled with a frame system that defines the objects that occur in the rules.
• The most useful knowledge acquisition programs are those that are restricted to a particular problem-solving paradigm, e.g., diagnosis or design.
• Transfer of knowledge takes place gradually through many interactions between the expert and the system. The expert will never get the knowledge right or complete the first time.
• The amount of knowledge that is required depends on the task. It may range from forty rules to thousands.
• The choice of control structure for a particular system depends on specific characteristics of the system.
• It is possible to extract the non-domain-specific parts from existing expert systems and use them as tools for building new systems in new domains.
• MYCIN is an expert system that diagnoses infectious blood diseases and recommends a list of therapies for the patient.
• R1 (sometimes also called XCON) is a program that configures DEC VAX systems.