Sunday, November 11, 2007

Artificial Intelligence: Section A


Artificial Intelligence: An Introduction
Artificial intelligence (AI) is the study of how to make computers do things that, at the moment, people do better. This definition is, of course, somewhat ephemeral because of its reference to the current state of computer science but the fact remains that most attempt to define complex and widely used terms precisely are exercises in futility. To do this, we propose the above by no means is a universally accepted definition. It as well fails to include some areas of potentially very large impact, namely problems that cannot now be solved well by either computers or people. But it provides a good outline of what constitutes artificial intelligence, and it avoids the philosophical issues that dominate attempts to define the meaning of artificial intelligence. Interestingly, though, it suggests a similarity with philosophy at the same time it is avoiding it.
AI has embraced the larger scientific goal of constructing an information-processing theory of intelligence. If such a science of intelligence could be developed, it could guide the design of intelligent machines as well as explicate intelligent behaviour as it occurs in humans and other animals.
Basic elements of AI and AI application areas
AI Techniques
The problems of artificial intelligence appear to have very little in common except that they are hard. But to our relief there are a variety of techniques for solving them. What, then, if anything, can we say about those techniques besides the fact that they manipulate symbols? How could we tell if those techniques might be useful in solving other problems, perhaps ones not traditionally regarded as AI tasks? The rest of this book is an attempt to answer those questions in detail. But before we begin examining closely the individual techniques, it is enlightening to take a broad look at them to see what properties they ought to possess.
Intelligence requires knowledge. To compensate for its one overpowering asset, indispensability, knowledge possesses some less desirable properties, including:
It is voluminous.
It is hard to characterise accurately.
It is constantly changing.
It differs from data by being organized in a way that corresponds to the ways it will be used.
We are forced to conclude that an AI technique is a method that exploits knowledge that should be represented in such a way that:
The knowledge captures generalizations. In other words, it is not necessary to represent separately each individual situation. Instead, situations that share important properties are grouped together. If knowledge does not have this property, inordinate amounts of memory and updating will be required. So we usually call something without this property "data" rather than knowledge.
It can be understood by people who must provide it. Although for many programs, the bulk of the data can be acquired automatically (for example, by taking readings from a variety of instruments), in many AI domains, most of the knowledge a program has must ultimately be provided by people in terms they understand.
It can easily be modified to correct errors and to reflect changes in the world and in our worldview.
It can be used in a great many situations even if it is not totally accurate or complete.
It can be used to help overcome its own sheer bulk by helping to narrow the range of possibilities that must usually be considered.
It is possible to solve AI problems without using AI techniques. And it is possible to apply AI techniques to the solution of non-AI problems. This is likely to be a good thing to do for problems that possess many of the same characteristics as do AI problems. In order to try to characterize AI techniques in as problem-independent a way as possible, let's look at two very different problems and a series of approaches for solving each of them.
Game playing shares the property that people who do it well are considered to be displaying intelligence. Despite this, it appeared initially that computers could perform well at such tasks simply by being fast at exploring a large number of solution paths and selecting the best one. Applied to day-to-day life, this is the basic rule of problem solving: for almost every problem in a particular situation we may have various possible solutions, but to solve the problem correctly we have to choose the right path. We adopt the same strategy in game playing: to be a winner we have to select the right option among the various possible ones. By adopting this approach we can design the best possible AI-based game player, though it may not win every time. We can see this in a real-life example: Deep Blue (an AI-based chess computer) was defeated by Garry Kasparov, but in the rematch Deep Blue was able to defeat the world champion. We can understand this through the following examples:
Tic-Tac-Toe
In this section, we present a series of three programs to play tic-tac-toe. The programs in this series increase in:
Their complexity.
Their use of generalizations.
The clarity of their knowledge.
The extensibility of their approach.
Thus they move toward being representations of what we call AI techniques.
Program 1
Data Structures
Board: A nine-element vector representing the board, where the elements of the vector correspond to the board positions as follows:

1 2 3
4 5 6
7 8 9
An element contains the value 0 if the corresponding square is blank, 1 if it is filled with an X, or 2 if it is filled with an O.
Movetable: A large vector of 19,683 elements, each element of which is a nine-element vector. The contents of this vector are chosen specifically to allow the algorithm to work.
The Algorithm
To make a move, do the following:
View the vector Board as a ternary (base three) number. Convert it to a decimal number.
Use the number computed in step 1 as an index into movetable and access the vector stored there.
The vector selected in step 2 represents the way the board will look after the move that should be made. So set Board equal to that vector.
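The lookup algorithm above can be sketched in a few lines of Python. This is only a sketch: the function names are ours, and the movetable itself, all 19,683 entries, would still have to be filled in by hand exactly as the text warns.

```python
def board_to_index(board):
    # Interpret the nine-element board vector as a base-3 (ternary)
    # number, most significant digit first: 0 = blank, 1 = X, 2 = O.
    index = 0
    for value in board:
        index = index * 3 + value
    return index

def make_move(board, movetable):
    # Step 2 and 3: look up and return the stored successor position.
    return movetable[board_to_index(board)]
```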
This program is very efficient in terms of time. And, in theory, it could play an optimal game of tic-tac-toe. But it has several disadvantages:
It takes a lot of space to store the table that specifies the correct move to make from each board position.
Someone will have to do a lot of work specifying all the entries in the movetable.
It is very unlikely that all the required movetable entries can be determined and entered without any errors.
If we want to extend the game, say to three dimensions, we would have to start from scratch, and in fact this technique would no longer work at all, since 3^27 board positions would have to be stored, thus overwhelming present computer memories.
The technique embodied in this program does not appear to meet any of our requirements for a good AI technique. Let’s see if we can do better.
Program 2
Data Structures
Board: A nine-element vector representing the board, as described for Program 1. But instead of using the number 0, 1, or 2 in each element, we store 2 (indicating blank), 3 (indicating X), or 5 (indicating O).
Turn: An integer indicating which move of the game is about to be played; 1 indicates the first move, 9 the last.

The Algorithm
The main algorithm uses three subprocedures:
Make2: Returns 5 if the center square of the board is blank, that is, if Board[5] = 2. Otherwise, this function returns any blank noncorner square (2, 4, 6, or 8).
Posswin(p): Returns 0 if player p cannot win on his next move; otherwise, it returns the number of the square that constitutes a winning move. This function will enable the program both to win and to block the opponent's win. Posswin operates by checking, one at a time, each of the rows, columns, and diagonals. Because of the way values are numbered, it can test an entire row (column or diagonal) to see if it is a possible win by multiplying the values of its squares together. If the product is 18 (3 x 3 x 2), then X can win. If the product is 50 (5 x 5 x 2), then O can win. If we find a winning row, we determine which element is blank, and return the number of that square.
Go(n): Makes a move in square n. This procedure sets Board[n] to 3 if Turn is odd, or 5 if Turn is even. It also increments Turn by one.
The algorithm has a built-in strategy for each move it may have to make. It makes the odd-numbered moves if it is playing X, the even-numbered moves if it is playing O. The strategy for each turn is as follows:
Turn=1 Go(1) (upper left corner).
Turn=2 If Board[5] is blank, Go(5), else Go(1).
Turn=3 If Board[9] is blank, Go(9), else Go(3).
Turn=4 If Posswin(X) is not 0, then Go(Posswin(X)) [i.e. block opponent's win], else Go(Make2).
Turn=5 If Posswin(X) is not 0 then Go(Posswin(X)) [i.e., win] else if Posswin(O) is not 0, then Go(Posswin(O)) [i.e., block win], else if Board[7] is blank, then Go(7), else Go(3). [Here the program is trying to make a fork.]
Turn=6 If Posswin(O) is not 0 then Go(Posswin(O)), else if Posswin(X) is not 0, then Go(Posswin(X)), else Go(Make2).
Turn=7 If Posswin(X) is not 0 then Go(Posswin(X)), else if Posswin(O) is not 0, then Go(Posswin(O)), else go anywhere that is blank.
Turn=8 If Posswin(O) is not 0 then Go(Posswin(O)), else if Posswin(X) is not 0, then Go(Posswin(X)), else go anywhere that is blank.
Turn=9 Same as Turn=7.
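The three subprocedures above can be sketched in Python. The conventions here are our own: the board is a ten-element list whose index 0 is unused, so squares are numbered 1 to 9, with 2 = blank, 3 = X, 5 = O exactly as in the text.

```python
# All eight winning lines, by square number.
LINES = [(1, 2, 3), (4, 5, 6), (7, 8, 9),   # rows
         (1, 4, 7), (2, 5, 8), (3, 6, 9),   # columns
         (1, 5, 9), (3, 5, 7)]              # diagonals

def make2(board):
    # Prefer the centre; otherwise return any blank noncorner square.
    if board[5] == 2:
        return 5
    for n in (2, 4, 6, 8):
        if board[n] == 2:
            return n

def posswin(board, p):
    # p is 3 for X, 5 for O.  A line whose product is p * p * 2
    # contains two of p's pieces and one blank, so p can win there.
    for a, b, c in LINES:
        if board[a] * board[b] * board[c] == p * p * 2:
            for n in (a, b, c):
                if board[n] == 2:
                    return n
    return 0

def go(board, n, turn):
    # Place X on odd turns, O on even turns; return the next turn.
    board[n] = 3 if turn % 2 == 1 else 5
    return turn + 1
```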
This program is not quite as efficient in terms of time as the first one since it has to check several conditions before making each move. But it is a lot more efficient in terms of space. It is also a lot easier to understand the program's strategy or to change the strategy if desired. But the total strategy has still been figured out in advance by the programmer. Any bugs in the programmer's tic-tac-toe playing skill will show up in the program's play. And we still cannot generalize any of the program's knowledge to a different domain, such as three-dimensional tic-tac-toe.
Program 3
This program is identical to Program 2 except for one change in the representation of the board. We again represent the board as a nine-element vector, but this time we assign board positions to vector elements as follows:

8 3 4
1 5 9
6 7 2
Notice that this numbering of the board produces a magic square: all the rows, columns, and diagonals sum to 15. This means that we can simplify the process of checking for a possible win. In addition to marking the board as moves are made, we keep a list, for each player, of the squares in which he or she has played. To check for a possible win for one player, we consider each pair of squares owned by that player and compute the difference between 15 and the sum of the two squares. If this difference is not positive or if it is greater than 9, then the original two squares were not collinear and so can be ignored. Otherwise, if the square representing the difference is blank, a move there will produce a win. Since no player can have more than four squares at a time, there will be many fewer squares examined using this scheme than there were using the more straightforward approach of Program 2. This shows how the choice of representation can have a major impact on the efficiency of a problem-solving program.
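The magic-square win check can be sketched as follows. The helper name `winning_square` and the set-based bookkeeping are ours; the key fact it relies on is that in the 3x3 magic square, any three distinct numbers from 1 to 9 that sum to 15 form a line, and conversely.

```python
from itertools import combinations

# Magic-square numbering of the board: every row, column, and
# diagonal sums to 15.
MAGIC = (8, 3, 4,
         1, 5, 9,
         6, 7, 2)

def winning_square(owned, blanks):
    # owned: magic numbers of the squares a player holds (at most 4);
    # blanks: magic numbers of the empty squares.
    # For each pair the player owns, the third square of their line
    # is 15 minus their sum; if that square exists and is blank,
    # playing there wins.
    for a, b in combinations(owned, 2):
        need = 15 - (a + b)
        if 1 <= need <= 9 and need not in (a, b) and need in blanks:
            return need
    return 0
```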
This comparison raises an interesting question about the relationship between the way people solve problems and the way computers do. Why do people find the row-scan approach easier while the number-counting approach is more efficient for a computer? We do not know enough about how people work to answer that question completely. One part of the answer is that people are parallel processors and can look at several parts of the board at once, whereas the conventional computer must look at the squares one at a time. Sometimes an investigation of how people solve problems sheds great light on how computers should do so. At other times, the differences in the hardware of the two seem so great that different strategies seem best. As we learn more about problem solving both by people and by machines, we may know better whether the same representations and algorithms are best for both people and machines.
Program 4
Data Structures
BoardPosition: A structure containing a nine-element vector representing the board, a list of board positions that could result from the next move, and a number representing an estimate of how likely the board position is to lead to an ultimate win for the player to move.
The Algorithm
To decide on the next move, look ahead at the board positions that result from each possible move. Decide which position is best (as described below), make the move that leads to that position, and assign the rating of that best move to the current position.
To decide which of a set of board positions is best, do the following for each of them:
See if it is a win. If so, call it the best by giving it the highest possible rating.
Otherwise, consider all the moves the opponent could make next. See which of them is worst for us (by recursively calling this procedure). Assume the opponent will make that move. Whatever rating that move has, assign it to the node we are considering.
The best node is then the one with the highest rating.
This algorithm will look ahead at various sequences of moves in order to find a sequence that leads to a win. It attempts to maximize the likelihood of winning, while assuming that the opponent will try to minimize that likelihood.
This program will require much more time than either of the others since it must search a tree representing all possible move sequences before making each move. But it is superior to the other programs in one very big way: It could be extended to handle games more complicated than tic-tac-toe, for which the exhaustive enumeration approach of the other programs would completely fall apart. It can also be augmented by a variety of specific kinds of knowledge about games and how to play them. For example, instead of considering all possible next moves, it might consider only a subset of them that are determined, by some simple algorithm, to be reasonable. And, instead of following each series of moves until one player wins, it could search for a limited time and evaluate the merit of each resulting board position using some static function.
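The look-ahead procedure above is essentially the minimax idea. A minimal sketch over an explicit game tree follows; everything here is illustrative, with the tic-tac-toe specifics omitted: a node is either a numeric score for a finished game (from "our" point of view) or a list of successor nodes.

```python
def rate(node, our_move):
    # A leaf is a numeric score; an internal node is a list of
    # successors.
    if isinstance(node, (int, float)):
        return node
    scores = [rate(child, not our_move) for child in node]
    # On our move we pick the best successor; on the opponent's move
    # we assume the worst (for us) will be chosen.
    return max(scores) if our_move else min(scores)

def best_move(children):
    # Rate each successor position (opponent moves next in each) and
    # return the index of the highest-rated one.
    ratings = [rate(child, False) for child in children]
    return ratings.index(max(ratings))
```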
Program 4 is an example of the use of an AI technique. For very small problems, it is less efficient than a variety of more direct methods. However, it can be used in situations where those methods would fail.

Before reading the next section, answer the following questions.
1. What is an AI technique?
2. Give an appropriate algorithm for solving the tic-tac-toe problem.
3. Explain the spectrum from static to AI-based techniques for a problem other than the two discussed in this unit. Think of your own problem or use one of the following:
Translate an English sentence into Hindi.
Teach a child to subtract integers.
If your answers are correct, then proceed to the next section.
Theorem Proving
Theorem proving has the property that people who do it well are considered to be displaying intelligence. The Logic Theorist was an early attempt to prove mathematical theorems; it was able to prove several theorems from Russell and Whitehead's Principia Mathematica. Gelernter's theorem prover explored another area of mathematics: geometry. There are three types of problems in AI: ignorable problems, in which solution steps can be ignored; recoverable problems, in which solution steps can be undone; and irrecoverable problems, in which solution steps cannot be undone. Theorem proving falls into the first category, i.e., it is ignorable. Suppose, while trying to prove a theorem, we proceed by first proving a lemma that we think will be useful. Eventually we realize that the lemma is no help at all. In this case we can simply ignore that lemma and continue with the proof.

Natural Language Processing
Perception and communication are essential components of intelligent behaviour. They provide the ability to effectively interact with our environment. Humans perceive and communicate through their five basic senses of sight, hearing, touch, smell, and taste, and their ability to generate meaningful utterances. Two of the senses, sight and hearing are especially complex and require conscious inferencing. Developing programs that understand natural language and that comprehend visual scenes are two of the most difficult tasks facing AI researchers.
Developing programs that understand a natural language is a difficult problem. Natural languages are large. They contain an infinity of different sentences. No matter how many sentences a person has heard or seen, new ones can always be produced. Also, there is much ambiguity in a natural language. Many words have several meanings, such as can, bear, fly, and orange, and sentences can have different meanings in different contexts. This makes the creation of programs that "understand" a natural language one of the most challenging tasks in AI. It requires that a program transform sentences occurring as part of a dialog into data structures which convey the intended meaning of the sentences to a reasoning program. In general, this means that the reasoning program must know a lot about the structure of the language, the possible semantics, the beliefs and goals of the user, and a great deal of general world knowledge.
Developing programs to understand natural language is important in AI because a natural form of communication with systems is essential for user acceptance. Furthermore, one of the most critical tests for intelligent behaviour is the ability to communicate effectively. AI programs must be able to communicate with their human counterparts in a natural way, and natural language is one of the most important mediums for that purpose.
Before proceeding further, a definition of understanding as used here should be given. We say a program understands a natural language if it behaves by taking a (predictably) correct or acceptable action in response to the input. For example, we say a child demonstrates understanding if it responds with the correct answer to a question. The action taken need not be an external response. It may simply be the creation of some internal data structures, as would occur in learning some new facts. But in any case, the structures created should be meaningful and correctly interact with the world model representation held by the program. In this chapter we explore many of the important issues related to natural language understanding and language generation.
Vision Processing
Accurate machine vision opens up a new realm of computer application. These applications include mobile robot navigation, complex manufacturing tasks, analysis of satellite images, and medical image processing. In this section, we investigate how we can transform raw camera images into useful information about the world.
A video camera provides a computer with an image represented as a two-dimensional grid of intensity levels. Each grid element, or pixel, may store a single bit of information (that is, black/white) or many bits (perhaps a real-valued intensity measure and colour information). A visual image is composed of thousands of pixels. What kinds of things might we want to do with such an image? Here are four operations, in order of increasing complexity:
1. Signal Processing: Enhancing the image, either for human consumption or as input to another program.
2. Measurement Analysis: For images containing a single object, determining the two-dimensional extent of the object depicted.
3. Pattern Recognition: For single-object images, classifying the object into a category drawn from a finite set of possibilities.
4. Image Understanding: For images containing many objects, locating the objects in the image, classifying them, and building a three-dimensional model of the scene.
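The first two operations can be illustrated in a few lines of Python. The function names and the list-of-lists image representation are our own assumptions, chosen only to make the operations concrete.

```python
def threshold(image, cutoff):
    # Signal processing at its simplest: map a grid of intensity
    # levels to a binary black/white (0/1) image.
    return [[1 if pixel >= cutoff else 0 for pixel in row]
            for row in image]

def extent(binary_image):
    # Measurement analysis: the two-dimensional extent (bounding box)
    # of a single object's "on" pixels, as
    # (min_row, min_col, max_row, max_col).
    coords = [(r, c) for r, row in enumerate(binary_image)
                     for c, v in enumerate(row) if v]
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols))
```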
Image understanding is the most difficult visual task, and it has been the subject of the most study in AI. While some aspects of image understanding reduce to measurement analysis and pattern recognition, the entire problem remains unsolved, because of difficulties that include the following:
- An image is two-dimensional, while the world is three-dimensional. Some information is necessarily lost when an image is created.

Figure 1.2: An Ambiguous Image
- One image may contain several objects, and some objects may partially occlude others.
- The value of a single pixel is affected by many different phenomena, including the colour of the object, the source of the light, the angle and distance of the camera, the pollution in the air, etc. It is hard to disentangle these effects.
As a result, 2-D images are highly ambiguous. Given a single image, we could construct any number of 3-D worlds that would give rise to the image. For example, consider the ambiguous image of Figure 1.2. It is impossible to decide what 3-D solid it portrays. In order to determine the most likely interpretation of a scene, we have to apply several types of knowledge.
For example, we may invoke knowledge about low-level image features, such as shadows and textures; Figure 1.3 shows how such knowledge can help to disambiguate the image. Having multiple images of the same object can also be useful for recovering 3-D structure. The use of two or more cameras to acquire multiple simultaneous views of an object is called stereo vision. Moving objects (or moving cameras) also supply multiple views; of course, we must also possess knowledge about how motion affects the images that get produced. Still more information can be gathered with a laser rangefinder, a device that returns an array of distance measures, much like sonar does. While rangefinders are still somewhat expensive, integration of visual and range data will soon become commonplace. Integrating different sense modalities is called sensor fusion. Other image factors we might want to consider include shading, colour, and reflectance.
High-level knowledge is also important for interpreting visual data. For example, consider the ambiguous object at the center of Figure 1.4(a). While no low-level image features can tell us what the object is, the object's surroundings provide us with top-down expectations. Expectations are critical for interpreting visual scenes: the preferred interpretations of egg, bacon, and plate reinforce each other mutually, providing the necessary expectations (Figure 1.4).

Figure 1.3: Using Low-Level Knowledge to Interpret an Image

Figure 1.4: Using High-Level Knowledge to Interpret an Image
Speech Processing
Natural language understanding systems usually accept typed input, but for a number of applications this is not acceptable. Spoken language is a more natural form of communication in many human-computer interfaces. Speech recognition systems have been available for some time, but their limitations have prevented widespread use. Below are five major design issues in speech systems. These issues also provide dimensions along which systems can be compared with one another.
- Speaker Dependence versus Speaker Independence: A speaker-independent system can listen to any speaker and translate the sounds into written text. Speaker independence is hard to achieve because of the wide variations in pitch and accent. It is easier to build a speaker-dependent system, which can be trained on the voice patterns of a single speaker. The system will only work for that one speaker. It can be retrained on another voice, but then it will no longer work for the original speaker.
- Continuous versus Isolated-Word Speech: Interpreting isolated-word speech, in which the speaker pauses between each word, is easier than interpreting continuous speech. This is because boundary effects cause words to be pronounced differently in different contexts. For example, the spoken phrase "could you" contains a j sound, and despite the fact that it contains two words, there is no empty space between them in the speech wave. The ability to recognize continuous speech is very important, however, since humans have difficulty speaking in isolated words.
- Real Time versus Offline Processing: Highly interactive applications require that a sentence be translated into text as it is being spoken, while in other situations, it is permissible to spend minutes in computation. Real-time speeds are hard to achieve, especially when higher-level knowledge is involved.
- Large versus Small Vocabulary: Recognizing utterances that are confined to small vocabularies (e.g., 20 words) is easier than working with large vocabularies (e.g., 20,000 words). A small vocabulary helps to limit the number of word candidates for a given speech segment.
- Broad versus Narrow Grammar: An example of a narrow grammar is the one for phone numbers: S → XXX-XXXX, where X is any digit between zero and nine.
Still, no speech system is 100 per cent accurate. There has recently been renewed interest in integrating speech recognition and natural language processing in order to overcome the final hurdle. For example, ATNs and unification-based grammars can be used to constrain the hypotheses made by a speech system. Thus far, integration has proved difficult, because natural language grammars do not offer much in the way of constraints.
In the speech recognition literature, there is a quantitative measure of grammar, called perplexity. Perplexity measures the number of words that can legally appear next in the input (on average). The telephone number recognition task has a perplexity of 10, because at any decision point, there are ten alternatives. On a sample 1000-word English task, a word-pair grammar may reduce the perplexity from 1000 down to 60. A bigram grammar may reduce it further, perhaps to 20 (Lee and Hon, 1988).
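Perplexity is commonly computed as a geometric mean of the number of choices at each decision point; a minimal sketch (the function name and the list-of-branching-factors input are our own simplification):

```python
import math

def perplexity(choices):
    # Geometric mean of the number of words that can legally appear
    # at each decision point in the input.
    logs = [math.log(c) for c in choices]
    return math.exp(sum(logs) / len(logs))
```

With ten alternatives at every position, as in the telephone-number task, this yields a perplexity of 10.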
While natural language grammars accurately predict word categories (such as noun and verb), they say nothing about which words within a category are likely to show up in the input. For example, given the word "the," a grammar might hypothesize that the next word is either an adjective or a noun. But this knowledge does us little good when there are thousands of possible adjectives and nouns to choose from. Thus, it is natural to turn to statistical, or collocational, facts about language. For example, if the word "doctor" is recognized, then one might expect to hear the word "nurse" later in the input, but not "horse." Collocational data, unlike more complex syntactic and semantic structures, can be extracted automatically from large on-line bodies of text. Ultimately, we want to substitute semantic and discourse information for statistical data. If we know the conversation is about doctors, and if we know that doctors and nurses typically work together, then we should be able to generate the proper expectations. Such a strategy will require large knowledge bases and a deeper understanding of semantics and discourse.
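Extracting collocational (bigram) data from raw text really is mechanical; a minimal sketch, with the function name our own:

```python
from collections import Counter

def bigram_counts(text):
    # Count how often each adjacent word pair occurs -- the simplest
    # kind of collocational data, extracted automatically from text.
    words = text.lower().split()
    return Counter(zip(words, words[1:]))
```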
Robotics
Robots have found numerous applications in industrial settings. Robot manipulators are able to perform simple repetitive tasks, such as bolting and fitting automobile parts, but these robots are highly task-specific. It is a long-standing goal in robotics to build robots that can be programmed to carry out a wide variety of tasks.
A manipulator is composed of a series of links and joints, usually terminating in an end-effector, which can take the form of a two-pronged gripper, a humanlike hand, or any of a variety of tools. One general manipulation problem is called pick-and-place, in which a robot must grasp an object and move it to a specific location. For example, consider Figure 1.5, where the goal is to place a peg in a hole.

Figure 1.5: A Pick-and-Place Task
There are two main subtasks here. The first is to design a robot motion that ends with the object stably grasped between the two fingers of the robot. Clearly some form of path planning, as discussed above, can be used to move the arm toward the object, but we need to modify the technique when it comes to the fine motion involved in the grasp itself. Here, uncertainty is a critical problem. A robot can never be sure of the precise location of the peg or the arm. Therefore, it would be a mistake to plan a grasp motion in which the gripper is spread only wide enough to permit the peg to pass, as in Figure 1.6(a). A better strategy is to open the gripper wide, then close gradually as the gripper gets near the peg, as in Figure 1.6(b). That way, if the peg turns out to be located some small distance away from where we thought it was, the grasp will still succeed. Although this strategy depends less on precise vision, it requires some tactile sensitivity in order to terminate the grasp. Unless we take special care in designing grasping motions, uncertainty can lead to disasters. For example, should the left side of the gripper touch the peg one second before the right side does, the peg may fall, thus foiling the grasp. Brost (1988) and Mason et al. (1988) give robust algorithms for grasping a wide variety of objects.
After the peg is stably grasped, the robot must place it in the hole. This subtask resembles the path-planning problem, although it is complicated by the fact that moving the peg through 3-D space requires careful orchestration of the arm's joints. Also, we must seriously consider the problems introduced by uncertainty: failure will result from even a slight positioning error, because the peg will jam flatly against the outer surface. To avoid this, we slide the peg along the surface, applying downward pressure so that the peg enters the hole at an angle. After this happens, we straighten the peg gradually and push it down into the hole.
This type of motion, which reacts to forces generated by the world, is called compliant motion. Compliant motion is very robust in the face of uncertainty. Humans employ compliant motion in a wide variety of activities, such as writing on chalkboards.

Figure 1.6: Naïve and Clever Strategies for Grasping
So given a pick-and-place problem, how can we automatically generate a sequence of compliant motions? One approach (Lozano-Perez et al., 1984) is to use the familiar problem-solving process of backward chaining. Our initial and goal states for the peg-in-hole problem are represented as points in configuration space. First, we compute the set of points in configuration space from which we are guaranteed to reach the goal state in a single compliant motion, assuming a certain degree of uncertainty in initial position and direction of movement and certain facts about relative friction; this set is called the pre-image of the goal. Now we use backward chaining to design a set of motions that is guaranteed to get us from the initial state to some point in the goal state's pre-image. Recursively applying this procedure will eventually yield a set of motions that, while individually uncertain, combine to form a guaranteed plan.

Chapter 6
Knowledge-Based Agent
Agents that know about their world and reason about possible courses of action.
May be able to accept new tasks in the form of explicitly described goals. Can learn new knowledge about environment. Needs to know:
current state of world
how to infer unseen properties from percepts
how the world evolves over time
what it wants to achieve
what its actions do in various circumstances
Design of Knowledge-Based Agent
central component is knowledge base (KB)
set of facts about the world
each fact is called a sentence
expressed in knowledge representation language
TELL adds fact
ASK queries
inference mechanism makes use of knowledge
may start with initial background knowledge
agent takes input (TELLs the KB the percept) and returns an action (obtained by ASKing the KB)
3 levels of description of KB agent:
knowledge/epistemological level - says what the agent knows
logical level - knowledge encoded into sentences. Describe agent based on what logical sentences it knows.
implementation level - runs on agent architecture; physical representations of the sentences (e.g., as string, as alist, etc.)
Declarative approach: build knowledge from sentences. Not procedural. May also include a learning approach, for acquiring new sentences. (easier than learning new procedural knowledge?)
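The TELL/ASK interface described above can be sketched in a few lines. This is only an illustrative skeleton, not a standard API: the class name KnowledgeBase, the rule format, and the membership-test "inference" are all simplifying assumptions standing in for a real entailment procedure.

```python
class KnowledgeBase:
    def __init__(self, background=None):
        # may start with initial background knowledge
        self.sentences = set(background or [])

    def tell(self, sentence):
        # TELL: add a fact (sentence) to the KB
        self.sentences.add(sentence)

    def ask(self, query):
        # ASK: query the KB; membership stands in for real inference
        return query in self.sentences


def agent_step(kb, percept, rules):
    """One cycle: TELL the KB the percept, then ASK what to do."""
    kb.tell(percept)
    for condition, action in rules:
        if kb.ask(condition):
            return action
    return "no-op"


kb = KnowledgeBase(background=["in-cave"])
action = agent_step(kb, "wall-ahead", [("wall-ahead", "turn")])  # -> "turn"
```

In a real agent the ask step would run an inference procedure over the sentences rather than a simple lookup, which is what the rest of the chapter develops.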
Wumpus World example (pp. 153-157) is interesting
Representation, Reasoning and Logic
Goal of knowledge representation: express knowledge in computer-tractable form so it can be used to help agent perform well. Two aspects:
syntax = possible configurations that can constitute sentences
semantics = world facts to which sentences refer
If syntax and semantics are defined precisely, call the language a logic.
Fact is part of world, representation is inside computer.
Proper reasoning ensures that one representation entails another only when the corresponding fact follows from the corresponding facts in the world (i.e., correspondence between logical entailment among representations and what actually follows in the real world).
Goal of KB is to generate new sentences that are necessarily true, given that the old sentences are true. This relation is called entailment. Two ways for inference procedure to operate:
Given KB, it can generate new sentences that are entailed by KB
Given KB and sentence, can report whether or not sentence is entailed by KB
Inference procedure that generates only entailed sentences is called sound or truth preserving.
In the inference procedure:
the record of operation of a sound inference procedure is called a proof
an inference procedure is complete if it can find a proof for any sentence that is entailed. For many KBs the set of consequences is infinite, so completeness is an issue.
proof theory - specifies the reasoning steps that are sound. Inference steps must respect the semantics of sentences.
Programming language
good for algorithms and concrete data structures
not good when we don't have complete information, only possibilities
Natural language
goal is communication rather than representation
meaning depends on sentence and context
often no direct representation
Goals for representation
expressive and concise
unambiguous and independent of context
effective inference procedure to make new inferences
First-order logic forms the basis of most representation schemes in AI. The specifics of logical notation are not important; the main point is how a precise formal language represents knowledge and how mechanical procedures can operate on that knowledge to perform reasoning.
writer of sentence must provide interpretation. Sentence does not mean anything by itself (but meaning may have been fixed long ago).
Languages are compositional: the meaning of a sentence is a function of the meaning of its parts.
sentence can be true or false. Depends on interpretation of sentence and state of world
logical inference/deduction
valid sentence/tautology (necessarily true) iff true under all possible interpretations in all possible worlds, e.g., A v ~A
satisfiable sentence - iff some interpretation in some world for which it is true (doesn't have to be true right now)
unsatisfiable/contradiction - not satisfiable. e.g., self-contradictory A ^ ~A
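These three categories can be checked mechanically by enumerating all interpretations, exactly as a truth table does. A minimal sketch, assuming sentences are given as Python functions from an interpretation (a dict of truth values) to a truth value:

```python
from itertools import product

def classify(sentence, symbols):
    """Brute-force truth-table check: 'valid' if true in every row,
    'unsatisfiable' if true in none, otherwise 'satisfiable'."""
    results = [sentence(dict(zip(symbols, row)))
               for row in product([True, False], repeat=len(symbols))]
    if all(results):
        return "valid"
    if not any(results):
        return "unsatisfiable"
    return "satisfiable"

# A v ~A is a tautology; A ^ ~A is a contradiction
classify(lambda m: m["A"] or not m["A"], ["A"])    # 'valid'
classify(lambda m: m["A"] and not m["A"], ["A"])   # 'unsatisfiable'
classify(lambda m: m["A"] and m["B"], ["A", "B"])  # 'satisfiable'
```

Note the exponential cost: n symbols give 2^n rows, which is exactly why the NP-completeness result mentioned later matters.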
Computer inference
doesn't know interpretation of sentences
doesn't know about world except via sentences
can only do symbolic manipulation of sentence representations
we provide interpretation of conclusions (valid sentences reached via inference process)
can handle very complex sentences
formal system for describing state of affairs, including syntax and semantics
proof theory or set of rules for deducing entailments
Propositional logic
symbols represent whole propositions (facts)
use Boolean connectives to generate more complex sentences
little commitment about how things are represented
not very useful
First-order logic
contains objects and predicates on objects (i.e., properties of objects or relationships between them)
includes connectives and quantifiers
Can capture much of what we know about the world
Ontology (what exists in reality): propositional logic commits only to facts that are True or False; first-order logic also commits to objects, with properties of objects and relationships between them.
Temporal logic - set of time points, can reason about time.
Epistemology - states of knowledge. What can we know? For these logics, have three states of belief: believe true, believe false, or be unable to conclude. Some systems will have degrees of belief or assign a degree of truth (fuzzy).
Propositional Logic
Symbols include the logical constants True and False, propositional symbols such as P, logical connectives, and parentheses.
A sentence consists of:
logical constants True and False
propositional symbol such as P or Q
symbols inside parentheses are a sentence (P v Q)
^ and/conjunction
v or/disjunction
=>implies/implication. (P ^ Q) => R. premise or antecedent is (P ^ Q). conclusion or consequent is R. Also known as rules or if-then statements.
<=> equivalence/biconditional
~ not/negation. Operates on single sentence.
atomic sentence = single symbol, e.g., P
complex sentence includes connectives and parentheses
literal is atomic sentence or negated atomic sentence
grammar is ambiguous P v Q ^ R is either (PvQ)^R or P v (Q^R)
need precedence (highest to lowest): ~ ^ v => <=>
propositional symbol can mean whatever you want. Can be satisfiable but not valid.
true: way the world is. False: way the world is not.
complex sentence has meaning derived from parts. Standard truth tables.
implication doesn't follow intuition about English: a false antecedent always makes the implication true (it makes no claim when the antecedent is false), and no causal relation between antecedent and consequent is required
Validity and Inference
Truth tables used to test for valid sentences
For P => C build truth table, if valid (true in every row), then conclude C
Any world in which a sentence is true under a particular interpretation is called a model of that sentence under that interpretation
May be many models for given sentence
More claims -> fewer models
entailment: sentence s is entailed by KB if models of KB are all models of s
P ^ Q is intersection of models of P and models of Q
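Entailment can be checked directly from this definition by enumerating every interpretation and verifying that each model of the KB is also a model of s. The function-based encoding of sentences is an illustrative assumption of this sketch:

```python
from itertools import product

def entails(kb, s, symbols):
    """KB |= s iff every model of KB is also a model of s.
    kb and s map a model (dict symbol -> bool) to a truth value."""
    for row in product([True, False], repeat=len(symbols)):
        model = dict(zip(symbols, row))
        if kb(model) and not s(model):
            return False  # found a model of KB that is not a model of s
    return True

# models of P ^ Q are the intersection of the models of P and of Q,
# so P ^ Q entails P:
entails(lambda m: m["P"] and m["Q"], lambda m: m["P"], ["P", "Q"])  # True
```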
Rules of inference for propositional logic
certain recurring patterns of inference can be shown to be sound once and for all.
Can then use those patterns, in the form of inference rules, without having to build truth table
a |- b says b is derived from a by inference rule
7 common inference rules (Figure 6.13)
modus ponens: a=>b, a conclude b
and-elimination: a^b^c can infer any of the conjuncts
and-introduction: a,b,c as separate true statements infers a^b^c
or-introduction: a can infer avbvc etc.
double-negation elimination: ~~a implies a
unit resolution: from a v b and ~b, infer a
resolution: from a v b and ~b v c, infer a v c
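The resolution rule is easy to state over clauses represented as sets of literals. A small sketch (the "~" string convention for negation is an assumption of this example):

```python
def resolve(c1, c2):
    """Apply the resolution rule to two clauses (sets of literal
    strings, where '~X' negates X). Returns the list of resolvents."""
    def negate(lit):
        return lit[1:] if lit.startswith("~") else "~" + lit

    resolvents = []
    for lit in c1:
        if negate(lit) in c2:
            # drop the complementary pair, union the rest
            resolvents.append((c1 - {lit}) | (c2 - {negate(lit)}))
    return resolvents

# a v b and ~b v c resolve to a v c
resolve({"a", "b"}, {"~b", "c"})  # [{'a', 'c'}]
# unit resolution is the special case where one clause is a literal
resolve({"a", "b"}, {"~b"})       # [{'a'}]
```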
Checking a set of sentences for satisfiability is NP-complete
monotonic logic - if add new sentences to KB, all sentences entailed by original KB are still entailed. Propositional and first-order logic are monotonic.
if not monotonic, couldn't have any local rules, because rest of KB might affect soundness of inference. Probability theory is not monotonic.
There is a useful class of sentences for which a polynomial-time inference procedure exists.
Horn sentence/Horn clause: P1 ^ P2 ^ ... Pn => Q
Ps and Q are non-negated atoms
Special case 1: True=> Q is equivalent to atomic sentence Q
Special case 2: If Q is False, get ~P1 v ~P2 v … v ~Pn
If can write KB as collection of Horn sentences, simple inference procedure: apply Modus Ponens wherever possible until no new inferences remain to be made.
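That simple procedure is forward chaining: apply Modus Ponens to a fixed point. A sketch over Horn rules encoded as (premises, conclusion) pairs, with illustrative symbol names:

```python
def forward_chain(rules, facts):
    """Apply Modus Ponens to Horn rules until no new inferences
    remain; each rule is (list_of_premises, conclusion)."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in facts and all(p in facts for p in premises):
                facts.add(conclusion)  # Modus Ponens fires
                changed = True
    return facts

# P ^ Q => R and R => S, starting from facts P and Q
forward_chain([(["P", "Q"], "R"), (["R"], "S")], ["P", "Q"])
# {'P', 'Q', 'R', 'S'}
```

Each rule can fire at most once per derived fact, which is why this runs in polynomial time where general satisfiability checking is NP-complete.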
How to represent problems in propositional logic
Read Wumpus World example
Limitations of propositional logic
too weak. 64 rules just for "don't go forward if wumpus"
doesn't handle change (e.g., agent moving around)
Predicate Logic
The most important knowledge representation language is arguably predicate logic (or strictly, first-order predicate logic - there are lots of other logics out there to distinguish it from). Predicate logic allows us to represent fairly complex facts about the world, and to derive new facts in a way that guarantees that, if the initial facts are true, then so are the conclusions. It is a well-understood formal language, with well-defined syntax, semantics and rules of inference.
Review of Propositional Logic
Predicate logic is a development of propositional logic, which should be familiar to you. In propositional logic a fact such as ``Alison likes waffles'' would be represented as a simple atomic proposition. Let's call it P. We can build up more complex expressions (sentences) by combining atomic propositions with the logical connectives ∧, ∨, ¬, → and ↔. So if we had the proposition Q representing the fact ``Alison eats waffles'' we could have the facts:
P ∨ Q : ``Alison likes waffles or Alison eats waffles''
P ∧ Q : ``Alison likes waffles and Alison eats waffles''
¬Q : ``Alison doesn't eat waffles''
P → Q : ``If Alison likes waffles then Alison eats waffles''.
In general, if X and Y are sentences in propositional logic, then so are X ∨ Y, X ∧ Y, ¬X, X → Y, and X ↔ Y. So the following are valid sentences in the logic:
P → (P ∨ Q)
(Q ∧ R) → P
Propositions can be true or false in the world. An interpretation function assigns, to each proposition, a truth value (i.e., true or false). This interpretation function says what is true in the world. We can determine the truth value of arbitrary sentences using truth tables, which define the truth values of sentences with logical connectives in terms of the truth values of their component sentences. The truth tables provide a simple semantics for expressions in propositional logic. As sentences can only be true or false, truth tables are very simple, for example:
X | Y | ¬X | X ∧ Y | X ∨ Y
T | T | F  | T     | T
T | F | F  | F     | T
F | T | T  | F     | T
F | F | T  | F     | F
In order to infer new facts in a logic we need to apply inference rules. The semantics of the logic will define which inference rules are universally valid. One useful inference rule is the following (called modus ponens) but many others are possible:
a, a → b ⊢ b
This rule just says that if a → b is true, and a is true, then b is necessarily true. We could prove that this rule is valid using truth tables.
Predicate Logic: Syntax
The trouble with propositional logic is that it is not possible to write general statements in it, such as ``Alison eats everything that she likes''. We'd have to have lots of rules, for every different thing that Alison liked. Predicate logic makes such general statements possible.
Sentences in predicate calculus are built up from atomic sentences (not to be confused with Prolog atoms). Atomic sentences consist of a predicate name followed by a number of arguments. These arguments may be any term. Terms may be:
Constant symbols
such as ``alison''.
Variable symbols
such as ``X''. For consistency with Prolog we'll use capital letters to denote variables.
Function expressions
such as ``father(alison)''. Function expressions consist of a functor followed by a number of arguments, which can be arbitrary terms.
This should all seem familiar from our description of Prolog syntax. However, although Prolog is based on predicate logic the way we represent things is slightly different, so the two should not be confused.
So, atomic sentences in predicate logic include the following:
friends(alison, richard)
friends(father(fred), father(joe))
likes(X, richard)
Sentences in predicate logic are constructed (much as in propositional logic) by combining atomic sentences with logical connectives, so the following are all sentences in predicate calculus:
friends(alison, richard) → likes(alison, richard)
likes(alison, richard) ∨ likes(alison, waffles)
((likes(alison, richard) ∨ likes(alison, waffles)) ∧ ¬likes(alison, waffles)) → likes(alison, richard)
Sentences can also be formed using quantifiers to indicate how any variables in the sentence are to be treated. The two quantifiers in predicate logic are ∀ and ∃, so the following are valid sentences:
∃X (bird(X) ∧ ¬flies(X))
i.e., there exists some bird that doesn't fly.
∀X (person(X) → ∃Y loves(X, Y))
i.e., every person has something that they love.
A sentence should have all its variables quantified. So strictly, an expression like ``∀X loves(X, Y)'', though a well-formed formula of predicate logic, is not a sentence. Formulae with all their variables quantified are also called closed formulae.
Predicate Logic: Semantics
The semantics of predicate logic is defined (as in propositional logic) in terms of the truth values of sentences. Like in propositional logic, we can determine the truth value of any sentence in predicate calculus if we know the truth values of the basic components of that sentence. An interpretation function defines the basic meanings/truth values of the basic components, given some domain of objects that we are concerned with.
In propositional logic we saw that this interpretation function was very simple, just assigning truth values to propositions. However, in predicate calculus we have to deal with predicates, variables and quantifiers, so things get much more complex.
Predicates are dealt with in the following way. If we have, say, a predicate P with 2 arguments, then the meaning of that predicate is defined in terms of a mapping from all possible pairs of objects in the domain to a truth value. So, suppose we have a domain with just three objects in: fred, jim and joe. We can define the meaning of the predicate father in terms of all the pairs of objects for which the father relationship is true - say fred and jim.
The meaning of ∀ and ∃ is defined again in terms of the set of objects in the domain. ∀X S means that for every object X in the domain, S is true. ∃X S means that for some object X in the domain, S is true. So, ∀X father(fred, X), given our world (domain) of 3 objects (fred, jim, joe), would only be true if father(fred, X) was true for each object. In our interpretation of the father relation this only holds for X=jim, so the whole quantified expression will be false in this interpretation.
This only gives a flavour of how we can give a semantics to expressions in predicate logic. The details are best left to logicians. The important thing is that everything is very precisely defined, so if we use predicate logic we should know exactly where we are and what inferences are valid.
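Over a finite domain this semantics is directly executable: ∀ becomes a conjunction over the domain and ∃ a disjunction. A small sketch of the fred/jim/joe example, assuming (as the text does) an interpretation in which father holds only of the pair (fred, jim):

```python
# domain and interpretation from the text: three objects, and the
# father relation true only of the pair (fred, jim)
domain = ["fred", "jim", "joe"]
father_pairs = {("fred", "jim")}

def father(x, y):
    return (x, y) in father_pairs

# "forall X . father(fred, X)": a conjunction over the domain.
# False, since it fails for X=fred and X=joe.
forall = all(father("fred", x) for x in domain)   # False

# "exists X . father(fred, X)": a disjunction over the domain.
# True, since it holds for X=jim.
exists = any(father("fred", x) for x in domain)   # True
```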
1. Consider the following three statements:
• Nobody, who really appreciates Beethoven, fails to keep silence while
the Moonlight Sonata is being played.
• Guinea pigs are hopelessly ignorant of music.
• No one, who is hopelessly ignorant of music, ever keeps silence while
the Moonlight Sonata is being played.
(a) Translate the three statements above into first order predicate logic,
using the predicates:
A(x) to denote that x is a guinea pig;
B(x) to denote that x is hopelessly ignorant of music;
C(x) to denote that x keeps silent while the Moonlight Sonata is being played; and
D(x) to denote that x really appreciates Beethoven.
Sample Solution:
• ∀x(D(x) → C(x)) — or — ∀x(¬C(x) → ¬D(x))
• ∀x(A(x) → B(x))
• ∀x(C(x) → ¬B(x)) — or — ∀x(B(x) → ¬C(x))
(b) Convert the logical statements from Question 1a into clausal normal form.
Sample Solution:
{{¬D(x), C(x)}, {¬A(x), B(x)}, {¬C(x), ¬B(x)}}
(c) Use resolution to show that the statements in Question 1b imply the
conclusion Guinea pigs never really appreciate Beethoven.
Sample Solution:
Conclusion: ∀x(A(x) → ¬D(x)). Negate and convert to clausal form, yielding two clauses: {A(c)}, {D(c)}, where c is a new Skolem constant. Then apply resolution:

Resolve {D(c)} with {¬D(x), C(x)}, giving {C(c)}.
Resolve {C(c)} with {¬C(x), ¬B(x)}, giving {¬B(c)}.
Resolve {¬B(c)} with {¬A(x), B(x)}, giving {¬A(c)}.
Resolve {¬A(c)} with {A(c)}, giving the empty clause, a contradiction.
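This refutation can be checked mechanically at the ground level: instantiate every clause with the Skolem constant c, so each literal becomes propositional, then saturate under the resolution rule until the empty clause appears. The "~" prefix encoding of negation is an assumption of this sketch:

```python
def resolve(c1, c2):
    # all resolvents of two clauses (frozensets of literal strings)
    neg = lambda l: l[1:] if l.startswith("~") else "~" + l
    return [frozenset((c1 - {l}) | (c2 - {neg(l)}))
            for l in c1 if neg(l) in c2]

# ground instances of the clauses from part (b), plus the negated conclusion
clauses = {frozenset(c) for c in (
    {"~D", "C"},    # appreciating Beethoven implies keeping silent
    {"~A", "B"},    # guinea pigs are hopelessly ignorant of music
    {"~C", "~B"},   # the hopelessly ignorant do not keep silent
    {"A"}, {"D"},   # negated conclusion: A(c) and D(c)
)}

# saturate: keep adding resolvents until nothing new appears
new = True
while new:
    new = False
    for c1 in list(clauses):
        for c2 in list(clauses):
            for r in resolve(c1, c2):
                if r not in clauses:
                    clauses.add(r)
                    new = True

frozenset() in clauses  # True: the empty clause was derived
```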

2. Find the most general unifier for these pairs of terms, or indicate that they
are not unifiable. w, x, y, and z are variables, and a and b are constants.
(a) f(x, x, b) f(a, z, z)
Sample Solution: not unifiable
(b) f(y, g(x), x) f(z, z, a)
Sample Solution: {a/x, g(a)/y, g(a)/z} Note {a/x, g(x)/y, y/z} is
not correct.
(c) f(x, x, a) f(a, z)
Sample Solution: not unifiable
(d) h(f(x), g(y), y) h(z, g(z), w)
Sample Solution: {f(x)/z, f(x)/y, f(x)/w}
(e) f(x, g(x)) f(z, z)
Sample Solution: not unifiable (occurs check)
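A small unification procedure with an occurs check reproduces these answers. This is an illustrative sketch, not Prolog's actual implementation (which omits the occurs check by default); terms are encoded as tuples like ("f", "x", ("g", "x")), and the set VARS marks which symbols are variables:

```python
VARS = {"w", "x", "y", "z"}  # the exercise's variables; a, b are constants

def walk(term, s):
    # follow variable bindings in substitution s
    while isinstance(term, str) and term in s:
        term = s[term]
    return term

def occurs(var, term, s):
    # occurs check: does var appear inside term (under s)?
    term = walk(term, s)
    if term == var:
        return True
    if isinstance(term, tuple):
        return any(occurs(var, t, s) for t in term[1:])
    return False

def unify(t1, t2, s=None):
    """Return a most general unifier (dict var -> term), or None."""
    s = {} if s is None else s
    t1, t2 = walk(t1, s), walk(t2, s)
    if t1 == t2:
        return s
    if isinstance(t1, str) and t1 in VARS:
        return None if occurs(t1, t2, s) else {**s, t1: t2}
    if isinstance(t2, str) and t2 in VARS:
        return None if occurs(t2, t1, s) else {**s, t2: t1}
    if (isinstance(t1, tuple) and isinstance(t2, tuple)
            and t1[0] == t2[0] and len(t1) == len(t2)):
        for a, b in zip(t1[1:], t2[1:]):
            s = unify(a, b, s)
            if s is None:
                return None
        return s
    return None

# exercise (a): f(x, x, b) vs f(a, z, z) -- not unifiable
unify(("f", "x", "x", "b"), ("f", "a", "z", "z"))   # None
# exercise (e): f(x, g(x)) vs f(z, z) -- fails the occurs check
unify(("f", "x", ("g", "x")), ("f", "z", "z"))      # None
```

The substitutions returned are in triangular form (bindings may mention other bound variables), so they may need to be walked out to match the fully-applied form given in the sample solutions.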
3. Convert to predicate logic, then clausal form, and determine if the argument is correct, or build a counter example (interpretation):
There is at least one exercise in this set which is too hard. Of all exercises,
if they are in this set then someone can do them, since the answers have
already been worked out. If anyone can do an exercise, then it is not too
hard. Hence all exercises that are too hard should be left to later.
E(x): x is an exercise
I(x): x is in this set
H(x): x is too hard
P(x): x is a person
L(x): x should be left to later
C(x, y): x can do y

Sample Solution:
∃x (E(x) ∧ I(x) ∧ H(x))
∀x (E(x) → (I(x) → (∃y (P(y) ∧ C(y, x)))))
∀x ∀y (P(x) ∧ E(y) ∧ C(x, y) → ¬H(y))
Conclusion : ∀x (E(x) ∧ H(x) → L(x))
{E(a)}, {I(a)}, {H(a)}
{¬E(x), ¬I(x), P(f(x))}, {¬E(x), ¬I(x), C(f(x), x)}
{¬P(x), ¬E(y), ¬C(x, y), ¬H(y)}
Negated conclusion : {E(b)}, {H(b)}, {¬L(b)}
Renaming variables to use different variable names in different clauses (to
avoid confusion), resolution proceeds as follows:

Resolve {¬P(x), ¬E(y), ¬C(x, y), ¬H(y)} with {¬E(w), ¬I(w), P(f(w))} on P, giving {¬E(w), ¬I(w), ¬E(y), ¬C(f(w), y), ¬H(y)}.
Resolve that with {¬E(z), ¬I(z), C(f(z), z)} on C, giving {¬E(w), ¬I(w), ¬H(w)}.
Resolve with {E(a)}, {I(a)}, and {H(a)} in turn, giving the empty clause.
Valid, since the hypotheses are inconsistent (note that resolution didn't use the negated conclusion at all in this case).
