JUSTIN TEITZEL
Alan Bundy Essay
CP3210 Assignment 1
Abstract
This essay on Alan Bundy and how he has contributed to the area of automatic theorem proving within the field of AI. He has made many contributions, great and small , to the science fields of Mathematics and AI. Theorem provers including - PRESS, Clam, Mollusc Oyster, Mecho and Barnacle have all had the Bundy touch. The contributions of Alan Bundy to the automated reasoning field of artificial intelligence are too numerous to list. Therefore, the most important five have been chosen which are Proof Planning, Theorem Provers, Rippling : (A Heuristic for Guiding Inductive Proofs), Inference Calculus and the founding of the DReaM research group (Discovery and REAsoning in Mathematics).
Introduction
Alan Bundy is currently a professor of automated reasoning at the department of artificial intelligence at the university of Edinburgh. He was born on the 10th of may 1947 in Isleworth, Middlesex, England. He gained a bachelor of science honors first class in mathematics from the university of Leicester in 1968 and completed his PhD in mathematical logic in 1971, also from the university of Leicester.
Alan Bundy's positions held with the Dept of AI at the university of Edinburgh have been numerous. He gained a research fellow ship with the University of Edinburgh after the completion of his PhD in 1971. Prof. Bundy held that position until he was promoted to lecturer in 1974, a position which he held, for the period of ten years until 1984. He was a reader for the period spanning from 1984 until 1987. From 1987 until 1990 professor Bundy held a professorial fellowship at the university of Edinburgh while he also held a SERC senior fellowship form 1987 until 1992.He was then promoted to the title professor of automated reasoning which he has held ever since.
Since Professor Bundy started work at the University of Edinburgh, the department of AI has under gone two name changes. Initially it was the Metamathematics Unit, which in 1972 became the Department of Computational Logic, and in 1974 was absorbed into the Department of Artificial intelligence.
The contributions made by Professor Bundy to science have been
dominated by research in automated reasoning and the ideas
surrounding it. Automatic Reasoning is the mathematical reasoning
is applied to design computer programs to automatically prove a
theorem. One of the main research ideas involved in this field is
proof planning.
Proof Planning
Mathematical theorems come from a range of different families. They may have different mathematical ways of being proved as a theorem (mathematical induction etc.) however the general requirements to gain a proof remain the same. Proofs require a search through the infinite ways rules can be applied while trying to prove a theorem.
Using automated reasoning, proofs can be arrived at by one of two different ways. Using logical rules and applying them to the axioms to get the theorem, is one method. Another method is assuming the theorem and applying the rules to it backwards to break it down and hopefully arrive at axioms. Each method has it's pros and cons and may be the only one able to be used in a certain circumstance.
The problem with automated theorem proving is the combinational explosion. The combinational explosion problem occurs when there is more than one choice can be taken while trying to prove a theorem. A search must be used to try all choices, known as an exhaustive search, which can lead to problems. The reason for these problems can be quoted as [1] "The number of intermediate expressions we must generate grows super-exponentially with the length of the desired proof ". In essence this means the exhaustive search is not feasible due to the time and storage required to prove a theorem.
There are two levels of inferencing in a theorem prover. There is the object level which contains domain knowledge and a meta level which contains strategic control knowledge. When conducting the search if it is done at the meta level ( use a proof plan ) instead of the object level it reduces the exponential complexity due to the meta level search space having a smaller branching rate. That is where proof plans come in. The three parts of a proof plan combine to give the basic description of the meta level so the search can then be conducted within the meta level.
Proof plans are used to guide the search through all possible paths when searching for a proof in automatic theorem proving. Theorem provers involve using the methods from proof plan as a specification of tactics available to solve the problem. The specification is recorded in sorted meta logic which makes it easier to reason through the theorems to be proved and the methods that can be used to prove them. Alan Bundy's proof planning consists of three kinds of object: Tactics, Methods and Critics, for the conversion to the meta level. In proof plans, the meta level is used to identify which tactics are best suited to the problem at hand.
Tactics can best be explained as the justification of a step in a proof there are several types of tactics simple tactics and composite tactics. A simple tactic might only contain one rule of inference. A composite tactic contains several simple tactics and can be the whole proof. Tactics are programs which construct part of a proof by using rules of inference.
The next object involved in the planning process is methods. They can be split into two areas preconditions, which describe how and when the tactics can be used and effects which describes what effect the tactic has on the logical expressions it manipulates. These are also know as the syntactic properties of the manipulated logical expressions and are expressed in a meta-logic.
The third idea involved in proof planning is critics. They identify any patterns in the methods which failed. They also suggest how to correct the partial proof. They are similar in the structure of methods except instead of telling you where a tactic can be used they tell you where it fails and instead of the effect it had on the proof, it suggest how to remedy the proof plan.
A proof plan [1]"captures the common patterns of
reasoning in families of similar proofs". This quote leads
to the realization that different types theorems of mathematics
require different ways of coming up with a proof for the theorem.
An example of this is the use rippling in constructing the proof
plan for inductive proofs.
Rippling
Rippling is a technique that is used to further enhance proof plans, to encompass inductive proofs into proof planning. Rippling, as described by Alan Bundy in research paper 567, is [2]" A tactic for the heuristic control of the key part of proofs by mathematical induction. To understand rippling the idea behind Mathematical Induction must understood.
Mathematical induction is a method of mathematics to prove a theorem . A base case, S1, is stated for which the investigated theorem can be proved. Then you prove that S(k + 1) is true whenever Sk is true. So the proof is broken into two parts, a base case and a number of step cases, and the idea is to prove the step cases are true from the base case. This is the main idea behind induction which has been used by mathematicians for hundreds of years.
The base case which is used in rippling is called the
induction hypothesis and the step cases are known as the
induction conclusion. Wave-fronts are the places where the
induction hypothesis differs from the induction conclusion. These
wave-fronts for each step case can be automatically calculated
through the comparing of the hypothesis and conclusion. These
differences(wave fronts) are then moved out of the way using
wave-rules, which are induction rewrite rules. The rewritten
induction conclusion is the fertilized with the induction
hypothesis. The use of rippling involves little or no search.
Theorem Provers. (PRESS, Oyster and Clam etc.)
The first theorem Bundy implemented was a theorem prover for the axiomatic theory of arithmetic, which was the thesis for his PhD in mathematics. At this stage, however, no proof planning was used in the proving of the theorems, it had not even been though of. The idea of using mathematical logic, proof plans and heuristics, as a basis to gain a deeper understanding of theorem proving and aspects in AI, later occurred to Bundy.
This lead to the development of PRESS (PRolog Equation Solving System) which is one of the first automatic theorem provers constructed by Alan Bundy. It is the first theorem prover which used his proof planning ideas of meta knowledge and the meta level. The ideas used in this program were an important area which has since been expanded upon.
The Clam, Oyster programs can be considered a pair in
automated reasoning. Clam is a proof planning program and the
Oyster program is a proof editor . They interact with each other
through the concept of tactics. The planning phase in Clam helps
reduce the search phase in Oyster. Clam is used to construct a
proof plan that is relevant to the type of proof which is being
found. As with the specification of the proof planning process,
the domain knowledge of the theorem from the object level is
converted to strategic control knowledge through the use of
tactics, methods and critics. The use of this strategic control
knowledge by the Oyster to reduce the search. Advancement in
expert systems has been enormous since Alan Bundy's ideas, meta
level inference and proof planning, have applied to the structure
of some of these programs.
Incidence Calculus
The invention Incidence Calculus, which is used as a logic for uncertain reasoning, can be attributed to Alan Bundy and the DReaM research group. [3] "In incidence calculus, inferences are usually made by calculating incidences sets and probabilities of formulae based on a given incidence function in an incidence calculus theory". Inference calculus is used when there is an uncertainty in a proof. It involves the use of two sets , a set containing the propositions and a set of all possible worlds with a probability distribution on them. The set containing the propositions is associated through a set of possible worlds.
DReaM Research Group
Alan Bundy is also the founder of the DReaM research group (Discovery and REAsoning in Mathematics). Through the invention and use of programs such as PRESS and Oyster and Clam, as discussed previously, the DReaM research group has been able to find a way to explicitly represent search control information as a theory constructed of axioms. This enables the search control information to be used during deductive inference. This is also known as meta level inference.
The DReaM research group has also focused on areas of mathematical reasoning other than deduction and has contributed several programs and new ideas to these areas. One of these areas is finding formal representation of an informally stated problem. The contributed software are the programs Mecho, Eco, LP and Extended-EBG while the new ideas generated are incidence calculus and generating new proofs from the proofs of old theorems.
Related Research
Under the university of Edinburgh DReaM group a lot of computer scientists have followed Bundy's research. From this group, many people have also had the chance to work with Professor Bundy on numerous research projects. A member of the DReaM group, Raul Monroy-Borja, who is currently a PhD student at the University of Edinburgh, has used Alan Bundy's work on proof plans as a basis for a research interest. His research interest is [4]"The use of proof plans to exploit any failure or partial success in the search for a proof, especially for the detection, location, and correction of faults in mathematical conjectures , and formal specifications".
Conclusion
This essay has given a brief history of Professor Alan Bundy and his contributions to Artificial Intelligence. With the best part of his life being spent in research of the AI field of automated reasoning, Professor Bundy has contributed many experimental ideas to the field. The most important of these contributions being listed and described as Proof Planning, PRESS - Clam and Oyster , Rippling, Incidence Calculus and the founding of the Artificial Intelligence research group Dream. All of these ideas and programs being solid basis' for further research into automated reasoning.
Alan Bundy's research has also sparked interest into up and coming computer science students who wish to expand on his work. One student in particular , Raul Monroy-Borja , has an interest in extending Bundy's research on Proof planning. The thesis of his research being [4]"The use of proof plans to exploit any failure or partial success in the search for a proof, especially for the detection, location, and correction of faults in mathematical conjectures , and formal specifications". This is just one of the many students and academics in the field of computer science that have an active interest in the work of Professor Alan Bundy.
References
[1] A. Bundy - Proof Planning, ftp://dream.dai.ed.ac.uk/pub/papers/PS/proof-plans.ps
[2] A. Bundy, A. Stevens, F. van Harmelen, A. Ireland, and A. Smaill. Rippling: A heuristic for guiding inductive proofs. Artificial Intelligence, 62:185-253, 1993. Also available from Edinburgh as DAI Research Paper No. 567.
[3] Liu, W., A. Bundy and D. Robertson (1993a) Recovering
incidence functions. In Proc. of the 2nd
European conference on symbolic and quantitative approaches to
reasoning and uncertainty,
Granada, November 1993, Spain. Lecture Notes in Computer Science
747, pp 241-248, Springer.
Full paper is available as Department Research Paper 648, Dept.
of AI, Univ. of Edinburgh.
[4] http://dream.dai.ed.ac.uk/group/raulm/raulm.html
[5] A. Bundy, F. van Harmelen, J. Hesketh, and A. Smaill.
Experiments with proof plans for induction.
Journal of Automated Reasoning, 7:303-324, 1991. Earlier version
available from Edinburgh as
DAI Research Paper No 413.
[6] W. Liu, A. Bundy, and D. Robertson. On the relations
between incidence calculus and ATMS. In
European Conference on Symbolic and Quantitative Approaches to
Reasoning and Uncertainty,
Lecture notes in Computer Science 747, pages 249-256. Springer,
1993. Also available as Dept
Research Paper 611 and appeared in the IJCAI-93 workshop on
management of uncertainty.
[7] F. van Harmelen. The CLAM proof planner, user manual and
programmer manual: version 1.4.
Technical Paper TP-4, DAI, 1989.
[8] A. Bundy, F. van Harmelen, C. Horn, and A. Smaill. The
Oyster-Clam system. In M.E. Stickel,
editor, 10th International Conference on Automated Deduction,
pages 647-648. Springer-Verlag, 1990. Lecture Notes in Artificial
Intelligence No.449. Also available from Edinburgh as DAI
Research Paper 507.
[9] A.Bundy Curriculum Vitae available from
http://www.dai.ed.ac.uk/daidb/people/homes/bundy/
Student: Aaron Nielsen
Subject: CP3210 - Fundamentals of Artificial Intelligence
Due Date: Friday, 27 March 1998
Lecturer: Geoff Sutcliffe
Department of Computer Science, JCUNQ
Table of Contents:
Over the course of his long and illustrious career in research, the work of John Henry Holland (circa. 1921 - ) has appeared in a number of different contexts. His graduate dissertation was in physics at the Massachusetts Institute of Techn
ology (MIT), performing numerical analysis using the then-classified "Whirlwind, the first so-called real-time computer" [1], and his rare experience led him directly to IBM to help design their first large-scale computer
. In 1958, at the University of Michigan, he abandoned his Ph.D. in mathematics for one in computer science, which he obtained a year later and which is still believed to be the first of its kind to be awarded in the United States. Since then, his resea
rch has led him to the more abstract fields of philosophy, psychology and biology, though he has alway maintained the basis that he established for himself in mathematics and computing. He currently holds a number of titles, including two appointments in
different departments at the University of Michigan, the chair of the Santa Fe Institute Steering Committee (alongside Murray Gell-Mann), and a fellowship of the World Economic Forum.
This may seem to be a complicated combination, but only because it is, and impossibly so. Holland’s grand scheme of artificial intelligence goes far beyond machines that can simply learn: his work is leading to computer-based agents that evolve over tim e, that can truly experience their surroundings, that can learn from their mistakes and that can, ultimately, produce offspring of the same type and with similar properties to its parents. A more adequate description of the fruits of Holland’s labour wou ld be the term artificial life - but without meaning the creation of living and breathing mechanical organisms as such. Artificial life, as Holland sees it, is moreover a field which brings a limitless number of facets of everyday behaviour to a s ingle common denominator and studies its dynamics at a more general level.
2. Holland’s contributions to research into artificial intelligence
During his work at IBM, Holland teamed up with Arthur Samuel, who was also interested in the prospect of machine learning. It was allegedly at this time that Holland saw the value of parallel computing and dividing complicated tasks
into smaller and more digestible portions. His spare-time project at IBM resulted in the idea of neural networks, which attempt to simulate pattern recognition in a living brain. [2]
Much of Holland’s work has been directed towards an approach to artificial life which he has called genetic algorithms, which allow objects in a computer system to evolve and adapt to changes around them. Holland invented genetic algorithms and ha s since developed various sub-families of genetic systems: he created adaptive systems, in which individual objects (or agents) can combine with each other to produce offspring and to simulate evolution; he took the idea further and came up with classifier systems, which attempt to experience their environment and learn from their mistakes. His next ambition in the field is to build a seed machine to simulate the growth and reproduction of a single organism, in the same way t hat a plant grows from a seed, and still taking adaptation into account with each generation.
Holland is accredited with the notion of parallel computing, whereby a number of individual systems and/or sub-systems work together to solve a given problem. The principle of parallel computing is fairly simple: if a problem can be divided into
multiple distinct parts, give each of these parts to a different computer to solve, solve the parts simultaneously and have the results coalesce at the end. Parallel computation was an exciting idea, but it does little to change the face of the problem a
t hand, beyond speeding things up.
Parallel computing led in turn to the invention of neural networks, in which the individual nodes in the network work more closely with each other. Neural networks are an attempt to simulate what goes on in the human brain, an impossibly co
mplex connection of receptive nerves (or neurones) which react to the environment in different ways. Patterns are stored as they are sensed, and since similar experiences produce similar stimuli in the neurones, both collectively and individually,
cognitive systems based on neural networks are very good at learning and (later) recognising patterns. For example, neural networks have been developed to recognise photographs of people and to discern individual objects in a visual scene or landscape p
hotograph.
Neural networks pose a number of problems for the AI traditionalists: the hardware is still expensive; the way in which neural networks learn and perceive these patterns is very abstract; recognition can be inaccurate if the system is not taught as compl
etely as possible; and learning can only take place within specific domains of knowledge. A neural network can be taught to recognise a picture of a tree, given a number of examples of pictures of trees, but it will not necessarily recognise a species of
tree that it has not seen before. According to Holland, this lack of understanding of neural networks is due in part to a lack of understanding of human perception - and while the brain itself can be "rebuilt", in a way, its procedures (i.e. how it actu
ally learns and remembers) are more of a mystery. Until mankind understands the human learning process better, and until neural networks can be produced en masse, neural networks are likely to remain at a standstill, or at best, to advance in small and u
neasy steps. [3]
The initial result of this was a new invention: genetic algorithms. Genetic algorithms are based around a number of individual agents that try to solve a certain problem. Each agent possesses a number of characteristics and parameters with which
to attempt to solve the problem. At first, the agents attack the given problem in a very "trial and error" manner, but as time passes and as more guesses are made, the guesses draw closer and closer to a correct answer. The difference between the first
"genetic" programs and sheer random guessing was that the more capable (or "fitter") agents could combine and reproduce, spawning a new generation of problem-solving agents that would grow up, as it were, to be even more able to solve the problem than th
eir parents.
The building blocks of each agent were the initial parameters that it was given, much like the DNA that builds living beings. Of course, the problem-solving agents were much simpler than living creatures, and these building blocks were simply stored as a
short string of characters. [4] Furthermore, while the less "fit" agents would continue to fumble their way through the problem, the process of reproduction was only undertaken by the fittest agents. This way, success
ive generations of agents would emulate the biological concept of natural selection and the fitter agents would bring the system ever closer to a solution. Even at this early stage, the principles of genetic algorithms allowed for some degree of r
andomness in the characteristics of the offspring, generating individuality (including mutation, on rare occasions) within each generation.
The results obtained at the Santa Fe Institute have been both fascinating and scary, exhibiting many patterns evident in human society. For example, agents of similar inclinations can flock together, establishing larger communities within which smaller g
roups of agents can specialise further, or forming a heterogeneous collective that can handle a number of different tasks. On the other hand, a certain resource may only be found in small quantities, but may be required for the purpose of reproduction, c
reating an "arms race" of sorts for the rare resources which may even result in conflict en masse.
At its heart, Echo is a simple idea, but it can lead to some very abstract, but nonetheless powerful, representations of problems in various fields: the adaptive system model can be used to simulate the distribution of resources within a city or a
mong cities, or the spread of a disease among the cells of a living organism. It highlights patterns in behaviour within a community at a very general level, without bogging the system down with specifics or fine details. Holland published a number of p
apers leading up to the finished product, including Adaptation in Natural and Artificial Systems, published by MIT press in 1992. The implementation of Echo discussed by the Santa Fe Institute in general is closest to Holland’s descriptions
in Proposal for a Research Program in Adaptive Computation, also
published in 1992.
The invention of the classifier system is perhaps Holland’s reaction to the traditional "expert" system. An expert system is a knowledge-based computer program or system that is designed to know about, and answer questions about, a certain area of
expertise, such as medical diagnostics. Holland sees that classifier systems could easily provide a better alternative to knowledge-based expert systems, as the latter are rather inflexible and not built for change, nor necessarily for new knowledge. W
ith genetic algorithms at its core, a classifier system would always be ready and willing to accept new knowledge and even new approaches to learning, whether they arise from the logical consequences of its existing knowledge or from new experiences.
Although this is a common biological process, required for sexual and asexual reproduction alike, biology has yet to find a good explanation for the process. According to Holland, the problem is merely one of combining other ideas: an initial cell; some
means that not only allows adhesion for subsequent cells, but also some sort of structure or layering; and some resultant stage at which the organism can produce another initial seed.
At the time of print, the seed machine has not yet been created. Holland believes that the Echo framework is already powerful enough for him to build a seed machine, although such a result would be some years away.
Because of the complicated and vague nature of genetic algorithms and other "artificial life" concepts, it is difficult to tell at this stage where the theories of Holland and others will lead. However, by the same token, it is difficult to tell w
here they will not lead - the idea of dynamic, adaptive and evolutionary computer systems has many applications in all fields of study. The Echo computer system provides a good example of how evolution takes place, based on Holland’s concept of complex adaptive systems, while he has worked with others at the Santa Fe Institute to simulate stock brokers in an artificial stock market. [6] Holland expects that a number of very r
eal-world problems, such as economics and disease, will remain largely problematic without a greater understanding of their fundamentals - for which his ideas provide a unique tool.
The veritable father of artificial life, Holland has brought his grand scheme of things a long way. His insights have shed an entirely new and interesting light on a number of issues, both in the real world and in theory, that people simply hadn’t
thought about in the same way before. Holland has spent his life teaching computers about complex issues in simple terms - a lesson that he could easily have taught to human beings directly.
Of course, he still has a lot more ground to cover and a lot more work to do - and how far his work will go from here is anybody’s guess. But that, as they say, is life.
2.1. Parallel computing - first steps into AI
During the course of his work at IBM in the 1950s, Holland took his first steps into the field of machine learning, which is very much a fundamental criterion for evaluating an artificially intelligent agent. Indeed, a computer system could hardly
be deemed intelligent if it could not learn from experience. For Holland, this was an opportunity to combine his interests in biology and psychology to his more solid background in mathematics and computer science and to attempt to put some of them into
some sort of practice.
2.2. Genetic algorithms - a new approach
Most of Holland’s work after his stint at IBM was directed towards the process of thought, rather than on the structure of the brain. He had read the work of Donald Hebb and R. A. Fisher, from which he set out to put his theories of learnin
g into practice, but rather than trying to build an electronic brain, he wanted to study thought and learning and to implement his studies on a computer.
2.3. Complex adaptive systems - simulating life itself
In 1962, Holland published his Outline of a Logical Theory of Adaptive Systems, the first published paper to link the active process of thinking with the gradual process of evolution. In keeping with his previous ideas about genetic systems, he co
ined the term complex adaptive system to describe an active and perpetual system in which problems are solved. Agents could still interact with their environment, i.e. with a given problem, and could still interact with each other to produce more
agents. Now, though, the level of interaction was not restricted to these two tasks alone: the next step was to try to make the logical agents more complicated, more individual, more like people.
The prospect of complex adaptive systems was put into practice at the Santa Fe Institute, whose research resulted in a computer program called Echo. The Echo system allows for some interesting possibilities for agents: the virtual world wi
th which the agents interact is divided into localities (or sites); an agent can draw resources from these sites, with which they can reproduce asexually; and a set of rules decide whether agents meet to trade resources, to fight over resources or
to reproduce sexually. From fairly simple and fundamental beginnings, the system is left to act over time to observe the patterns of "artificial life".
2.4. Classifier systems - teaching a computer (not) to make mistakes
Even adaptive systems have their limits, according to Holland - while they have their own rules for adaptation and evolution, these rules are fixed and dictate the fashion in which the system, or the individual components thereof, can act. Holland has al
ready taken the idea of adaptive systems one step further, to reveal an even more ominous beast known as a classifier system. In a way, a classifier system is a special case of the complex adaptive system, where the characteristics of each agent a
re, in fact, the rules that govern actions, interactions and learning, thereby adding another layer of complexity to the system. Different rules and data work on different aspects of the question at hand, and as per an adaptive system, the weaker knowled
ge succumbs to natural selection and competition, only resurfacing when the newer and/or seemingly more apt knowledge fails and the system is "mistaken" at first.
2.5. Seed machines - biological growth
The next step for Holland is to produce what he calls a seed machine, which models the growth of a single organism. As the seed grows, it produces another independent cell - another agent within the system, essentially - that works in conju
nction with the first cell to produce two more cells, and so on. This process can be likened to a fertilised egg cell that eventually grows into a new-born child and then into an adult, which in turn aims to create instances of the initial seed to start
the process over.
3. Where Holland’s work is leading
While parallel computing stands as a powerful tool for solving complex problems, it is a relatively untouched area of computing. Most personal computers are built around a single microprocessor, and use a multi-tasking operating system to control
multiple applications at any one time. Because of this, parallel computers remain an expensive alternative, and as a result, they have yet to take off in the mass market. Many supercomputers use multiple processors to great effect: the Cray Y-MP C90 su
percomputer has sixteen processors working in parallel for vector calculations; Intel’s Paragon architecture can use thousands of individual Pentium chips. However, the advent of neural networks has given a new lease of life to parallel computing
and to the study of artificial intelligence in general. Studies into neural networks have undergone somewhat of a resurgence in recent years, because of the correspondence between their structure and that of the human brain and the falling cost of comput
er hardware.
4. In conclusion
At 66 years of age, and having achieved so much, one would expect retirement to be on the agenda for John Holland. While this is not yet so, it has convinced him to abandon most of his teaching duties at Michigan, and to concentrate his efforts mo
reover on recording as many of his ideas as possible for permanent records.
Footnotes
5. References
© Aaron Nielsen 18.iv.1998
Last Modified 20.iv.1998 22:56
|
John McCarthy was born on the fourth of September 1927, in Boston, Massachusetts. He attended Belmont High School, Los Angeles, graduating in 1943. McCarthy studied a Bachelor of Science (majoring in mathematics) at the California Institute of Technology, 1943. Finally he attained a Doctor of Philosophy (majoring in mathematics) at Princeton University, 1951. John McCarthy began his interest in Artificial Intelligence (AI) in 1948. Since then he has contributed in many unique ways to the study of AI. He is even credited with coining the term Artificial Intelligence in 1955 (McCarthy 1998). Today McCarthy is a Professor of Computer Science working for Stanford University. |
![]() |
Fields of Contribution
During John McCarthy’s fifty years of research into artificial intelligence, he has contributed much to the new science of mind.
John McCarthy’s main contribution and research has been in the field of formalisation of common-sense knowledge. That is, ways of giving computers (or intelligent agents) the "common-sense" knowledge that all humans are assumed to possess.
McCarthy also contributed to the field of computational reasoning – the ability of computers to ‘reason’ (or rationally construct knowledge) in order to make decisions.
John McCarthy has also contributed to computer science in areas other than artificial intelligence. He developed the concept of time-sharing in the late fifties and early sixties (McCarthy 1959). He has also done a great deal of research into proving that computer programs meet their specifications.
Important Contributions
Formalising Common-Sense Knowledge:
Most of John McCarthy’s research has been focused on formalising common-sense knowledge. This is a vital contribution to artificial intelligence because it considerably increases the computer’s ability to reason and understand language – an intelligent behaviour.
The basis behind common-sense knowledge is as follows: Common-sense knowledge is knowledge that humans possess and apply readily without being aware that they even possess this knowledge. Many concepts are easy for people to understand because they have such common-sense knowledge regarding the topic. For example, seeing a child taking a wrapped box to a birthday party would invoke our common-sense knowledge of birthday parties and we would effortlessly conclude that the wrapped box was a birthday gift. Without this sort of common-sense knowledge, a computer cannot derive many simple things that even a young child could. Thus McCarthy spent much research on ways of representing common-sense knowledge in formal ways that can be easily understood and interpreted by a computer.
McCarthy’s work on formalising common-sense began essentially when he presented a paper in 1958 titled "Programs with Common Sense". The purpose of the paper was to discuss programs to manipulate common statements of logic in a ‘suitable’ formal language. The central concept behind such a programming language, as said by McCarthy in his paper, is to glean conclusions from a list of premises (or axioms).
McCarthy’s research on common-sense knowledge facilitated the giving to computers the basic knowledge that people take for granted. This allowed the computer to formally understand a larger range of situations and thus intelligently solve more problems.
LISP:
John McCarthy’s research into formalising common-sense knowledge led to his second and arguably most famous contribution to artificial intelligence – LISP. LISP is a contraction for list processor and is a specialised programming language designed to perform tasks that mimic the mental steps a human takes to solve problems.
McCarthy began developing most of the fundamental ideas behind LISP during the Dartmouth Summer Research Project on Artificial Intelligence in 1956. McCarthy spent two years developing these fundamental ideas, during which time he preliminarily implemented some of the ideas in a FORTRAN based language called FLPL. Finally from 1958 to 1962 McCarthy implemented the LISP programming language and began to apply it to artificial intelligence problems. Since the first implementation of LISP in 1962, dozens of versions of LISP have emerged from different sources. For a complete listing of the milestones in the development of LISP language variants since its inception see appendix B. Finally in 1992 an American national standard was agreed upon for an implementation of LISP to be known as Common LISP.
As said earlier, the central concept behind a programming language that could manipulate common-sense knowledge is to be able to glean conclusions from a list of premises (or axioms). This is the central idea embedded in LISP. It can derive conclusions from lists of premises, constructing new knowledge. That is why the language was called list processor (LISP). This new knowledge and derived conclusions can be used ultimately to solve complex logic and reasoning problems.
Today LISP has developed into a very powerful language that can be used to solve computationally intelligent problems. LISP has become "the computer language … most widely used in the field of artificial intelligence" (Gardner, 1985 pp.154). LISP may be superseded eventually by ELEPHANT 2000 (more on this later).
Circumscription:
Another major contribution McCarthy made to artificial intelligence is circumscription. Circumscription is a method for performing nonmonatomic reasoning. Nonmonatomic reasoning is "jumping to conclusions" when there is insufficient information. This can best be explained with an example.
Imagine the computer trying to use a boat to cross a river. If the boat is a rowboat, then the oars must be present and unbroken. Similarly the rowlocks must be present, in working order, and fit the oars, etcetera. The number of qualifications necessary for the correct operation of the rowboat can be enormous, making the rules for using the boat impossible (impractical) to apply (McCarthy 1980). However, being a human, you would often "jump to the conclusion" that the boat is fit-for-its-task unless there is something wrong with it. This is almost like saying the boat is fit for its task unless it is not (a trivial logic statement). However, this allows the computer to assume the boat (or any other object) is useable unless contrary information is provided. Using context (or a lack thereof) to form unsubstantiated but probable conclusions in this way is known as circumscription.
McCarthy first presented a paper on his circumscription method in 1980 titled "Circumscription - A Form of Nonmonotomic Reasoning". The paper detailed McCarthy’s concept of circumscription for performing nonmonatomic reasoning. Such nonmonatomic reasoning had previously been identified, but McCarthy’s conscription method was the first formal method for handling such reasoning (McCarthy 1980). Then in 1986 McCarthy presented a paper titled "Applications of Circumscription to Formalizing Common Sense Knowledge". In this new paper McCarthy presented a refined version of his circumscription formula, as well as presenting many examples of conscription applications.
Ascribing Mental Qualities to Machines:
In 1979, John McCarthy wrote a revolutionary paper titled "Ascribing Mental Qualities to Machines". While this paper provided little or no tangible contribution to artificial intelligence, it did bring about a lot of fresh thought and constructive argument in the science. The paper claimed that it is quite valid (and sometimes necessary) to ascribe certain mental qualities to machines. These qualities included beliefs, intentions, and wants (McCarthy 1979). While McCarthy notes that these qualities are usually not true of the machine, using them as a conceptual tool helps user understanding, interaction, and prediction. For example, when denied funds from an automatic teller-machine (ATM) a customer may state
|
"It thinks I don’t have enough money in my account because it doesn’t yet know about the deposit I made this morning." |
This is a controversial statement – especially because the meaning of thinks is not clearly defined. Many would argue that the statement is an anthropomorphism – a linguistic error in which human qualities are ascribed to non-human objects (McCarthy 1983). McCarthy agrees that such statements are anthropomorphisms, but argues that despite being considered by philosophers as errors, they are useful and practical in understanding machines. In the above example, the customer regards the automatic teller-machine as human and this allows the customer to easily understand why the machine refused his withdrawal. The customer understands that the machine thinks there is no money. But more importantly, this concept of the machine thinking allows the user to predict the machines behaviour by predicting what it is thinking. This also enables the user to conceive how to get around the machine’s error – in this example, to allow the machine time to realise that more money has been deposited.
McCarthy wrote a second paper on the topic titled "The Little Thoughts of Thinking Machines" in 1983. This shorter and much simplified paper was published in the Psychology Today journal and was highly popular (McCarthy 1998).
McCarthy’s paper on Ascribing Mental Qualities to Machines started a dispute over whether thermostats could be considered to have beliefs (McCarthy 1998). While immediately such debates did nothing for artificial intelligence progress, they stimulated many new discussions on artificial intelligence and fostered interest from many potential researches and students.
ELEPHANT 2000:
Perhaps one of McCarthy’s biggest contributions to artificial intelligence is yet to be fully realised. Just as the LISP programming language was a great improvement on logical problem solving, ELEPHANT is expected to be the next major step in the logic-programming field.
McCarthy has not published a paper on his proposed ELEPHANT language. He has however, published a draft proposal for the language on his internet site (McCarthy 1997). ELEPHANT has been designed for programs that interact with people directly. The proposed language will have two major aspects distinguishing it from other languages:
The idea here is that all input and output commands in the language will take the form of direct meaningful speech acts. Valid speech acts will include questions, answers, offers, acceptances, declinations, requests, permissions and promises (McCarthy 1997). For example, using the language, if you wanted to receive input you would optionally make an offer then ask a question. If the program requires something from the user, then it would make a request. In this way the ELEPHANT language emulates very formally the way that people interact and communicate in a business sense. It also creates a very flexible communication system for different programs to talk to each other – just as people do (as opposed to extremely task specific communication systems currently used for programs to talk to each other).
With all of the input and output performed via speech acts, the program will be able to interact with humans very easily, in a way that is more intuitive to the human.
So why is the new language called ELEPHANT? Well McCarthy explains it with a rhyme:
|
"I meant what I said, and I said what I meant. An elephant's faithful, one hundred percent! |
He is referring to the elephant’s memory – that is, elephants are supposed to have memory far superior to other animals and people. And the revolutionary aspect that ELEPHANT will have is the ability to remember previous states. This is against the common programming paradigms which focus on state. That is, a program is in a certain state and then it calls a function or procedure and has moved into a new state. However, the ELEPHANT language will be able to remember all previous states as well events that caused the states to change.
Now this past referencing allows for a few new intelligent behaviours. Specifically it allows the programs to handle queries such as:
|
"What if this had happened last year?" or "What if I’d made the transaction when John was born?" |
Here the language allows the program to examine what the state was at a previous time such as a year ago. Another query that could be solved via ELEPHANT would be something like:
| "How long did John work for us?" |
To solve this, the program must sum all the time intervals during which John was employed. To simply keep the current state – John is employed here or John is not employed here – would be insufficient.
In his draft paper Elephant 2000: A Programming Language Based on Speech Acts, McCarthy explores and demonstrates sets of ways of expressing these sorts of time related problems that do not require many data references and are computationally feasible (McCarthy 1997).
Because of the processing and resources required to implement the above two features (especially the latter), it has not yet become viable to implement the language. However McCarthy remains optimistic that the language will be the next vital step in programmable logic and reasoning.
Other Stimulated Works
In his work, McCarthy has been stimulated, encouraged and assisted by various friends and colleges.
Alan Mathison Turing developed a powerful model of computation known as the Turing Machine. Turing worked extensively on designing a test to determine if a machine demonstrates intelligence. The works of Turing stimulated John McCarthy and he wrote a number of papers based around Turing’s concepts. For example, in 1956 McCarthy wrote a paper titled "The Inversion of Functions Defined by Turing Machines" in which he argued that intellectual problems may be regarded as inverted or partial functions that were defined by Turing machines.
Another artificial intelligence researcher that stimulated McCarthy was Robert Kowalski. Kowalski did a lot of research into logic programming for knowledge representation and problem solving. In 1979 Kowalski wrote a particular paper titled "Logic for Problem Solving" in which he expressed algorithms as a combination of logic and control. McCarthy was impressed by the paper and in 1982 wrote a paper titled "Coloring Maps and the Kowalski Doctrine" that sought to extend upon Kowalski’s paper by examining the complex control structures needed for two sample colouring algorithms.
Another famous artificial intelligence researcher was Marvin Minsky. McCarthy and Minsky both worked for the Massachusetts Institute of Technology. Minsky was a common source of inspiration and encouragement to McCarthy as they conversed and contrasted ideas regarding artificial intelligence. In particular, Minsky helped McCarthy with the implementation of his LISP language from 1958 to 1962.
Apart from other artificial intelligence researchers, McCarthy also notes the following people as valuable supporters and stimulators of his work: Vladimir Lifschitz, Leora Morgenstern and Carolyn Talcott (McCarthy 1997).
Summary
John McCarthy has been researching artificial intelligence for fifty years. In that time he has published many papers and been responsible for numerous advancements in artificial intelligence. McCarthy developed the LISP language for programming logic, as well as proposing the superior ELEPHANT language. He invented the circumscription method of non-monotonic reasoning.
McCarthy received the A. M. Turing award of the Association for Computing Machinery in 1971. He received the first Research Excellence Award of the International Joint Conference on Artificial Intelligence in 1985, the Kyoto Prize of the Inamori Foundation in November 1988, and the National Medal of Science in 1990.
Today McCarthy is a Professor of Computer Science at Stanford University and is known as one of the great founders of modern artificial intelligence.
Bibliography
Gardner, H (1985) The mind’s new science, a history of the cognitive revolution.
Hodges, A (1998) The Alan Turing home page.
WWW:
http://www.wadham.ox.ac.uk/~ahodges/Turing.html
Kantrowitz, M (1998) History: Where did Lisp come from?
WWW:
http://www.faqs.org/faqs/lisp-faq/part2/section-13.html
Kowalski, R (1997) Robert Kowalski.
WWW:
http://www-lp.doc.ic.ac.uk/UserPages/staff/rak/rak.html
McCarthy, J (1956) The Inversion of Functions Defined by Turing Machines, as printed in
Automata Studies. Princeton University Press.
McCarthy, J (1958) Programs with common sense, as printed in Mechanisation of thought
processes, procedings of the symposium of the national physics laboratory.
Her Majesty’s Stationary Office, London, pp. 77-84
McCarthy, J (1959) Memorandum to P. M. Morse proposing time sharing.
WWW:
http://www-formal.stanford.edu/jmc/history/timesharing-memo/timesharing-memo.html
McCarthy, J (1979) Ascribing mental qualities to machines, as published in Formalization
of common sense, papers by John McCarthy. Ablex.
McCarthy, J (1980) Circumscription - a form of nonmonotonic reasoning, as published in Artificial
Intelligence journal volume 13. pp. 27-39
McCarthy, J (1982) Coloring maps and the Kowalksi doctrine.
WWW:
http://www-formal.stanford.edu/jmc/coloring/coloring.html
McCarthy, J (1983) The little thoughts of thinking machines, as published in Psychology today.
WWW:
http://www-formal.stanford.edu/jmc/little/little.html
McCarthy, J (1986) Applications of circumscription to formalizing common sense
knowledge, as published in Artificial Intelligence journal volume 28. pp. 89-116.
McCarthy, J (1997) Elephant 2000: a programming language based on speech acts.
WWW:
http://www-formal.stanford.edu/jmc/elephant/elephant.html
McCarthy, J (1998) John McCarthy's Home Page.
WWW:
http://www-formal.stanford.edu/jmc/index.html
Appendix A: McCarthy’s Professional Experience