### David Mumford

David Mumford is perhaps most famous in the mathematical world for his work in algebraic geometry, which earned him a Fields Medal, and for the lecture notes which became The Red Book of Varieties and Schemes. Here at last, for those with a penchant for geometric thinking, was a way to get a handle on schemes. He modestly describes his own "small contribution to schemes" as being to publish there

> ...a series of 'doodles', my personal iconic pictures to give a pseudo-geometric feel to the most novel types of schemes. What was at the back of my mind was whether or not I could get my own teacher Zariski to believe in the power of schemes.

Mumford swapped fields in the 1980s and has since worked on pattern theory, especially as applied to computer vision, but he continues to contribute to mathematical exposition. His course on Modeling the World with Mathematics looks excellent. I only wish I'd attended something like that in my teens. And Indra's Pearls would be a great book to receive for those a little further up the mathematical ladder.

I want to look here a little at a couple of talks he has given at prestigious mathematical events. First, there is The Dawning of the Age of Stochasticity, delivered at a meeting held in Rome in 1999, Mathematics Towards the Third Millennium. (Though not, of course, mathematics' third millennium.) After presenting probability and statistics as a component of mathematics dealing with chance, just as geometry deals with our experience of space, analysis with that of force, and algebra with that of action, Mumford goes on to say something stronger - that thought itself is better captured by probability theory than by logic. What seems to be evident from the experience of those working in artificial intelligence is that this is right. It's hard to think of a domain other than automated theorem proving where a statistical approach is not winning over a logic-based approach, and even here the gap might be due to insufficient effort. As I mention in chapter 4 of my book, George Polya developed a Bayesian interpretation of the assessment of mathematical conjectures, and went on to discuss the assessment of the plausibility of potential paths to a proof.
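
Polya's idea can be given a toy numerical form. The sketch below is my own illustration rather than Polya's formalism: each verified instance of a conjecture shifts our credence via Bayes' theorem, with confirmations of easy cases counting as only weak evidence, since such instances would quite likely hold even if the conjecture were false. The `update` helper and the particular numbers are hypothetical.

```python
def update(prior: float, p_if_true: float, p_if_false: float) -> float:
    """Posterior credence in a conjecture after one piece of evidence,
    by Bayes' theorem over the two hypotheses 'true' and 'false'."""
    joint_true = p_if_true * prior
    return joint_true / (joint_true + p_if_false * (1 - prior))

# Start mildly sceptical. Each checked case would certainly hold if the
# conjecture is true, but would also hold 90% of the time if it is false,
# so each confirmation nudges credence up only a little.
credence = 0.1
for _ in range(20):
    credence = update(credence, p_if_true=1.0, p_if_false=0.9)
```

Twenty such weak confirmations leave credence below a half; a single surprising prediction (low `p_if_false`) would move it far more, which matches Polya's emphasis on the evidential weight of improbable consequences.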

Mumford also points to the increase of stochastic concepts within maths itself, such as random graphs and stochastic differential equations. There are plenty of other examples he could have chosen, such as random matrices, random polynomials, and random groups. The most controversial part of the paper is his suggestion that random variables be put "into the very foundations of both logic and mathematics" in order to "arrive at a more complete and more transparent formulation of the stochastic point of view" (p. 11). This may be a step too far, but it's worth considering chance-like phenomena in mathematics. When I was thinking hard about what Bayesianism had to say to mathematics, I became intrigued by the distribution of mathematical constants. Looking at those tables, especially the one for mixed constants, brought to mind Benford's law - that the first digits of data from a random variable covering several orders of magnitude satisfy Prob(first digit = n) = log_{10}((n + 1)/n). Try it for the areas of countries in square kilometers. For example, roughly log_{10}(6/5), or 7.9%, of them have areas beginning with a 5. The usual justification for this is that we wouldn't expect the answer to differ had we used a different unit of measure, such as square miles, so we need a distribution of first digits which is invariant under scalar multiplication. A uniform distribution over the fractional parts of the logarithms of the data does the trick. But if some categories of mathematical constant also show this distribution, the same argument wouldn't go through. Possibly one could argue that the distribution should have the same form if we worked in bases other than 10. In any case, why the fluctuations? Why that little kink around the early 2000s in the mixed constants table?

Two years after the Dawning paper Mumford wrote Trends in the Profession of Mathematics. A couple of points I like in this are, first, his advocacy of a second criterion, other than theorem proving, by which to judge someone's contribution to mathematics. This is what he calls defining a model. Here one extracts what is essential to a situation by modelling it in such a way that the simplest examples embody what is most significant. Examples he gives are the notion of the homotopy type, the Ising model, and the Korteweg-deVries equation.
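
The Benford probabilities and the scale-invariance argument are easy to check numerically. A minimal sketch of my own, using a log-uniform sample as a stand-in for "data covering several orders of magnitude" (the helper `first_digit` and the particular sample are illustrative assumptions, not anything from Mumford):

```python
import math
import random

# Benford's law: Prob(first digit = n) = log10((n + 1)/n).
benford = {n: math.log10((n + 1) / n) for n in range(1, 10)}
# e.g. benford[5] = log10(6/5), roughly 7.9%, as in the country-areas example.

def first_digit(x: float) -> int:
    """Leading decimal digit of a positive number."""
    return int(x / 10 ** math.floor(math.log10(x)))

# A variable whose logarithm is uniform over several orders of magnitude
# has a first-digit distribution invariant under scalar multiplication.
random.seed(0)
sample = [10 ** random.uniform(0, 6) for _ in range(100_000)]

counts = {n: 0 for n in range(1, 10)}
for x in sample:
    counts[first_digit(x)] += 1
freqs = {n: counts[n] / len(sample) for n in range(1, 10)}
# freqs should sit close to benford, and stays so if every x is rescaled
# by a constant (e.g. converting square kilometers to square miles).
```

Rescaling the whole sample by any constant only shifts the logarithms, leaving their fractional parts uniform, which is exactly the invariance argument in the text.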

> PhDs and jobs should be awarded for finding a good model as well as for proving a difficult theorem.

Second, there is a clear statement of the kind of view of the interaction of pure and applied mathematics that I was driving at in this post:

> ...there is continuous mixing of pure and applied ideas. A topic, such as the Korteweg-deVries equation, starts out being totally applied; then it stimulates one sort of mathematical analysis, then another. These developments can be entirely pure (e.g. the analysis of commutative rings of ordinary differential operators). Then the pure analysis can give rise to new ways of looking at data in an experimental situation, etc. Topics can be bounced back and forth between pure and applied areas.

I think we can safely say that Mumford may claim, along with Thurston, “I do think that my actions have done well in stimulating mathematics.” (‘On Proof and Progress in Mathematics’, Bulletin of the American Mathematical Society, 1994, 30(2): 177).

Update: Another graphic of over 200 million constants shows more Benford-esque behaviour, although even this one seems to flatten out a little too quickly for const/x. A place to start might be to ask why on the first page of graphics the rationals behave as they do.

## 7 Comments:

Prompted by an e-mail from Denis Lomas, perhaps I should qualify what I meant by "...thought itself is better captured by probability theory than by logic. What seems to be evident from the experience of those working in artificial intelligence is that this is right." At the very least I think we can say that, to the extent that artificial intelligence and machine learning have produced effective algorithms, the vast majority of these have used tools and constructions from probability theory and statistics rather than from logic. Even so, it could be argued that this provides little evidence about the way humans think.

Here is the email david (above) is referring to. Sorry I didn't post it here first.

David

You write:

"Mumford goes on to say something stronger - that thought itself is better captured by probability theory than by logic. What seems to be evident from the experience of those working in artificial intelligence is that this is right."

What seems evident is that none of probability theory, logic, or a combination of the two has solved the key problem: what happens in the few milliseconds between retina stimulation and object recognition. Probability theorists have been working on it for some decades. (Neural nets were/are essentially a probabilistic approach.) Despite a lot of expenditure of heavy-duty brain power using all the tools at the disposal of vision scientists, a solution still does not appear to be in sight. (Of course, a lot of interesting mathematics has been developed in the process.) So what algorithmic process yields object recognition remains a mystery. The solution, if it is found, could be more, or less, deterministic than probabilistic. Mumford is betting on probability theory. No one knows.

cheers

dennis

Ok, now you've posted your message. One would have to be careful about what you mean by deterministic. A probabilistic computer vision algorithm could be perfectly deterministic, in the sense that for a given input, there is a given output. It's more about the mathematical tools used.

But putting it like this, perhaps I'm even less inclined to see the use of probabilistic tools in computer vision as a sign that thought is better captured by probability theory. It's hardly far-fetched to imagine modelling parts of visual processing in terms of dynamical systems and attractors, but I doubt this should lead us to say thought is "captured" by dynamical systems theory.

David --

While probability theory has many supporters within the AI community, it is by no means universally supported. I find it very interesting that probability theorists have repeatedly asserted the dominance of their approach over alternatives, and repeatedly been challenged -- by Leibniz in the 17th century, by Johannes von Kries in the 19th century, by George Shackle in the mid 20th, and by large sections of the AI community (Glenn Shafer, Philippe Smets, et al.) in the late 20th century. It is unfortunate that statisticians and probabilists are mostly ignorant of this dissident thread in their history.

It strikes me as significant that these challenges have come from people concerned mainly with domains of action rather than with belief -- law (Leibniz, Shafer), medicine (von Kries, Smets), economics (Shackle), and AI. It is not at all obvious to me that a formalism for representing uncertainty of beliefs is also appropriate for representing uncertainties associated with actions. At the very least, the case that the same formalism is appropriate for both purposes needs to be made, rather than merely assumed.

-- Peter McBurney

Peter,

On some days I'm inclined to take the Dreyfus line and doubt that AI has achieved that much at all, beyond providing a few handy tools. It's so easy to slip into thinking that the *intelligence* is in the tool rather than the user. Collins and Kusch's The Shape of Actions: What Humans and Machines Can Do is a powerful attempt to disabuse us of this foible.

In fact this attitude would square with my narrativist philosophy rather well. On the other hand, I do tend to think that Bayesian-leaning probabilists are most likely to produce good new tools. This has to be generously interpreted to include work like Pearl's do-calculus, even if he thinks this makes him only a half-Bayesian.

One definition of the difference between AI and Computer Science is this: AI is about problems which computer scientists have yet to learn how to solve, while Computer Science is about the problems they now know how to solve. When an AI problem is solved, it joins the mainstream of CS, and so it is easy to imagine wrongly that AI has achieved little.

Reflect on that the next time you are a passenger in an airplane landing in fog -- The pilot will be doing little more than you are, since such landings at western airports are now all automated.

-- Peter

Peter,

Fair enough! And I suppose the automated airplane isn't using probabilistic reasoning.

The main point of the Kusch-Collins book is to point to some components of human activities which can't be emulated by machines. These are those involved in the evaluation of the criteria for success. E.g., we say what constitutes a good landing. This isn't to say that with the use of machines we can't push the boundaries of what we take to be successful, smooth landing in the fog, or 1000 digit accuracy in a calculation perhaps, but that it requires a certain kind of community of agents for there to be such criteria. For them, computers don't form such communities.
