Nick Schulz: Philip Tetlock is the Mitchell Professor of Leadership at the University of California, Berkeley, and he is the author of a new book, "Expert Political Judgment: How Good Is It? How Can We Know?" His book is the result of years of research during which he worked with almost 300 experts known for commenting or offering advice on political and economic trends.
He asked them to gauge the likelihood that various events would happen, and the results are surprising. And he's here with us today to talk about those results. Philip, thanks for joining us.
Philip Tetlock: My pleasure.
Schulz: I mentioned that what you found in your research was surprising. Maybe you could tell our readers and listeners just a sort of overview of what you were able to conclude from your research.
Tetlock: Well, surprise is always in the eye of the beholder. It depends on what your theoretical expectations were. If you happen to have a rather jaundiced and cynical view of specialists of various kinds, for example, you would probably not be all that surprised. If, on the other hand, you had a more upbeat view of what they could deliver, you might be.
Schulz: Maybe you can just sketch out a little bit of what exactly you found in broad terms?
Tetlock: Sure. One of the things we did is we got each expert to make a large number of predictions, so from each expert, we obtained in excess of 100 predictions over varying time spans and varying issues. And as you mentioned before, we have almost 300 experts, so we have a large pool of predictions to work with statistically.
Schulz: And these folks are from across the political spectrum, it's not any one.
Tetlock: They range from Libertarian to Marxist. They range from Boomsters to Doomsters, just about every major theoretical, ideological cleavage you can identify in academia or the real world you'll find in this sample.
So we collect a large sample of predictions from each expert, we have a large number of experts, and the experts come from a wide range of points of view. So that allows us to make some inferences about the relative accuracy of various sub-groups of experts.
And there is another technicality I might mention, and that is that we get experts to make predictions in the form of subjective probability judgments ranging from zero to one, so they're not saying this will happen or won't happen, they're attaching likelihoods to various possible events.
So the events might be: government debt as a fraction of GDP will fall below 35 percent, fall between 35 and 42 percent, or rise above 42 percent. The predictions are all very precise, and we get people to attach precise probabilities to each possible future.
And the futures are mutually exclusive and exhaustive, so as you know from basic probability math, that means the probability should add up to 1.0. So we get a lot of predictions and we analyze them.
And one of the first things we discover is that there is a tendency for experts to claim to know more than they do about the future. So if you look at all those predictions assigned 90 percent confidence, you'll find that events assigned 90 percent confidence don't happen 90 percent of the time. Depending on the sub-group of experts we're talking about, they happen somewhere between 60 and 80 percent of the time.
So we collected this large batch of predictions from each expert and we have many experts and we have a large enough database that we can look at all those predictions assigned different probability values and we can ask whether or not events actually occurred as often as the probability values say they should have.
So when you look at hundreds of events that have been assigned a probability of 90 percent, you can check whether 90 percent of those events actually occurred as specified. Now that gets you around a big logical problem. The big logical problem is that most probability predictions are not falsifiable or directly testable with a single event observation, right?
So if you say there's a 90 percent chance of a specific event happening, and it doesn't happen, you can always argue that, well, gee, the 10 percent possibility materialized and we happen to live in an unlikely universe.
The only way probabilistic predictions can be disconfirmed by a single event is when people assign a probability value of zero, saying it's impossible, and it happens; or when they assign a probability value of 1.0, saying it's certain, and it doesn't happen.
So what we're trying to do here is make the law of large numbers work for us and get around this big logical obstacle to assessing the accuracy of probability judgments of individual events.
Schulz: And so how were you able to do that?
Tetlock: By aggregation. By looking at all those predictions assigned different probability values and observing how frequently events actually occur within each probability value.
We can look at all those events that people said were impossible, which should happen zero percent of the time, right? If they never happen, the experts are perfectly accurate; but if they're happening 15 or 20 percent of the time, that's problematic. Similarly, events that people say are inevitable should happen all the time; if they're only happening 75 or 80 percent of the time, that's problematic.
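[Editor's note: the calibration test Tetlock describes can be sketched in a few lines. The numbers below are hypothetical, invented purely for illustration; the actual study pooled tens of thousands of expert forecasts.]

```python
from collections import defaultdict

def calibration_table(forecasts):
    """Group (stated probability, outcome) pairs by stated probability,
    then report each group's observed frequency for comparison against
    the stated value."""
    buckets = defaultdict(list)
    for prob, occurred in forecasts:
        buckets[prob].append(occurred)
    # True counts as 1, False as 0, so sum/len is the observed frequency.
    return {p: sum(outcomes) / len(outcomes)
            for p, outcomes in sorted(buckets.items())}

# Hypothetical forecasts as (stated probability, did the event happen?):
# ten events called "90 percent likely", ten called "impossible".
forecasts = ([(0.9, True)] * 7 + [(0.9, False)] * 3 +
             [(0.0, True)] * 1 + [(0.0, False)] * 9)
print(calibration_table(forecasts))  # {0.0: 0.1, 0.9: 0.7}
```

A well-calibrated forecaster's table would match the stated probabilities; here the "90 percent" events happened only 70 percent of the time, and one "impossible" event in ten happened anyway, which is the overconfidence pattern Tetlock describes.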
Schulz: And in doing this you have them assign probabilities for the likelihood of various events?
Tetlock: That's correct.
Schulz: And so when you aggregated that, what were you able to determine?
Tetlock: Well, one of the first things you notice in the data is there is a general tendency towards over-confidence. The second thing you notice is that some experts are much more prone to be over-confident than are other experts. And that's where we get into our classification of experts in terms of their styles of reasoning as either hedgehogs or foxes.
Schulz: Right. So you divided your experts into camps using the famous Isaiah Berlin distinction that hedgehogs know one big idea and they confidently stick with it, applying it to any and all or various scenarios, and foxes are more flexible and skeptical. And what did you find in comparing those?
Tetlock: Well, the key finding is a bit on the complex side. It's that if you have a hedgehog style of reasoning, you have a very strong ideological commitment to a point of view, and you're making long-term predictions, you're at serious risk of falling off a cliff in terms of your predictive accuracy. You pay a price for the combination of a very theoretically focused cognitive style, a very strong set of theoretical commitments associated with that style, and looking at the more distant future, where there is more opportunity for our biases to come into play.
Schulz: Now when I mentioned at the beginning the findings were somewhat surprising, now they may not have been surprising to you, but I think they may be surprising to some people who think, well, someone's an expert in an area, so they would have better predictive powers than, say, just your average man on the street.
And yet, you found that in the aggregate, at least with your group, the experts weren't really that much better than, say, randomly picking possible future scenarios out of a hat. Is that right?
Tetlock: Well, the first thing I'll say on behalf of the experts is that these are pretty hard things to predict. We're talking about predicting economic trends or whether leaders are going to fall from power. These are not easy things to predict.
But you're right, when you compare the aggregate accuracy of experts to some very simple statistical models, you'll find that the experts are hard pressed to do better than very simple baselines.
One of the curious things about my experience in this study is that those experts who tended to be more doubtful that they could predict anything were actually somewhat better at predicting. So modesty was actually a useful cue for accuracy in this context.
Schulz: And these folks would have been foxes or ...
Tetlock: They're more likely to be foxes. Foxes are more likely to believe that there's a certain amount of uncertainty and indeterminacy in the world, that the world just oscillates, sometimes in violent, unpredictable ways, and doesn't really obey any set of laws, whereas other schools of thought imply that there are some principles or laws or regulating forces in history. It's not that foxes don't believe there are any regulating forces in history; they tend to believe there are probably too many, and that they're often interacting in ways that make it very complicated to predict.
Schulz: There's a writer and investor named Nassim Nicholas Taleb, who has written an influential book in which he claims that there are lots of successful investors out there who are really just lucky; their success wasn't really due to any expertise they had or anything they actually did. It's just that they happen to find themselves, at any given time, at the right end of the bell-curve distribution of success. And he says that part of the reason we lionize some of these folks is because we are, as he titled his book, "Fooled by Randomness."
Now, you mentioned some of the challenges posed by randomness in your book. Could you discuss that a little bit?
Tetlock: It can be a very difficult thing for human beings to come to grips with the idea that the world they live in has that much randomness in it. The strong form of the argument that Taleb advances, if I understand it correctly, is that superstar investors, like Warren Buffett, are analogous to a coin that comes up heads maybe 10 or 15 or 20 times in a row. Now the likelihood of that happening for any single coin is of course very small, but if you're tossing many, many coins all the time, there of course will be some coins that come out four, five, or six standard deviations from the mean.
So the suggestion is that we're confusing luck with talent. Now a lot of people think that's nonsense. A lot of people look at these investors and say, my God, they're geniuses. My book doesn't allow us to tell one way or the other who's right in that particular debate.
I do not speak to that possibility, but what my data suggest is that people systematically exaggerate how much they themselves can predict about the future.
Schulz: You describe yourself in your book as someone for whom classical liberal political ideas have resonated. However, the area you've looked into, a marketplace of ideas, is an area where competition, at least among various idea peddlers, hasn't really yielded better results, as we would expect competition to do. Do I have that right? And if so, why has this market not worked as a market should?
Tetlock: Well, ever since I've been an undergraduate I've been moved by John Stuart Mill's argument. As a psychologist, I am somewhat skeptical about how efficient the marketplace of ideas is. I think there are a number of grounds for worrying about how the marketplace of ideas works.
And I think that would be true even if you're one of those people who says, well, in the long run, it has to sort itself out. Well, the long run can be very long, and a lot of disasters can occur en route. As John Maynard Keynes said, in the long run, we're all dead.
So one of the big questions, when you judge the marketplace of ideas and the ideas that experts are peddling on op-ed pages, on television, in magazines, everywhere around us in this very information-dense world, is this: what are people actually selecting experts for?
And one possibility, and I think this is the implicit assumption of many marketplace-of-ideas arguments, is that people are focusing solely on accuracy. It's a truth game they're playing: people are rewarding experts who come closer to being accurate and punishing experts who are more inaccurate; and in the long run, of course, the inaccurate experts will be selected out of the system and the dialogue will continuously improve.
I don't think there are too many people who are all that confident that we have a continuously improving quality of dialogue in our political system.
It raises the interesting question of why. And one possibility is that people aren't selecting experts primarily on the basis of the truth value of their pronouncements. People are selecting experts on other grounds and what might those other grounds be?
They're suggested in a lovely book Richard Posner wrote on public intellectuals. One is that experts are providing what he called solidarity goods, and the other is that experts are providing entertainment goods. Now, with a solidarity good, it doesn't really matter whether the expert is right or wrong. The expert is affirming your values, doing a good job bashing the other side, and making you feel good about your world view. So that would be a solidarity function rather than a truth-seeking function. And an alternative, a third function, would be simple entertainment value.
And I think that, to a substantial degree on some of these television programs, there's an attraction to experts who are willing to make relatively extreme, and therefore relatively interesting, predictions. One of my favorite examples is the various predictions experts have been making about Saudi Arabia over the last 15-plus years.
If you actually track that literature, you'll find that both in public and in private, experts have been predicting the disintegration of the Saudi regime, the Saudi monarchy, pretty continuously. Now, no doubt there will come a time when the Saudi monarchy does fall; like the broken clock that is right at least twice a day, if you keep predicting the destruction of the Saudi monarchy, eventually you're going to be right.
Experts who predicted the disintegration of the Saudi regime have had a tendency to advance very interesting scenarios to bolster that prediction, so they can summon up very vivid images of Islamic colonels in Riyadh taking control and the new Saudi regime having an affinity for Osama Bin Laden and oil supplies being disrupted, gasoline prices going to $10 a gallon.
And there are all sorts of very vivid images they can summon up to catch people's attention and it simply makes much better television than an expert who comes on and says, well, yes, the pessimists are right that there are some sources of serious instability in Saudi Arabia and at some point, the Saudi regime is going to have to change its fundamental ways. But you know what, in the short to medium term, it's usually a pretty bad prediction to bet against authoritarian regimes like this, especially with authoritarian regimes that are sitting on top of huge wads of cash and have a very effective repressive police apparatus at their disposal.
Betting against a repressive government in the short to medium term tends to be a bad bet, because governments control a lot of goodies, especially the Saudis, and governments control a lot of sticks as well. So when you take into account the capacity to intimidate and the capacity to co-opt, betting against such a regime tends to be a bad bet.
So more generally what's going on here is experts who are entertaining tend to be those experts who attach high probabilities to low frequency events and there's a price to be paid for that.
Schulz: It sounds like even if you're consistently wrong, or consistently overstate or understate things, you can still be considered an expert, even among experts, not just in the eyes of television shows and the like.
Tetlock: Well, that's a good point. One of the other things we looked at was not only aggregate forecasting accuracy; we also looked at the willingness of experts to change their minds after they learned whether expected or unexpected events had occurred.
Now you might think that when the unexpected occurs, the experts should engage in a significant amount of belief change. And in fact, there are some very simple mathematical models for computing how much experts should change their minds if you get them to make the right configuration of judgments.
And we were actually able to compare the amount that experts changed their minds when the unexpected occurred with how much they should have changed their minds according to this formal mathematical model of rationality. And what you observe is that experts change their minds far less than they should.
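[Editor's note: the "formal mathematical model of rationality" Tetlock refers to is essentially Bayes' theorem. Here is a minimal sketch, with made-up numbers chosen purely for illustration.]

```python
def bayes_update(prior, p_event_if_right, p_event_if_wrong):
    """Posterior credence in a theory after observing an event, via
    Bayes' rule: P(H|E) = P(E|H)P(H) / [P(E|H)P(H) + P(E|~H)P(~H)]."""
    numerator = prior * p_event_if_right
    return numerator / (numerator + (1 - prior) * p_event_if_wrong)

# Hypothetical expert: 0.8 prior credence in his theory, and he concedes
# the surprising event had only a 0.2 chance if the theory is right but
# a 0.7 chance if it is wrong.  Bayes' rule says his credence should
# fall to about 0.53 after the event occurs.
print(round(bayes_update(0.8, 0.2, 0.7), 2))  # 0.53
```

In Tetlock's data, experts whose expectations were disconfirmed typically revised their beliefs by far less than a calculation like this prescribes.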
Then you ask, well, why? What's the source of the shortfall? There are multiple sources, but most of the shortfall can be explained by the tendency of experts to invoke what we call belief-system defenses.
So experts who were wrong about the disintegration of Canada for example, who predicted that Canada would be gone by 1997 or maybe 2002, five or ten-year time ranges, those experts could argue, and they might be right quite frankly, that they were just off on timing, that Canada will be gone by 2012, you know?
Schulz: Right. In the long run, Canada is dead.
Tetlock: Right. Well, in the long run, Canada is dead. And indeed, people who predicted the disintegration of other nation states, such as Nigeria, India, and Indonesia, offered strikingly similar defenses.
But of course, they're right sometimes and when they're right, it's really noticeable. When they're right about Yugoslavia or they're right about the Soviet Union, that's much more impressive.
Schulz: One suggestion that you have is that we should try to, to whatever extent possible, quantify predictions. You point out that people get very comfortable making those loose predictions where there is a lot of wiggle room. Is that right?
Tetlock: Well, there's an old saying that you should never put a number and a date in the same prediction. Yes, we find that, left to their own devices, experts prefer to make predictions at a level of generality that makes it virtually impossible after the fact to say how accurate they were.
I think there's a natural human disinclination to advance testable hypotheses and one of the things we worked against in these interviews was that resistance.
Schulz: You discuss in the book a couple of possible ways of trying to push in the directions of greater accountability and one of those is, that you mention is prediction markets. Could you talk about those a little bit and what sort of hope you see there for those sorts of things to influence our public discourse in a positive way?
Tetlock: Well, I would assume that many in your audience are familiar with prediction markets already, but what prediction markets do, by forcing us to bet, is make us more fox-like.
If you approach prediction markets in a really hedgehog-like spirit, with a willingness to take a theory, confidently apply it, and as a result attach pretty high probabilities to relatively low-frequency events, you're going to get taken to the cleaners pretty fast. Foxes, who have a more self-critical style and are more diffident about what they can achieve, are less likely to place radical bets.
So they're more cautious. And one of the big arguments we got into in the book had to do with whether foxes were just chickens, whether the foxes were doing better than the hedgehogs simply because they were cowardly and were never willing to make a probability estimate much above 0.6 or 0.7.
And we go through an elaborate statistical argument showing that that is part of the reason why the foxes do better, but it's not the whole story. A very important part of the story is that foxes bring to bear a more self-critical pattern of reasoning, in which they're willing to admit they're wrong.
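[Editor's note: the "taken to the cleaners" point can be illustrated with a simple expected-value calculation. This is a sketch of betting at fair odds with hypothetical numbers, not Tetlock's own analysis.]

```python
def expected_profit(true_prob, stated_prob, stake=1.0):
    """Expected profit from staking `stake` on an event at the fair odds
    implied by your own stated probability: a win pays stake*(1-p)/p,
    a loss forfeits the stake."""
    payout = stake * (1 - stated_prob) / stated_prob
    return true_prob * payout - (1 - true_prob) * stake

# A hedgehog assigns 0.9 to a kind of event whose true frequency is 0.6:
# betting repeatedly at those odds loses about a third of each stake.
# A fox who states something close to the true 0.6 roughly breaks even.
print(round(expected_profit(0.6, 0.9), 3))  # about -0.333
print(round(expected_profit(0.6, 0.6), 3))  # about 0.0
```

Systematically attaching high probabilities to low-frequency events therefore bleeds money in a market, which is the selection pressure Tetlock says pushes participants toward the fox style.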
Schulz: You propose an idea that you don't flesh out in too much specificity, but you give a sense of what it could entail or might look like to try to help generate greater accountability, which would be some sort of institutional mechanisms, be they in the media or in academia or in think tank world or some combination of all that that might help, for public discourse purposes, put pressure to generate better predictions or more reliable predictions. How would that work exactly?
Tetlock: It's funny you should ask that today, because I was interviewed by the Numbers Guy, who writes a column for the Wall Street Journal Online, and he did a piece on the book that just came out.
And he was quoting me as saying that it would be a good idea if the Wall Street Journal and the New York Times got together and set out some kind of monitoring mechanism for tracking the accuracy of experts because it would have that credibility of bridging the liberal, conservative divide.
So people couldn't just say, oh, look, that's just the BS the liberals have set up to make the conservatives look bad, or vice versa. Setting standards for judging judgment in a highly polarized society such as ours is extremely difficult, and therefore extremely important to do.
Schulz: And you're hopeful that it could be done though?
Tetlock: I think in principle it's possible. But there are multiple sources of resistance. I'd like to think that if there were a reasonably concerted effort by some foundations or some major media outlets to do it, and if it had some cross-ideological support, so it wasn't just a liberal or a conservative initiative, that would be a good thing for society.
Schulz: Are there political implications to your findings of any kind? Not necessarily in a liberal/conservative light, but some other way?
Tetlock: One consistent finding in our work is that neither liberals nor conservatives have a particular advantage over each other in terms of being able to predict trends. And that's true for many other major theoretical breakdowns. The best ...
Schulz: Both sides will accuse you of bias for saying this.
Tetlock: Well, perhaps so. There is a slight advantage to being a moderate, but a much better predictor of accuracy than whether you're a moderate is your style of reasoning.
If you have a more self-critical style of reasoning, a greater tolerance for ambiguity, and a greater willingness to acknowledge the possibility that you might be wrong, you're less likely to go off the cliff.
And what's particularly dangerous, I think, is the combination of a rigid cognitive style and a very strong commitment to a theoretical point of view. You can get away with being relatively extreme if you're a fox, but if you're a hedgehog and you're relatively extreme, that, I think, has some serious political problems attached to it.
Schulz: Philip, I appreciate you taking the time to talk to us today, and congratulations on fascinating research and a fascinating book.

Tetlock: Well, thank you very much.