No Man Is An Island: The Wisdom Of Deliberating Crowds
On March 7, 1907, the English statistician Francis Galton published a peculiar observation.
At a county fair held in Plymouth, 800 visitors had participated in a competition to guess the weight of an ox. While most people’s estimates were too high or too low (falling an average of 37 lbs. away from the true weight of 1,198 lbs.), the median of all the guesses was off by only 9 lbs., or less than 1 percent of the true weight of the ox.
This example illustrates what has come to be known as the “wisdom of crowds” effect. In some cases, the average of a large number of independent estimates can be quite accurate, even when the estimators have no special expertise.
“The average competitor,” Galton wrote of the ox competition, “was probably as well fitted for making a just estimate of the dressed weight of the ox, as an average voter is of judging the merits of most political issues on which he votes.”
The wisdom of crowds capitalizes on the fact that when people make errors, those errors aren’t always the same. Some people will tend to overestimate, and some to underestimate. When enough of these errors are averaged together, they cancel each other out, resulting in a more accurate estimate. That’s why the effect benefits from a large and diverse “crowd.” If people are similar in the sense that they tend to make the same errors, then their errors won’t cancel each other out. A crowd with many overestimators will yield a global average that still falls too high; a crowd with many underestimators will yield a global average that still falls too low.
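The cancellation argument can be checked with a small simulation. The sketch below is only an illustration under assumed conditions: guesses are drawn from a normal distribution around the true weight, and the names `simulate_crowd`, `shared_bias`, and `noise_sd`, along with the 50 lb. spread and 40 lb. bias, are hypothetical choices rather than figures from Galton's data.

```python
import random
import statistics

TRUE_WEIGHT = 1198  # lbs., the ox's weight as reported in the article


def simulate_crowd(n, shared_bias=0.0, noise_sd=50.0, seed=0):
    """Simulate n guesses and return (mean individual error, crowd error).

    Each guess is the true weight plus a bias shared by the whole crowd
    plus private noise that differs from person to person. All parameter
    values here are illustrative assumptions.
    """
    rng = random.Random(seed)
    guesses = [
        TRUE_WEIGHT + shared_bias + rng.gauss(0.0, noise_sd)
        for _ in range(n)
    ]
    # How far off a typical person is.
    mean_individual_error = statistics.mean(
        abs(g - TRUE_WEIGHT) for g in guesses
    )
    # How far off the average of all guesses is.
    crowd_error = abs(statistics.mean(guesses) - TRUE_WEIGHT)
    return mean_individual_error, crowd_error


# A diverse crowd: errors point in both directions and largely cancel.
diverse_individual, diverse_crowd = simulate_crowd(800)

# A crowd of overestimators: everyone shares a +40 lb. bias, which
# averaging cannot remove.
biased_individual, biased_crowd = simulate_crowd(800, shared_bias=40.0)

print(f"diverse crowd: individual {diverse_individual:.1f}, crowd {diverse_crowd:.1f}")
print(f"biased crowd:  individual {biased_individual:.1f}, crowd {biased_crowd:.1f}")
```

In this toy model the private noise averages away as the crowd grows, while the shared bias survives in the crowd average no matter how many people are added.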
In more technical terms, the wisdom of crowds requires that people’s estimates be independent. Studies have found that when people can observe the estimates of others, the accuracy of the crowd typically goes down. People’s errors become correlated or dependent, and are less likely to cancel each other out. We follow our peers, to the detriment of the performance of the group.
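The cost of dependence can be illustrated the same way. The following is a minimal sketch, not the model used in any of the studies mentioned: it assumes each guess mixes one error shared by the whole crowd with a private error, and the correlation parameter `rho`, the function name `rms_crowd_error`, and the 50 lb. spread are all hypothetical.

```python
import math
import random
import statistics

TRUE_WEIGHT = 1198  # lbs., the ox's weight as reported in the article


def rms_crowd_error(n, rho, noise_sd=50.0, trials=500, seed=0):
    """Root-mean-square error of the crowd average over many simulated crowds.

    rho is the assumed correlation between any two people's errors:
    rho = 0 means fully independent guesses, rho = 1 means everyone
    makes exactly the same error.
    """
    rng = random.Random(seed)
    squared_errors = []
    for _ in range(trials):
        # One error component that every member of this crowd shares.
        shared = rng.gauss(0.0, noise_sd)
        guesses = [
            TRUE_WEIGHT
            + math.sqrt(rho) * shared
            + math.sqrt(1.0 - rho) * rng.gauss(0.0, noise_sd)
            for _ in range(n)
        ]
        squared_errors.append((statistics.mean(guesses) - TRUE_WEIGHT) ** 2)
    return math.sqrt(statistics.mean(squared_errors))


independent = rms_crowd_error(n=100, rho=0.0)
correlated = rms_crowd_error(n=100, rho=0.5)

print(f"independent crowd: {independent:.1f} lbs.")
print(f"correlated crowd:  {correlated:.1f} lbs.")
```

Under these assumptions the error of an independent crowd shrinks roughly in proportion to one over the square root of the crowd size, whereas the shared component of a correlated crowd's error does not shrink at all as more people are added.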
But a new paper offers an interesting twist on this classic phenomenon. When crowds are further subdivided into smaller “crowds” that are allowed to deliberate about the right answer, they not only succeed in overcoming the costs of introducing dependence, but even outperform the group as a whole.
The new paper, published last month in Nature Human Behaviour and authored by Joaquin Navajas and colleagues, reports the results of a large-scale study of estimation. More than 4,000 people attending an event were asked to provide estimates for eight values, such as the height of the Eiffel Tower. They were then subdivided into groups of five estimators and encouraged to discuss half of the eight values to arrive at a consensus estimate for the group.
The key finding was that the averages from these “deliberating crowds” of five were more accurate than those from an equal number of independent individuals. For instance, the average obtained from the estimates of four deliberating groups of five was significantly more accurate than the average obtained from 20 independent individuals. In fact, averaging four deliberating groups resulted in a more accurate estimate than averaging 1,400 individual estimates.
These benefits were not observed for the estimated values that were not discussed by the group, so they somehow derived from the group-level process itself. But what, exactly, were the groups doing to achieve this impressive effect?
In a follow-up study with 100 university students, the researchers tried to get a better sense of what the deliberating crowds actually did. Did they tend to go with the answers of those who were most confident about their estimates? Did they gravitate towards the answers of those least willing to change their minds? This happened some of the time, but it wasn’t the dominant response. Most frequently, the groups reported that they “shared arguments and reasoned together.” Somehow, these arguments and reasoning resulted in a global reduction in error, rather than introducing correlated errors that undermined the wisdom of crowds.
The new paper by Navajas and colleagues reports only two studies, one large and one small, and it focuses exclusively on estimates concerning trivia or general knowledge. As a result, many questions remain. But the potential implications for group decision-making and deliberation are enormous. If a small number of deliberating groups can outperform a much larger number of individuals, this suggests that procedures like “deliberative polling” could be a promising strategy for public and private communities to pursue.
Galton introduced his 1907 paper by noting that “[i]n these democratic days, any investigation into the trustworthiness and peculiarities of popular judgments is of interest.”
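Vocabulary Notes
statistician: an expert in statistics
peculiar: strange; distinctive
county fair: an annual county fair, held each summer in the US
estimate: to judge or calculate approximately; an approximate valuation
median: the middle value of an ordered set of numbers
expertise: specialist skill or knowledge
dressed weight: the amount of usable meat from a carcass
merit: good quality; worth; achievement
capitalize on sth: to gain from; to profit from; to take full advantage of; to make use of
cancel sth out: to offset
diverse: varied; of many kinds
yield: to produce; to give up or surrender (often under pressure)
detriment: harm
twist: a change; a turn
subdivide: to divide into smaller parts
deliberate: to think over carefully; (adj.) intentional, on purpose
consensus: agreement; a shared opinion
derive: to obtain from
follow-up: a subsequent action or thing; (adj.) subsequent, further
gravitate: to be drawn toward; to move toward something out of attraction
dominant: primary; ruling; prominent; (genetics) dominant; prevailing
reason: to infer or judge (as a verb)
trivia: unimportant matters
implication: a hint; an implied meaning; a possible effect
outperform: to exceed; to surpass; to do better than
peculiarity: strangeness; oddness; a quirk; a distinctive feature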