Crowds are wise enough to know when other people will get it wrong
Keywords: “wisdom of the crowd”, “surprisingly popular answer”, “discrepancy”
Unexpected yet popular answers often turn out to be correct.
Cathleen O'Grady, January 28, 2017, 2:19 PM
The “wisdom of the crowd” is a simple approach that can be surprisingly effective at finding the correct answer to certain problems. For instance, if a large group of people is asked to estimate the number of jelly beans in a jar, the average of all the answers gets closer to the truth than individual responses. The algorithm is applicable to limited types of questions, but there’s evidence of real-world usefulness, like improving medical diagnoses.
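The averaging step described above can be sketched in a few lines of Python. This is only an illustration: the guesses and the true count are made-up numbers chosen to show the effect, and `crowd_estimate` is a hypothetical helper name, not something from the paper.

```python
from statistics import mean


def crowd_estimate(guesses):
    """Aggregate individual numeric guesses by taking their arithmetic mean."""
    return mean(guesses)


# Hypothetical data: a jar that actually holds 500 jelly beans.
true_count = 500
guesses = [320, 410, 450, 520, 560, 610, 700]

# Error of the crowd average versus the error of each individual guess.
crowd_error = abs(crowd_estimate(guesses) - true_count)
individual_errors = [abs(g - true_count) for g in guesses]

# Count the individuals who were farther from the truth than the crowd average.
worse_than_crowd = sum(err > crowd_error for err in individual_errors)

print(crowd_estimate(guesses), crowd_error, worse_than_crowd)
```

In this invented sample the overestimates and underestimates largely cancel, which is why the mean lands closer to the true count than any single guess. Real data need not behave this neatly.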
This process has some pretty obvious limits, but a team of researchers at MIT and Princeton published a paper in Nature this week suggesting a way to make it more reliable: look for an answer that comes up more often than people think it will, and it’s likely to be correct.
As part of their paper, Dražen Prelec and his colleagues used a survey on capital cities in the US. Each question was a simple True/False statement with the format “Philadelphia is the capital of Pennsylvania.” The city listed was always the most populous city in the state, but that's not necessarily the capital. In the case of Pennsylvania, the capital is actually Harrisburg, but plenty of people don’t know that.
The wisdom of crowds approach fails this question. The problem is that questions sometimes rely on people having unusual or otherwise specialized knowledge that isn’t shared by a majority of people. Because most people don’t have that knowledge, the crowd’s answer will be resoundingly wrong. Previous tweaks have tried to correct for this problem by taking confidence into account. People are asked how confident they are in their answers, and higher weight is given to more confident answers. However, this only works if people are aware that they don’t know something, and this is often strikingly not the case.
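A minimal sketch of confidence weighting, under the assumption that each respondent reports a confidence between 0 and 1 and that an answer's weight is simply the sum of its supporters' confidences. The function name and the response data are hypothetical; the article does not specify the exact weighting scheme that earlier work used.

```python
from collections import defaultdict


def confidence_weighted_vote(responses):
    """Return the answer whose supporters' confidences sum to the highest total.

    responses: iterable of (answer, confidence) pairs, confidence in [0, 1].
    """
    totals = defaultdict(float)
    for answer, confidence in responses:
        totals[answer] += confidence
    return max(totals, key=totals.get)


# Hypothetical responses to "Philadelphia is the capital of Pennsylvania."
# The mistaken majority is just as confident as the informed minority.
responses = [("True", 0.9)] * 6 + [("False", 0.9)] * 4

print(confidence_weighted_vote(responses))
```

With equal confidence on both sides, the weighting changes nothing and the wrong majority answer still wins, which is the failure mode the paragraph above describes.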
In the case of the Philadelphia question, people who incorrectly answered “True” were about as confident in their answers as people who correctly answered “False,” so confidence ratings didn’t improve the algorithm. But when people were asked to predict what they thought the overall answer would be, there was a difference between the two groups: people who answered “True” thought most people would agree with them, because they didn’t know they were wrong. The people who answered “False,” by contrast, knew they had unique knowledge and correctly assumed that most people would answer incorrectly, predicting that most people would answer “True.”
Because of this, the group at large predicted that “True” would be the overwhelmingly popular answer. And it was, but not to the extent that they predicted. More people knew it was a trick question than the crowd expected. That discrepancy is what allows the approach to be tweaked. The new version looks at how people predict the population will vote, looks for the answer that people gave more often than those predictions would suggest, and then picks that “surprisingly popular” answer as the correct one.
To go back to our example: most people will think others will pick Philadelphia, while very few will expect others to name Harrisburg. But, because Harrisburg is the right answer, it'll come up much more often than the predictions would suggest.
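The selection rule described above can be sketched for a True/False question as follows. This is a simplified reading of the article's description, not the authors' implementation: it assumes each respondent gives one vote plus one forecast of the share of people answering “True,” and it averages those forecasts with a plain arithmetic mean. The function name and all numbers are hypothetical.

```python
from collections import Counter


def surprisingly_popular(votes, predicted_true_shares):
    """Pick the answer whose actual share most exceeds its predicted share.

    votes: list of "True"/"False" answers, one per respondent.
    predicted_true_shares: each respondent's forecast (0..1) of the
        fraction of all respondents who will answer "True".
    Returns (winning answer, dict of actual-minus-predicted shares).
    """
    n = len(votes)
    counts = Counter(votes)
    actual = {"True": counts["True"] / n, "False": counts["False"] / n}

    # Assumption: the crowd's forecast is the mean of individual forecasts.
    mean_pred_true = sum(predicted_true_shares) / len(predicted_true_shares)
    predicted = {"True": mean_pred_true, "False": 1 - mean_pred_true}

    surprise = {answer: actual[answer] - predicted[answer] for answer in actual}
    return max(surprise, key=surprise.get), surprise


# Hypothetical survey on "Philadelphia is the capital of Pennsylvania."
# Six respondents answer "True" and expect 90% of people to agree;
# four answer "False" and expect 70% of people to answer "True".
votes = ["True"] * 6 + ["False"] * 4
forecasts = [0.9] * 6 + [0.7] * 4

winner, surprise = surprisingly_popular(votes, forecasts)
print(winner, surprise)
```

In this invented sample “True” wins the popular vote with 60 percent, but the crowd forecast 82 percent, so “False” is the answer that appears more often than predicted and is the one selected.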
Prelec and his colleagues constructed a statistical theorem suggesting that this process would improve matters and then tested it on a number of real-world examples. In addition to the state capitals survey, they used a general knowledge survey, a questionnaire asking art professionals and laypeople to assess the prices of certain artworks, and a survey asking dermatologists to assess whether skin lesions were malignant or benign.
Across the aggregated results from all of these surveys, the “surprisingly popular” (SP) algorithm had 21.3 percent fewer errors than a standard “popular vote” approach. In 290 of the 490 questions across all the surveys, they also assessed people’s confidence in their answers. The SP algorithm did better here, too: it had 24.2 percent fewer errors than an algorithm that chose confidence-weighted answers.
It’s easy to misinterpret the “wisdom of crowds” approach as suggesting that any answer reached by a large group of people will be the correct one. That’s not the case; it can pretty easily be undermined by social influences, like being told how other people had answered. These failings are a problem, because it could be a really useful tool, as demonstrated by its hypothetical uses in medical settings.
Improvements like these, then, contribute to sharpening the tool to the point where it could have robust real-world applications. “It would be hard to trust a method if it fails with ideal respondents on simple problems like [the capital of Pennsylvania],” the authors write. Fixing it so that it gets simple questions like these right is a big step in the right direction.
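Nature, 2016.
Translated by Amel; proofread by 晨晨; final review by Gabriellaz. 树屋字幕组-文翻组. Translation provided for study and exchange only; commercial use is strictly prohibited.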