
Journal of Central South University (Social Sciences)
ZHONGNAN DAXUE XUEBAO(SHEHUI KEXUE BAN)

March 2018, Vol. 24, No. 2




Article ID: 1672-3104(2018)02-0148-11

The Boundary of “Computation”: Internet Big Data and Social Research

HAO Long

(Department of Sociology, Wuhan University, Wuhan, Hubei 430072)

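Abstract: Computation on internet big data is one of the main directions of methodological innovation in current social research. Some purely data-driven scholars hold that big data is generated independently of research, so that it both records people's real attitudes and natural behavior and escapes interference from researchers and from the research itself. This view rests on three epistemological assumptions: “totality”, “reality-naturalness”, and “objectivity”. These assumptions fail in many cases, however, as shown by the age and class boundaries created by the digital divide, the group and topic boundaries drawn by differentiated production, the false (non-real) and skewed (non-natural) conditions brought about by data manipulation and data guidance, and the human interference latent throughout the process of data production, mining, and analysis. Recognizing the “computable” boundary of internet big data therefore carries important theoretical and methodological significance for advancing the application of data computation in social research.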

Keywords: internet; big data; computational paradigm; missing data; data skew; data manipulation



The boundary of computation: Internet big data and social research

HAO Long

(Department of Sociology, Wuhan University, Wuhan 430072, China)

Abstract: The computation of internet big data is one of the main directions of innovation in current social research methods. Some purely data-driven scholars believe that big data, being generated independently of research, can not only record information about people's real attitudes and natural behaviors, but also avoid interference from researchers and the research itself. Big data is therefore assumed to be “total”, “real-natural”, and “objective”. However, the age and class boundaries created by the digital divide, the group and topic boundaries delineated by differentiated production, the false (non-real) and skewed (unnatural) conditions brought about by data manipulation and data guidance, and the human interference hidden throughout the process of data production, mining, and analysis all show that these assumptions do not hold in many cases. Recognizing the computable boundary of internet big data thus has important theoretical and methodological significance for promoting the application of data computing in social research.

Key words: internet; big data; computational paradigm; missing data; skewed data; manipulated data
