Wikipedia:Wikipedia Signpost/2017-06-23/Recent research

Recent research

Utopian bubbles: Can Wikipedians create value outside of the capitalist system?

By Dorothy Howard, Simon Razniewski and Tilman Bayer

A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.

Arwid Lund

"Wikipedia, work, and capitalism. A realm of freedom?"

Review by Dorothy Howard

In his first book, Wikipedia, Work, and Capitalism. A Realm of Freedom?,^[1] Arwid Lund, lecturer in the program of Information Studies (ALM: Archives, Libraries and Museums) at Uppsala Universitet, Sweden, investigates the ideologies that he believes are shared by participants in peer-production projects like Wikipedia. The author typologizes the ways that Wikipedians understand their activities, including “playing v. gaming” and “working v. labouring,” (113-115) to explore his hypothesis that “there is a link between how Wikipedians look upon their activities and how they look upon capitalism.” (117) Lund characterizes peer-production projects by their shared resistance to information capitalism: things like copyright and pay-walled publishing, which they see as limiting creativity and innovation. His thesis is provocative. He claims that the anti-corporatist ideologies intrinsic to peer production and to Wikipedia are unrealistic because capitalism always finds a way to monetize free content. Overall, the book touches on many issues that are not usually discussed within the Wikipedia community, and it might be a useful entry point for those who want to consider the social impacts of the project.

Lund uses a combination of social critique and qualitative interviews conducted in 2012 to provide supporting evidence for his thesis. One recurrent theme is that Wikipedia is part of a larger trend in gamification, a design technique developed in Human–computer interaction (HCI) to describe the process of using features associated with "play" to motivate interaction and engagement with an interface. One example he gives is that editors report that they find Wikipedia's competitive and confrontational elements to be game-like. (143-144) He also claims that Wikipedians' descriptions of their work-play balance change as they take on more levels of responsibility and professionalism in the community, such as adminship. Still, it’s highly questionable whether the 8 interviews, which mainly focus on the Swedish Wikipedia, are a sufficient sample size to make his claims scalable.

The culture of Wikipedia valorizes altruism in its embrace of volunteering for the project to produce information for the greater good. Lund argues that Wikipedians' belief in the altruistic aspect of the project makes it easy for them to depoliticize their work and to ignore how Wikipedia participates in the corporate information economy. To him, Wikipedia is symptomatic of the devaluation of digital work, when in past generations, making an encyclopedia might be a source of income and employment opportunities for contributors.

So, he argues, contributors believe that peer production represents a space of increased autonomy, democracy, and creativity in the production of ideas. But from his view, attempts at a “counter-economy,” “hacker communism,” or “gift economies” (239, 303) are prone to manipulation, because we can’t create utopian bubbles within capitalism that aren’t privy to its influence. Still, peer production projects operate as if creation of value outside of the capitalist system is possible. Lund argues that Wikipedia cannot avoid competition with proprietary companies which see Wikipedia as a threat, and have an interest in harvesting its content for their own benefit. (218) Yet the claim would be stronger if he supported it with more examples. The reader is left wondering who these corporate interests are, and what exactly they derive from Wikipedia. Having this information would help us understand where Lund is coming from.

Marxist linguist V.N. Volosinov, one of the references for Lund's analysis

Although the word “work” in the title might suggest that Lund focuses on wage labour, the author’s aims are broader: he uses the word to connote a variety of aspects of social, value-producing activities (20), namely the production of “use-value,” the Marxist term for the productive social activity of creating things which are deemed useful and thus of value to be bought and sold in the market (even if producers don’t consider their work to be commodities). He draws from Marxist thinkers and semioticians, among them V.N. Volosinov, Terry Eagleton, and Louis Althusser, to unpack different approaches to describing why Wikipedians might feel like they are playing when they are really working. (107-108) Marxists call such assumptions “false consciousness,” but the concept is difficult because it requires us to analyze manifest and latent (discursive and non-discursive) awareness. It would have been useful for Lund to look at how the fields of anthropology or psychology talk about ideology, since both have extensively researched the topic. More stringent ethnographic or qualitative methods might have also made his argument more convincing. But, based on the references he provides, it seems that the book's target audience may be media theorists and social scientists, people who are already familiar with Marxist political economy.

Lund makes a compelling case that capitalism instrumentalizes freely-produced knowledge for its own monetary gains. Meanwhile, he says, Wikipedia's design and its heavily ideological agenda, make it difficult for the community to address the issue. The book is an interesting contribution to ongoing conversations about how Wikipedia and projects motivated by copyleft principles can be defined from a social perspective.

How does unemployment affect reading and editing Wikipedia? The impact of the Great Recession

Review by Tilman Bayer

A discussion paper titled "Economic Downturn and Volunteering: Do Economic Crises Affect Content Generation on Wikipedia?"^[2] investigates how "drastically increased unemployment" affects contribution to and readership of Wikipedia. To study this question statistically, the authors (three economists from the Centre for European Economic Research (ZEW) in Mannheim, Germany) regarded the Great Recession that began in 2008 as an "exogeneous shock" that affected unemployment rates in different European countries differently and at different times. They relate these rates to five metrics for the language version of Wikipedia that corresponds to each country:

"(1) aggregate views per month, (2) the number of active Wikipedians with a modest number of monthly edits ranging from 5 to 100, (3) the number of active Wikipedians with more than 100 monthly edits, (4) edits per article, and (5) the content growth of a corresponding language edition of Wikipedia in terms of words"

For each of these, the Wikimedia Foundation publishes monthly numbers. The researchers did not have access to country-level breakdowns of this data, which is not published for every country/language combination for privacy reasons (apart from some monthly or quarterly overviews that the authors may have overlooked, but that only start in 2009 anyway). As a consequence, "to study the relationship of country level unemployment on an entire Wikipedia, we need to focus on countries which have an (ideally) unique language". This excluded some of the European countries that were most heavily affected by the 2008 crisis, e.g. the UK, Spain or Portugal, but still left them with 22 different language versions of Wikipedia to study.

An additional analysis relates district-level (Kreise) employment data from Germany to the German Wikipedia. None of the five metrics are available with that geographical resolution, so the authors resorted to the geolocation data for the (public) IP addresses of anonymous edits (which for several large German ISPs is usually more precise than in many other countries).

In both parts of the analysis, the economic data is related to the Wikipedia participation metrics using a relatively simple statistical approach (difference in differences), whose robustness is nevertheless checked in various ways. Still, since in some cases the comparison only included 9 months before and after the start of the crisis (instead of an entire year or several years), this leaves open the question of seasonality (e.g. it is well known that Wikipedia pageviews are generally down in the summer, possibly due to factors like vacationing that might differ depending on the economic situation).
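For readers unfamiliar with difference in differences, the basic idea can be sketched in a few lines. This is only a minimal two-group, two-period illustration, not the regression setup used in the paper: the function name `diff_in_diff` and all the numbers below are invented for the example, standing in for a monthly participation metric in a language community hit hard by the crisis and in one hit only mildly.

```python
from statistics import mean


def diff_in_diff(treated_before, treated_after, control_before, control_after):
    """Return the simple 2x2 difference-in-differences estimate.

    The change in the control group is subtracted from the change in the
    treated group, so that trends common to both groups cancel out.
    """
    treated_change = mean(treated_after) - mean(treated_before)
    control_change = mean(control_after) - mean(control_before)
    return treated_change - control_change


# Hypothetical monthly counts of active editors (invented numbers,
# not taken from the paper).
hard_hit_before = [100, 102, 98]
hard_hit_after = [112, 115, 109]
mild_before = [200, 198, 202]
mild_after = [204, 206, 202]

effect = diff_in_diff(hard_hit_before, hard_hit_after, mild_before, mild_after)
print(effect)
```

The estimate is only meaningful if both groups would have followed parallel trends without the shock, which is why short comparison windows and seasonal patterns matter for the interpretation.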

Summarizing their results, the authors write:

"we find that increased unemployment is associated with higher participation of volunteers in Wikipedia and an increased rate of content generation. With higher unemployment, articles are read more frequently and the number of highly active users increases, suggesting that existing editors also increase their activity. Moreover, we find robust evidence that the number of edits per article increases, and slightly weaker support for an increased overall content growth. We find the overall effect to be rather positive than negative, which is reassuring news if the encyclopedia functions as an important knowledge base for the economy."

While leaving open the precise mechanism of these effects, the researchers speculate that "it seems that new editors begin to acquire new capabilities and devote their time to producing public goods. While we observe overall content growth, we could not find robust evidence for an increase in the number of new articles per day [...]. This suggests that the increased participation is focused on adding to the existing knowledge, rather than providing new topics or pages. Doing so requires less experience than creating new articles, which may be interpreted as a sign of learning by the new contributors."

The paper also includes an informative literature review summarizing interesting research results on unemployment, leisure time and volunteering in general. (For example, that "conditional on having Internet access, poorer people spend more time online than wealthy people as they have a lower opportunity cost of time." Also some gender-specific results that, combined with Wikipedia's well-known gender gap, might have suggested a negative effect of rising unemployment on editing activity: "Among men, working more hours is even positively correlated with participation in volunteering" and on the other hand "unemployment has a negative effect on men’s volunteering, which is not the case for women.")

It has long been observed that Wikipedia relies on the leisure time of educated people, in particular by Clay Shirky, who coined the term "cognitive surplus" for it, the title of his 2010 book. The present study provides important insights into a particular aspect of this (although the authors caution that economic crises do not uniformly increase spare time, e.g. "employed people may face larger pressure in their paid job", reducing their available time for editing Wikipedia). The paper might have benefited from including a look at the available demographic data about the life situations of Wikipedia editors (e.g. in the 2012 Wikipedia Editor survey, 60% of respondents were working full-time or part-time, and 39% were school or university students, with some overlap).

Briefly

How complete are Wikidata entries?

Author's summary by Simon Razniewski

While human-created knowledge bases (KBs) such as Wikidata usually provide high-quality data (precision), it is generally hard to understand their completeness. A conference paper titled "Assessing the Completeness of Entities in Knowledge Bases"^[3] proposes to assess the relative completeness of entities in knowledge bases by comparing the extent of their information with that of other, similar entities. It outlines building blocks of this approach and presents a prototypical implementation, which is available on Wikidata as Recoin (https://www.wikidata.org/wiki/User:Ls1g/Recoin).
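The idea of relative completeness can be illustrated with a small sketch. It is an assumption-laden toy, not Recoin's actual implementation: the function `missing_common_properties`, the `min_share` threshold and the example property sets are all hypothetical. The sketch simply lists the properties that many similar entities have but the entity in question lacks.

```python
from collections import Counter


def missing_common_properties(entity, peers, min_share=0.5):
    """List properties that at least `min_share` of peers have but `entity` lacks.

    `entity` is a set of property names; `peers` is a list of such sets for
    similar entities. Returns (property, share) pairs, most common first.
    """
    counts = Counter(prop for props in peers for prop in set(props))
    n = len(peers)
    missing = [
        (prop, counts[prop] / n)
        for prop in counts
        if counts[prop] / n >= min_share and prop not in entity
    ]
    # Sort by descending share, then alphabetically for a stable order.
    return sorted(missing, key=lambda item: (-item[1], item[0]))


# Hypothetical property sets for four similar entities (invented data).
peers = [
    {"date of birth", "place of birth", "occupation"},
    {"date of birth", "occupation", "employer"},
    {"date of birth", "place of birth", "occupation", "employer"},
    {"date of birth", "occupation"},
]
entity = {"occupation", "employer"}

result = missing_common_properties(entity, peers)
print(result)
```

How "similar entities" are chosen and how the shares are turned into a completeness score are design decisions the sketch leaves open.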

"Cardinal Virtues: Extracting Relation Cardinalities from Text"

Author's summary by Simon Razniewski

Information extraction (IE) from text has largely focused on relations between individual entities, such as who has won which award. However, some facts are never fully mentioned, and no IE method has perfect recall. Thus, it is beneficial to also tap contents about the cardinalities of these relations, for example, how many awards someone has won. This paper^[4] introduces this novel problem of extracting cardinalities and discusses the specific challenges that set it apart from standard IE. It presents a distant supervision method using conditional random fields. A preliminary evaluation that compares information extracted from Wikipedia with that available on Wikidata shows a precision between 3% and 55%, depending on the difficulty of relations.

Conferences and events

See the research events page on Meta-wiki for upcoming conferences and events, including submission deadlines.

Other recent publications

Other recent publications that could not be covered in time for this issue include the items listed below. Contributions are always welcome for reviewing or summarizing newly published research.

Compiled by Tilman Bayer

"Learning by comparing with Wikipedia: the value to students’ learning"^[5] From the paper: "The main purpose of this research work is to describe and evaluate a learning technique that actively uses Wikipedia in an online master’s degree course in Statistics. It is based on the comparison between Wikipedia content and standard academic learning materials. We define this technique as ‘learning by comparing’. [...] The main result of the paper shows that [...] active use of Wikipedia in the learning process, through the learning-by-comparing technique, improves the students’ academic performance. [...] The main findings on the students’ perceived quality of Wikipedia indicate that they agree with the idea that the encyclopaedia is complete, reliable, current and useful. Although there is a positive perception of quality, there are some quality factors that obtain better scores than others. The most valued quality aspect was the currentness of the content, and the least valued was its completeness."
"Use and awareness of Wikipedia among the M.C.A students of C. D. Jain college of commerce, Shrirampur : A Study"^[6]
"Comparative assessment of three quality frameworks for statistics derived from big data: the cases of Wikipedia page views and Automatic Identification Systems"^[7] From the abstract: "We apply these three quality frameworks in the context of 'experimental' cultural statistics based on Wikipedia page views"
"Discovery and efficient reuse of technology pictures using Wikimedia infrastructures. A proposal"^[8] From the abstract: "With our proposal, we hope to serve a broad audience which looks up a scientific or technical term in a web search portal first. Until now, this audience has little chance to find an openly accessible and reusable image narrowly matching their search term on first try .."
"Extracting scientists from Wikipedia"^[9] From the abstract: "... we describe a system that gathers information from Wikipedia articles and existing data from Wikidata, which is then combined and put in a searchable database. This system is dedicated to making the process of finding scientists both quicker and easier."
"Where the streets have known names"^[10] From the abstract: "We present (1) a technique to establish a correspondence between street names and the entities that they refer to. The method is based on Wikidata, a knowledge base derived from Wikipedia. The accuracy of this mapping is evaluated on a sample of streets in Rome. As this approach reaches limited coverage, we propose to tap local knowledge with (2) a simple web platform. ... As a result, we design (3) an enriched OpenStreetMap web map where each street name can be explored in terms of the properties of its associated entity."

References

[1] Lund, Arwid (2017). Wikipedia, Work, and Capitalism. Springer: Dynamics of Virtual Work. ISBN 9783319506890.
[2] Kummer, Michael E.; Slivko, Olga; Zhang, Xiaoquan (Michael) (2015-11-01). Economic Downturn and Volunteering: Do Economic Crises Affect Content Generation on Wikipedia?. Rochester, NY: Social Science Research Network.
[3] Ahmeti, Albin; Razniewski, Simon; Polleres, Axel (2017). Assessing the Completeness of Entities in Knowledge Bases. ESWC.
[4] Mirza, Paramita; Razniewski, Simon; Darari, Fariz; Weikum, Gerhard (2017). Cardinal Virtues: Extracting Relation Cardinalities from Text. ACL.
[5] Meseguer-Artola, Antoni (2014-05-26). "Aprenent mitjançant la comparació amb la Wikipedia: la seva importància en l'aprenentatge dels estudiants". RUSC. Universities and Knowledge Society Journal. 11 (2): 57–69. doi:10.7238/rusc.v11i2.2042. ISSN 1698-580X. ("Learning by comparing with Wikipedia: the value to students’ learning", in English with Catalan abstract)
[6] Pathade, Prasad R. "Use and awareness of Wikipedia among the M.C.A students of C. D. Jain college of commerce, Shrirampur : A Study" (PDF). International Multidisciplinary e-Journal. ISSN 2277-4262.
[7] Reis, Fernando; di Consiglio, Loredana; Kovachev, Bogomil; Wirthmann, Albrecht; Skaliotis, Michail (June 2016). Comparative assessment of three quality frameworks for statistics derived from big data: the cases of Wikipedia page views and Automatic Identification Systems (PDF). European Conference on Quality in Official Statistics (Q2016). Madrid. p. 16.
[8] Heller; Blümel; Cartellieri; Wartena. "Discovery and efficient reuse of technology pictures using Wikimedia infrastructures. A proposal". Zenodo. doi:10.5281/zenodo.51562.
[9] Ekenstierna, Gustaf Harari; Lam, Victor Shu-Ming (2016). "Extracting Scientists from Wikipedia". From Digitization to Knowledge 2016. p. 8.
[10] Almeida, Paulo Dias; Rocha, Jorge Gustavo; Ballatore, Andrea; Zipf, Alexander (2016-07-04). "Where the Streets Have Known Names". In Osvaldo Gervasi; Beniamino Murgante; Sanjay Misra; Ana Maria A. C. Rocha; Carmelo M. Torre; David Taniar; Bernady O. Apduhan; Elena Stankova; Shangguang Wang (eds.). Computational Science and Its Applications -- ICCSA 2016. Lecture Notes in Computer Science. Springer International Publishing. pp. 1–12. ISBN 9783319420882.


Discuss this story


Nicely done. --Piotr Konieczny aka Prokonsul Piotrus 07:37, 23 June 2017 (UTC)

Indeed! Great reviews, thank you very much. (And I'm glad there is someone reading the Marxist theoretical approach to Wikipedia, because I'd rather not have to do it myself.... ;) ) The Land (talk) 15:55, 23 June 2017 (UTC)

Wikipedia, Work, and Capitalism is a 350 page book that I have not finished reading so I cannot speak to all of it, but Dorothy's summary of the premise is fair enough. I do worry about commercial interests exploiting the goodwill of Wikimedia contributors. At the same time, I also feel like Wikimedia projects are significant beyond just the content that they host. Among the top 500 websites as ranked by Alexa Internet, there are only a few noncommercial organizations present (checked October 2016): 5. Wikipedia, 99. BBC (US), 105. BBC (UK), 137. The Pirate Bay, 192. Mozilla, 227. Internet Archive, 234. National Institutes of Health, 423. United States Postal Service. Among those, only Wikipedia and maybe Internet Archive are seeking to provide a community space for everyone to participate in noncommercial service for everyone else. I feel like Wikipedia might be the last public space that the world might know for some generations, because in every other non-niche outlet there are commercial interests grabbing for any scrap of data or attention that they can extract from anyone. If public media ever mattered then Wikimedia projects are preserving that community space and trying to stake claims to keep it for the future. If it is ever lost then maybe it could be gone for a long time. It is unfortunate that so much Wikimedia project success depends on the labor of so few volunteers who give so much and see their research, writing, photography and the rest used by commercial organizations who take but do not give back their fair share or social due. Blue Rasberry (talk) 19:50, 23 June 2017 (UTC)
Rather interesting research. Thanks for pointing us to the paper by Kummer et al. It is hard to believe that it took more than a year and a half until it reached the community. You might like to know that Günter Schuler as early as 2007 in his book Wikipedia inside (207ff, 214ff) argued that the blossoming of Web 2.0, which Wikipedia is a part of, was after all mostly due to the wave of neoliberalism which had increased an interest in "sense" and idealistic values vs. economical effectiveness, metrics etc., fuelling the influx of editors to blogs and wikis in the period 2000–2005. So there seems to be a constant point: the Zeitreichen, or those rich in time and leisure, mostly keep editing the web. --Aschmidt (talk) 22:12, 23 June 2017 (UTC)

A minor comment: The authors are not three economists from the Centre for European Economic Research (ZEW) in Mannheim, Germany, but rather: Michael Kummer (from Georgia Institute of Technology), Olga Slivko (an economist from ZEW), and Michael Zhang (Hong Kong University of Science and Technology). -- Preceding unsigned comment added by 193.196.11.188 (talk) 08:44, 26 June 2017 (UTC)

Always great to have new research on Wikipedia.

"Wikipedia, work, and capitalism. A realm of freedom?"

He claims that the anti-corporatist ideologies intrinsic to peer production and to Wikipedia are unrealistic because capitalism always finds a way to monetize free content.

I'd disagree for two reasons: first off, who says that such anti-corporatist ideologies necessarily have a problem with capitalism monetizing free content? For instance I don't have a problem with companies making profits from any of my contributions online even without attribution (as long as my contributions themselves remain open in such a way that they could theoretically also be used for non-profit projects). Also companies monetizing free content can be constructive for society as well and within the current structure in many cases it might not be feasible (due to time/resources required) for non-profit projects to put such content to use in the same way.

Secondly, capitalism will simply be overcome. I'm stating this so confidently not only because our socioeconomic system has changed many times in human history but because everything points to our civilization collapsing / we as a species actually dying out if we don't manage to change our socioeconomic structures within the next few decades in ways that are so fundamental as to warrant a new name for the succeeding model. And imo Wikipedia is part of the paradigm that is already overturning capitalism from within and which already employs the logic of a (imo that) new model.

One recurrent theme is that Wikipedia is part of a larger trend in gamification—a design technique developed in Human–computer interaction (HCI) to describe the process of using features associated with "play" to motivate interaction and engagement with an interface. One example he gives is that editors report that they find Wikipedia's competitive and confrontational elements to be game-like.

I don't think interactions with and on Wikipedia could largely be described as applying gamification. Gamification is the active application of typical game playing elements such as point scoring, awards and competitions to non-game / work-like processes. None of that is really done on Wikipedia - there are edit counts, thank yous and barnstars etc but most don't pay attention to them / use them and mostly (and most importantly) they're not actively applied to motivate engagement. There is some low-profile, marginal gamification going on on Wikipedia - but describing Wikipedia as being "part of a larger trend in gamification" would imo be plain out wrong.
Even less prevalent examples of gamification on Wikipedia would be: CitationHunt and The WikiData game and two suggested gamification elements would be: an auto-congratulatory feature and Edit counts of subject-area editors / WikiProject leaderboards.

Still, it’s highly questionable whether the 8 interviews, which mainly focus on the Swedish Wikipedia, are a sufficient sample size to make his claims scalable.

Good remark. Such a type of interviews is an outdated mode of gaining insights / feedback and I don't think they are sufficient.

To him, Wikipedia is symptomatic of the devaluation of digital work, when in past generations, making an encyclopedia might be a source of income and employment opportunities for contributors.

What a strange way to look at this. It does not devalue digital work. Actually it values knowledge so highly that it has to be disassociated from monetary incentives. Societally highly constructive work such as contributions to open knowledge, open data and online encyclopedias should actually be valued by society so that people who engage in such can allocate their full time to it. But it isn't. Which means society should change. And with that − at least in the case of Wikipedia − I'm less referring to the mentality of people but to the societal structures that organize people's self-sustenance and work-allocation (such as the kinds of "monetary income" and "employment" we know of thus far).

But from his view, attempts at a “counter-economy,” “hacker communism,” or “gift economies” (239, 303) are prone to manipulation, because we can’t create utopian bubbles within capitalism that aren’t privy to its influence.

That's one reason why projects such as Wikipedia need to remain highly adaptive and find proper mechanisms and measures to thwart such "manipulations". I don't think Wikipedia has been manipulated at its core even though its surrounding capitalism causes e.g. marketers to attempt to use it in the most beneficial way.

why Wikipedians might feel like they are playing when they are really working

I don't think that Wikipedians in general or in large numbers "feel like they are playing when they are really working". I think it's a new mode of engagement that is not accurately described as either. Something akin to prosumptive, interactive learning/contributing. To me it's more fun than playing because you will actually achieve sth / have an impact / be constructive in a meaningful way.

Lund makes a compelling case that capitalism instrumentalizes freely-produced knowledge for its own monetary gains. Meanwhile, he says, Wikipedia's design and its heavily ideological agenda, make it difficult for the community to address the issue.

Did he also explain why that would be an issue at all?

How does unemployment affect reading and editing Wikipedia? The impact of the Great Recession

Concerning the remarks of what data the researcher could also have used: it would be a good idea to have researchers announce (the subject of) their research somewhere so that people can help by providing relevant data, crowdsource information and provide technical support etc. This should probably be done for all kinds of research (in streamlined manners) but let's start with research on Wikipedia.

And as a relevant sidenote I've become jobless recently and you're likely to see my contributions spike (once I got some other issues fixed) as I now have more time for Wikipedia (& FOSS) which I consider way more constructive for society and self-actualizing than work within the market economy, generating profits etc. Sadly I won't be able to sustain this situation (of joblessness) for long. Note that imo that doesn't really affect what I stated in my comments about the book above - I enjoy being jobless (as long as I can also safely meet my basic needs etc; won't be able to sustain it for long but maybe I can establish a similar situation in a year or so) and I'm not frustrated with the current system but think it's outdated.

"Extracting scientists from Wikipedia"

There should be a global IT system which in effect maps people to their expertise and skills which can then be effectively made use of by those in need for such, aiming to collaborate or exchange or engage in relevant projects. I don't think Wikipedia is very efficient in such as it only features a fraction of notable scientists. It needs a new project / website to do this (which might start off with data gained from Wikipedia).

"Where the streets have known names"

This might be relevant to Wikipedia:Smart city uses of Wikipedia.

–

And as a sidenote, relevant to my comments above, it imo also needs a proper IT system which indexes research and allows feedback and continuation etc beyond other academic research and the press simply picking up/referencing it.

--Fixuture (talk) 22:38, 28 June 2017 (UTC)

@Fixuture: I agree that Wikipedia isn't gamified in the sense that one might expect (compared to for instance Stack Overflow), but as you mention there are places where it pops up. In addition to the ones you point to, there's the Wikipedia Adventure, which has recently been published at a research conference (see this blog post for more info). Secondly, there's the WikiCup, which I studied in our 2015 CSCW paper about quality improvement projects (PDF here) and found some behaviour that was very similar to what you see when it comes to badges in Stack Overflow: participants get points for Good Articles, but nothing for B-class quality, so they rarely (if ever) stop at the latter. Given that these game-like elements pop up in several places, I'm not surprised that participants would describe Wikipedia in ways that can be interpreted as the site being "game-like". Cheers, Nettrom (talk) 14:30, 30 June 2017 (UTC)
