Wikipedia:Wikipedia Signpost/2023-02-20/In the media

In the media

Arbitrators open case after article alleges Wikipedia "intentionally distorts" Holocaust coverage

Wikipedians rebut paper alleging "intentional distortion" of Holocaust history

Jan Grabowski of the University of Ottawa, one of the two authors of the paper

An essay published on 9 February 2023 in The Journal of Holocaust Research and reported on by Haaretz and Polish daily Wyborcza, as well as San Diego Jewish World and Ynet, alleges that Wikipedia engages in "intentional distortion of the history of the Holocaust". The abstract of the essay, written by Jan Grabowski of the Department of History at the University of Ottawa and Shira Klein of the Department of History at Chapman University in Orange, California, says:

This essay uncovers the systematic, intentional distortion of Holocaust history on the English-language Wikipedia, the world’s largest encyclopedia. In the last decade, a group of committed Wikipedia editors have been promoting a skewed version of history on Wikipedia, one touted by right-wing Polish nationalists, which whitewashes the role of Polish society in the Holocaust and bolsters stereotypes about Jews. Due to this group's zealous handiwork, Wikipedia's articles on the Holocaust in Poland minimize Polish antisemitism, exaggerate the Poles' role in saving Jews, insinuate that most Jews supported Communism and conspired with Communists to betray Poles (Żydokomuna or Judeo–Bolshevism), blame Jews for their own persecution, and inflate Jewish collaboration with the Nazis. To explain how distortionist editors have succeeded in imposing this narrative, despite the efforts of opposing editors to correct it, we employ an innovative methodology. We examine 25 public-facing Wikipedia articles and nearly 300 of Wikipedia’s back pages, including talk pages, noticeboards, and arbitration cases. We complement these with interviews of editors in the field and statistical data gleaned through Wikipedia's tool suites. This essay contributes to the study of Holocaust memory, revealing the digital mechanisms by which ideological zeal, prejudice, and bias trump reason and historical accuracy. More broadly, we break new ground in the field of the digital humanities, modelling an in-depth examination of how Wikipedia editors negotiate and manufacture information for the rest of the world to consume.

On 13 February 2023, Wikipedia's primary disciplinary body, the Arbitration Committee, took the unusual step of initiating a case request sua sponte in response to the essay, "invoking its jurisdiction over all matters previously heard and exercising its authority to revisit any proceeding at any time at its sole discretion." The topic area – including many of the edits and behaviours discussed by Grabowski and Klein – has been the subject of multiple arbitration proceedings before, from the 2009 Eastern European mailing list case to the 2019 Antisemitism in Poland case.

On 15 February 2023, Wyborcza (the Polish newspaper that carried Grabowski and Klein's summary of their essay) published a rebuttal by Piotr Konieczny of the Department of Media and Social Informatics at Hanyang University, one of the Wikipedians (User:Piotrus) named by Grabowski and Klein. Piotr said the essay contained many assertions of fact that were not borne out by edit histories recorded on Wikipedia, as well as instances of selective quoting. For those who don't subscribe to Wyborcza – the paper is paywalled – the text of the rebuttal is available here. A longer, English-language response by Piotr raising some of the same issues is here. Volunteer Marek, another editor named in the essay, has also published a multi-part response in English on his Substack.

Watch for an independent review of the paper in the Signpost's monthly "Recent research" section in an upcoming issue. In the meantime, see also previous Signpost coverage of similar complaints raised in 2019. – AK

"Why we should be wary of Wikipedia"

Investigative journalist Russ Baker

This is the premise of an article series that investigative journalist Russ Baker kicked off on 6 February 2023 with a piece on his whowhatwhy.org website.

Baker is a veteran reporter who has written for top publications like The New Yorker and The Washington Post. He has tussled with the Church of Scientology. In 2005, he won the Deadline Club award for his exclusive reporting on George W. Bush's military record. Baker was among the first to cast doubt on Colin Powell's now-infamous presentation on Iraq at the United Nations – at the time a very unpopular stance – and among the first to make Americans aware of the impending genocide in Rwanda. But presently, he is concerned about Wikipedia's biographies.

Baker notes that discussions of bias on Wikipedia have generally focused on its alleged "white, American" bias as well as its alleged "leftist" bias:

But none of these critiques really get at what I'm talking about: how professional or amateur "hit men" can infiltrate Wikipedia and go after individuals and ruin them in the public eye.

Some years ago, when I began researching this, I found very little online about this phenomenon, despite the fact that I knew a fair number of individuals who had been victims of the practice. Now that I look again, I still see no sign that this problem is being addressed or even vestigially discussed.

What this means is, nobody is minding the store to make sure that we don’t end up in some type of artificial informational construct that edits the facts about powerful actors and institutions to conform to a subjective agenda instead of reality.

Baker feels that anti-establishment actors like himself are particularly likely to suffer, given that Wikipedia uses the mainstream media as its arbiter, and these media are in many ways an integral part of the establishment.

The fact is, anyone who is out there "making trouble" for the system doesn't stand a chance. Why? Because it would take a relentless, inhuman vigilance to battle those persistent and tidal forces bent on controlling the narrative … And most of us don't have the time, expertise, or energy to do that.

Also, because, for someone to make their case, they have to prove that good things have been said about them … by the establishment.

In other words, if The Washington Post likes you, then you appear in a positive light on Wikipedia. If it doesn't, then what the public sees on the platform is a person or entity it should apparently not like or trust.

In the second part (published a week later), Baker looks at his own Wikipedia biography, which he says is highly selective in a way that is unfavourable to him.

And yes, I received several scathing reviews from establishment organs, but my Wikipedia page never quoted any of the good ones I received — and from prominent people. In fact, they implied there were none. Here’s what you won’t find on Wikipedia:

  • One of the most important books of the past ten years. — Gore Vidal
  • An investigative gem filled with juicy revelations. — Sydney Schanberg, Pulitzer Prize winner, The New York Times
  • A tour de force… Family of Secrets has made me rethink even those events I witnessed with my own eyes. — Dan Rather
  • Russ Baker's work stands out for its fierce independence, fact-based reporting, and concern for what matters most to our democracy… A lot of us look to Russ to tell us what we didn't know. — Bill Moyers
  • This is the book people will be mining for years to come. — David Margolick, Newsweek and Vanity Fair

There is no reason to believe these quotes aren't genuine – Schanberg, for example, joined Baker for readings of the book in question. The problem is that these are "Praise for ..." quotes from a marketing blurb rather than quotes taken from published reviews. Wikipedians would generally avoid citing marketing materials, and look for independently published reviews in the press. So, is Baker merely whining?

Well, no. Reading the fairly sympathetic Boston magazine article quoted in Baker's biography, it's hard to escape the notion that editors selected quotes to construct a narrative completely at odds with the overall tenor of the cited article. The Boston article concludes by asking, in light of important stories broken by Baker in the past, "which is more dangerous, listening to Russ Baker, or ignoring him?" – AK

"Share profits with authors!"

OpenAI, creator of the generative pre-trained transformer (of which ChatGPT is one), reportedly pays its clickworkers in India and Africa $2 an hour

This is the provocative title of an article in Germany's Der Tagesspiegel newspaper, opining that generative models such as ChatGPT, which create text, images and music, are committing "data theft" and leaving creators "naked". The article discusses the unsung contributions of the many:

Wikipedia authors, book authors, illustrators, editors, photographers. Their work creates the raw materials that then enter an industrial process: the training data used to feed the AI.

Tech companies like Google and Amazon have used free and open internet content like the English-language Wikipedia as a quarry for years, without giving the authors or organisations a share. The paltry sums that Google and Amazon donate to the Wikimedia Foundation are dwarfed by the economic benefit these corporations derive from the online encyclopaedia.

Now, it has always been an inherent flaw of the commons idea that profit-oriented actors are as welcome to benefit from non-profit work as the general public. The "tragedy of the commons" dilemma is well known from economics. One cannot forbid Amazon from training its voice assistant Alexa with Wikipedia texts – or Wikipedia would have to jettison its foundational principles overnight.

But the relentlessness with which tech companies graze the digital commons and use it to feed their own business models raises the question under what circumstances commons will continue to be produced in the future. Who will maintain Wikipedia articles if they are used for commercialised search queries or answer modules? Who will still write books if language models glue together set pieces into third-rate novels and publishers use them to fill their portfolios?

Der Tagesspiegel proposes a compensation system that gives authors an appropriate share in AI systems' profits, citing Germany's long-established VG Wort (cf. Authors' Licensing and Collecting Society) as an example. Under that system, authors register and are then routinely compensated with fees collected from re-users of their works, according to a complex allocation formula.

After all, no one would have a street artist paint their portrait and then, after taking a digital picture of it and editing it with an AI-based Instafilter, tell the painter, without paying, "Thanks a lot, that was fun!" Respect for art is also expressed through decent payment.

(VG Wort has previously indicated that Wikipedia would be eligible for payments under its existing system. However, in a 2011 poll the German Wikipedia community overwhelmingly voted against the site participating in the scheme, although some individual Wikipedia editors were collecting payments separately for "their" articles, amounting to 300 euros in one case. See previous Signpost coverage: "German Wikipedians reject author payments scheme".)

The Tagesspiegel article ends by noting that OpenAI, the developer of ChatGPT, employs thousands of clickworkers in Uganda, Kenya and India to label potentially offensive text, including violent or sexual material, to help train the models. In Kenya, where the average wage is about $18 per day, these workers are paid less than $2 an hour. – AK

Wikipedia blocked and unblocked in Pakistan

As discussed in this issue's news and notes, Wikipedia is back in Pakistan after a fairly brief block. The Prime Minister found that "the unintended consequences of this blanket ban outweigh its benefits", and formed a new committee to look at technical measures for selectively blocking specific objectionable content. However, back in 2015 Wikipedia switched to HTTPS specifically to make it more difficult for ISPs and other men-in-the-middle to see which pages a reader is browsing, which would complicate any such selective blocking. Numerous outlets reported on the story, including Dawn (and again here), Bloomberg News, ABC News, Al Jazeera, and NPR. – AC

Wikimedia Foundation vs. NSA

The Washington Examiner reports that next week, the United States Supreme Court justices will decide whether to take up the longstanding case between the Wikimedia Foundation and the National Security Agency. "To this day, no public court has determined whether upstream surveillance complies with the Constitution. If the government can obtain dismissal here, it will have every incentive to make overstated or exaggerated claims of secrecy to close the courthouse doors on suits like Wikimedia's – suits seeking accountability for government overreach or abuse in the name of national security," the article's author, Bob Goodlatte, states. – AK

UPDATE: The Supreme Court denied the Wikimedia Foundation's petition on 21 February 2023, marking the end of the case. – AK

In brief

Former Wikimedia CEO Katherine Maher
  • "Major search engines to alter results with AI": Former Wikimedia CEO Katherine Maher spoke in a five-minute interview to ABC News (Australia) about the likely impact of tools like GPT on search engines and Wikipedia.
  • Google Bard AI trained in part on Wikipedia: Search Engine Journal reports that Google's Bard AI is 12.5% based on English Wikipedia. Another 12.5% comes from Google's C4 Dataset ("Colossal Clean Crawled Corpus"), which apparently also includes Wikipedia as one of its main sources. The remaining 75% comes "from the internet", but its precise origins are "murky", the article says – though the author makes some educated guesses.
  • Santos is funny: especially the disgraced US congressman's official web link to Wikipedia, according to the Indiana Daily Student. The link simply points to 118th United States Congress, which only mentions that Santos represents New York's 3rd district. The honorable representative might also want to link here or to previous coverage in The Signpost. Where's Randy Rainbow when we need him?
  • Wikipedia and its "outsize influence on judicial reasoning": Legal Futures (UK) says that the "widespread use of online source Wikipedia by senior judges could mean fake information spreading, leading to bad judgments, an update of research first revealed last year has warned".
  • Political "scrubbing": In a "flashback" article published on the occasion of the appointment of Jeff Zients as the White House's new chief of staff, Fox News recaps a 2020 report by Politico about how consulting firm Saguaro Strategies had "scrubbed" politically damaging information from the Wikipedia page about him (see also "In the media" from December 28, 2020).
  • Jimmy Wales comments on Online Safety Bill: IT Pro quotes comments by Jimmy Wales on the Online Safety Bill, a proposed UK law that was covered extensively in our previous issue (see Special report).
  • You got something against Bigfoot?: Slate says you do, you cryptid-phobe.
  • American Physical Society partnership: APS News says "Wikipedia Has a Problem That Physicists Can Help Solve" – a gender gap problem addressed through the Wiki Scientist Program. More about the program is at the WikiEdu blog.
  • Cry foul: A "former controversial umpire" for Major League Baseball may be editing his own Wikipedia page, according to The Sporting News [1]. The Signpost can confirm that a user account has been blocked and talkpage access revoked following a legal threat on the account's talkpage.



Do you want to contribute to "In the media" by writing a story or even just an "in brief" item? Edit next week's edition in the Newsroom or leave a tip on the suggestions page.

