Wikipedia:Wikipedia Signpost/Single/2019-11-29
Put on your birthday best
Please vote in the current ArbCom election, if you haven’t already. As of November 28, 1,483 voters have submitted a ballot, compared to 1,858 last year with 4 days left to vote. Ballots may be submitted until 23:59, 2 December 2019 (UTC).
Several indicators of Wikipedia’s progress will be celebrated over the next several weeks. The English-language Wikipedia is likely to mark its six-millionth article sometime between New Year’s Eve 2019 and January 8, 2020. Wikipedia itself will mark its 19th birthday on Wikipedia Day, January 15, and The Signpost reaches its 15th birthday on January 10.
Like Wikipedia, The Signpost has been involved in a few disagreements but continues on an upward path. The Signpost, we are convinced, is the best place on Wikipedia for Wikipedians to write about, read about, and learn about Wikipedia. You can help us prepare for our upcoming birthday by contributing in many ways. No, we're not asking for money, but your participation in our little newspaper will ensure our continued success.
You can contribute in several ways:
- Comments – just add your thoughts to the comments section below the articles. Unlike many online publications, our comments sections are a feature, not a bug. Starting an organized, well thought-out discussion on important subjects is a major goal of our articles. Please let us know what you think.
- Suggestions – if you think we should cover a story but you don't want to write it yourself, drop a suggestion off at Wikipedia:Wikipedia Signpost/Newsroom/Suggestions, including as much solid information as you can such as links to Wikipedia pages or to news stories.
- "In brief" sections in "In the media", "News & notes", "Discussion report", and "Arbitration report" – if you'd like to write one or two sentences about a story that fits in these columns, please just write it up. We’ll fact check it and may extend the coverage if we find something else. Be sure to sign it or leave a short note if you want credit: for example, -S
- Adding a paragraph or more to a story – if you'd like to write more than a couple of sentences, please mention this at the newsroom talk page and we'll get you started.
- Submit an article – if it's your first submission to The Signpost it might be easiest to start a draft in your user space and drop us a note at the newsroom talk page. If the proposed article is sensitive, e-mail it to the editor-in-chief. Submitting the article via Wikipedia:Wikipedia Signpost/Newsroom/Submissions at least a week before the next issue is the usual method.
- Start a new regular column – let us know what you'd like to write about on a regular basis and we'll consider it. Right now it would be great to have a regular column on the cultural and academic sectors – call it GLAM+ if you'd like. We'd love to have a column written by a woman – but we don't want a "woman's column" – nothing like Better Homes and Gardens anyway. There are women Wikipedia editors who have different interests than most male editors and we’d love to offer something to these editors – but please don't assume that there won't be male editors interested in the same material. Regular columns on Wikipedians in Africa, Asia, Australia, Europe, or South America could work. Any proposals for new columns will be carefully considered.
- Join our copyeditors – by letting us know on the newsroom talk page.
- Become an editor – this involves writing, selecting stories, editing with the aim of improving content as well as presentation, basic copyediting, finding Wikipedia editors who have a story to tell, and helping the other editors in planning the newspaper's direction. In short, doing everything that it takes to get this publication out every month.
- Any other improvements that you'd like to suggest and can work on – just let us know.
Our system of writing and publishing is a combination of individual work and group effort. You will get credit via a byline in most cases, but at least one other editor will check your work, and help you with fact checking and copyediting. The editor-in-chief will then check that Wikipedia's rules have been followed.
What are the rules that apply to writing for The Signpost? This is a WikiProject like many others, such as WP:Military history or the National Register of Historic Places WikiProject. Within broad limits we set our own rules, like how an article is approved for publication or how the editor-in-chief is selected. It is important to remember that we are not writing encyclopedia articles for the mainspace, but writing journalism for a newspaper. Journalistic standards apply as well as Wikipedia rules. The policy on not including original research does not apply to Signpost articles. We always strive to be fair and accurate in our news articles, but occasionally the exact wording of the neutral point of view policy may not apply. We encourage opinion and humor pieces as well as news stories.
The policy on biographies of living persons does apply to all pages on Wikipedia, but this does not mean that we can't write about administrators, paid editors, or any other editors who put themselves in the public eye. If an article meets journalistic standards and the text would be acceptable on another WikiProject, at the administrators' incidents notice board, or during public Arbitration Committee proceedings, then it will be acceptable on The Signpost when approved by the editor-in-chief.
There is another way that our regular readers can contribute to The Signpost. Some Wikipedians love to argue vociferously and at length about what many people consider to be minor matters. Some writers find it difficult to accept that one of their submissions has been rejected. Others love to argue about grammar. It would do wonders for the morale of our staff if readers would occasionally let these folks know that we are volunteers contributing our time and just trying to do our best. Reader participation really is the key to our future success.
There are some types of "contributions" that we do not accept. For example, sometimes a subject of an article decides that they are better qualified to report on themselves than our reporter is. This almost never works out. If you are the subject of an article and anything the least bit controversial is reported, then the reporter will contact you for your side of the story. Letting the article subject write the article itself will likely deprive our readers of other sides of the story. If you'd like to write an opinion piece about yourself, you may contact the editor-in-chief, but this type of article is not in great demand.
A particularly obnoxious "contribution" that we will never accept is from those people who try to inject their point of view into a news article, or into some other author's opinion piece in the couple of hours just before publication. There are multiple aspects of articles we have to check and recheck before publication. Interfering with this process is at best obstructionism. Uninvited submissions are generally not accepted during the day before publication. Trying to edit war your opinions into an article at this time is a form of censorship, and is simply unacceptable.
So please do consider how you can best contribute to our upcoming birthday celebration. We appreciate your support.
How soon for the next million articles?
English Wikipedia new milestones
There are currently 6,930,382 articles on Wikipedia.
The English Wikipedia will reach six million articles around January 1 or perhaps a bit later, according to our estimate. Previous milestones are noted below.
Previous milestones | Date | Article |
---|---|---|
1 million | 1 March 2006 | Jordanhill railway station |
2 million | 9 September 2007 | El Hormiguero |
3 million | 17 August 2009 | Beate Eriksen |
4 million | 13 July 2012 | Ezbet el-Borg |
5 million | 1 November 2015 | Persoonia terminalis |
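For readers who want to check the pace themselves, the gaps between the milestones in the table can be worked out with a few lines of Python. This is only a rough sketch: the dates are copied from the table above, the function name `days_between_milestones` is our own invention, and the January 1, 2020 date for the sixth million is the estimate given in this article, not a recorded milestone.

```python
from datetime import date

# Milestone dates copied from the table above (millions of articles -> date reached).
MILESTONES = {
    1: date(2006, 3, 1),
    2: date(2007, 9, 9),
    3: date(2009, 8, 17),
    4: date(2012, 7, 13),
    5: date(2015, 11, 1),
}

# Assumption: the article's own estimate for the sixth million, not a recorded date.
ESTIMATED_SIXTH = date(2020, 1, 1)


def days_between_milestones(milestones):
    """Return {million: days it took to add that million articles}."""
    ordered = sorted(milestones.items())
    return {
        later_n: (later_date - earlier_date).days
        for (_, earlier_date), (later_n, later_date) in zip(ordered, ordered[1:])
    }


intervals = days_between_milestones(MILESTONES)
estimated_sixth_days = (ESTIMATED_SIXTH - MILESTONES[5]).days

for million, days in intervals.items():
    print(f"Million #{million}: {days} days ({days / 365.25:.1f} years)")
print(f"Million #6 (estimated): {estimated_sixth_days} days")
```

Each million after the second took longer than the one before it, and if the estimate holds, the sixth will continue that pattern.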
- Astronaut Christina Koch, aka User:Astro Christina, made the first direct edit from space on November 17, 2019, correcting an article about a spacewalk that she had made. Earlier edits had been composed in space but relayed through an Earth-bound editor. For example, a November 2017 edit published the first content made for Wikipedia in space, a voice recording. Editor Darenwelsh, who is an extravehicular activity flight controller at NASA, helped organize the new edit and first reported it on Reddit. Darenwelsh and his coworkers have set up 10 wikis at NASA.
Brief notes
- Amazon donates $1 million.
- .ORG registry sold to for-profit company: .ORG domain names will be registered by a new for-profit company, which is likely to cost the WMF some money. Sj reports that Ethos Capital, which has two staff, is buying the non-profit Public Interest Registry (PIR) at an estimated price of a billion dollars. The Electronic Frontier Foundation objects, as do the Wikimedia Foundation and 25 other non-profits, in an open letter to the Internet Society.
- New user-groups: The Affiliations Committee announced the approval, on September 25, of three new Wikimedia movement affiliates: the Wikimedia Stewards User Group, Wikimedians of Santali Language User Group, and Wikimedians of Saint Petersburg User Group.
- Tim Berners-Lee approves: Tim Berners-Lee's Contract for the Web received notice in The Guardian and The Verge. He also wrote an op-ed for The New York Times, saying the World Wide Web that he invented needs a re-do, although he lists Wikipedia as an example of what he's OK with.
- FRAMBAN denouement: WMF published a statement, "Community consultation on partial and temporary office actions".
- Translation pilot project: WMF is hiring up to six paid translators for each of these languages: Arabic, Chinese, French, German, Russian and Spanish.
- UNC Edit-a-thon: The University of North Carolina at Chapel Hill's Hussman School of Journalism and Media and the North Carolina Digital Heritage Center hosted a Newspapers in North Carolina Edit-a-thon on 24 October. The edit-a-thon sought to improve articles on newspapers in the U.S. state of North Carolina to give readers context for the sources they use, and to combat fake news.
- New administrators: The Signpost welcomes the English Wikipedia's newest administrators, EvergreenFir and ToBeFree (11 November 2019), Girth Summit (26 October 2019), Kees08 (14 October 2019), GermanJoe (6 October 2019), Nosebagbear (3 October 2019), and Barkeep49 (11 September 2019).
You say you want a revolution
"We all want to change the world"
External videos
- Revolution 1 (slow version), (4:15)
- The Beatles – Revolution (fast version), (3:27)
Everybody wants to change Wikipedia in some way. Our model of knowledge production and distribution depends on it. Be bold! If you see something in the encyclopedia you don't like, change it. Many people want to change the Wikipedia model and use it for purposes other than building an encyclopedia. Good luck to them!
But not all change is good. This month saw examples of people striving to systematically change the content of our encyclopedia, Wikipedians and others trying to tweak the Wikipedia model of many small content contributors and many small financial contributors, and governments trying to dictate what an online encyclopedia should look like.
- "If you go carrying pictures of Chairman Mao": How Baidu Baike has faced off against Wikipedia to build the world’s largest online Chinese encyclopaedia. The South China Morning Post, based in Hong Kong, reports on the ongoing controversy surrounding the Chinese Wikipedia and compares it to Baidu Baike, the Chinese encyclopedia with more than 16 million entries owned by the internet giant Baidu.
- Baidu Baike offers its editors prizes, experience points, and wealth points in order to incentivize text contributions. Entries are reviewed before publication to filter out "reactionary content", racial and religious provocations, and the promotion of superstitions and cults. Advertisements, porn, fraud and gambling promotion are banned.
- Some Chinese editors prefer Wikipedia because of the difference in review processes. According to the SCMP, one editor said "The operation process at Wikipedia is transparent – you can see why entries are published or deleted. At Baidu, the [review] process is in a 'black box'."
- Hong Kong editors respond by editing articles on the Hong Kong Police Force, the current protests, and Carrie Lam. Reuters has published an interactive graph that displays the editing activity to show how the Hong Kong Police Force page has changed over time.
- Previous coverage in The Signpost: Chinese Wikipedia and the battle against extradition from Hong Kong, The BBC looks at Chinese government editing, Carl Miller on Wikipedia Wars, and Observations from the mainland
- "You ain't gonna make it with anyone anyhow": For the third month in a row, the media have announced Russian President Vladimir Putin's proposed Wikipedia replacement as if it were something new. Putin says the Russian language is being attacked by "cave-dwelling Russophobes". He says "It would be better to replace it (Wikipedia) with the Great Russian Encyclopedia in electronic form. At least this would be credible information." If you don't read the stories very closely, you might think that Putin is proposing a new online encyclopedia designed to replace the Russian Wikipedia with neutral and more reliable content, strongly supported by the Russian state budget.
That would be completely correct except:
- The Great Russian Encyclopedia, which traces its predecessors back to the 1926 Great Soviet Encyclopedia, has published in print since 2004.
- The GRE has been publishing on-line since 2016.
- It will continue to be written by experts associated with the Russian Academy of Sciences and published by the RAS.
- The state funding increase is only about $10 million per year for 3 years.
- If you have a calm and conservative attitude, you might instead conclude, along with Moscow Times contributor Ilya Klishin, that the additional funding is just a way to siphon off state funds to favored individuals. Or, if you have a less trusting attitude, perhaps after reading stories in Euronews and from Reporters Without Borders, you might conclude that the attacks on Wikipedia could be related to Russia's contingency planning to separate itself from the outside world's internet.
- "You better free your mind instead": Finding Truth Online Is Hard Enough in The New York Times Magazine is a story on internet censorship in Turkey, of which a ban on Wikipedia is only a part. The entire issue of the magazine is on the future of the internet. Over the last decade Freedom House's index of Turkey's internet freedom has fallen sharply, following Russia's index level down, but it remains well above China's. Different countries experience censorship differently and this report emphasizes the quirks and contradictions of Turkey's experience.
- "You say you got a real solution": The Financial Times (paywalled) reports in Wikipedia co-founder Jimmy Wales launches Twitter and Facebook rival that WT:Social is operating as a social media site, trying to avoid "clickbait", misleading headlines, and the other flaws of Facebook. The site had about 50,000 users as of a few weeks ago, but "obviously the ambition is not 50,000 or 500,000 but 50m and 500m" says Jimmy. Users and wait-listers number about 270,000 as of November 27. The BBC podcast "Tech Tent" interviews Jimmy (starts at 1:35), who says that the site is not optimized for addiction.
- "You tell me it's the institution": National Library of Wales to Lead on Welsh Wikipedia Project in Business News Wales. The library in Aberystwyth will prepare Welsh-language material from Wikipedia for the 100 most important topics and themes in the school curriculum in Welsh history. Much of the work will be done by Jason Evans, who is the library's National Wikipedian, presumably a promotion from Wikipedian-in-residence.
- Meanwhile, folks at the Milner Library at Illinois State University have completed their own revolution by connecting Wikipedia's List of African-American writers to the library catalog records.
- Taking it a step further, ISU archivist April Anderson-Zorn and grad student Stephanie Collier document women and minority archivists on Wikipedia, including Mary Lynn Ritzenthaler, Brenda Banks, Sara Dunlap Jackson, and Kathleen D. Roe. "It’s really important that we get this information out there; we fight to make sure that all voices are heard," said Anderson-Zorn.
- "We all wanna see the plan": Larry Sanger in Introducing the Encyclosphere states that "If you want to participate in the world's largest encyclopedia, you must collaborate with a shadowy group of anonymous amateurs and paid shills ... If you're new, you won't be treated very nicely. If you don't play their strange game, you'll be summarily dismissed. Like the social media giants, Wikipedia has become an arrogant and controlling oligarchy." Sanger's blog post includes an 18-minute lecture.
- "You say you'll change the constitution": Building a More Honest Internet in the Columbia Journalism Review divides the internet into three business models: the main current global model that they call "surveillance capitalism", the Chinese and Russian model "where the unfettered capitalism of the US internet is blended with tight state oversight and control", and the "public service" model, which is exemplified by Wikipedia.
- "We all wanna change your head": How the Iranian Government Shut Off the Internet. According to Wired, Netblocks calls the impact "near total".
- "Don't you know it's gonna be all right":
- Young Ghanaians create Wikipedia pages on migration at an edit-a-thon in Accra. The articles created include information on how to properly apply for visas to Germany. Funding was supplied by the German Foreign Office.
- Is Wikipedia telling the truth? Addressing systemic bias in Wikipedia entries in The Daily of the University of Washington reports on an Editing is Activism: Edit-a-Thon about the Seattle General Strike of 1919. According to The Daily, four articles were created at the edit-a-thon, eighteen other articles were edited, and 44 references were added.
- Re-editing Wikipedia in the name of Pacific Northwest womxn: Perhaps The Daily is getting a bit carried away. Another edit-a-thon, this one on women artists, resulted in seven new articles, 17 other articles edited, and 66 added references. A participant explained "We want to address omissions in history as a form of social justice activism."
- "You ask me for a contribution/ Well, you know/ We're all doing what we can":
- Nobody has started writing about the usual end-of-the-year Wikimedia fundraising campaign yet, not even the WMF, but expect it this month. So whatever type of revolution you want, you can decide on whatever type of contribution you can make. All right? All right.
- Readers' comments are requested below in the talk section. How much "social justice activism" is acceptable on Wikipedia? How much governmental or institutional participation? How much revolution? Or should we all park our consciences at the door before editing?
In brief
- Rise of the bots: Stevens team completes first census of Wikipedia bots: Researchers at the Stevens Institute of Technology analyze Wikipedia's 1,601 bots and how they interact with human editors. Altogether bots account for about 10 percent of edits. The researchers divide bots into 9 types, including "fixers", "protectors", "connectors", and "advisors". The 1,200 fixer-bots are the most active type, but advisor- and protector-bots are especially important in interacting with human editors.
- Argentine Wikipedian-in-Residence Mauricio Genta: La Nacion (in Spanish) reports that Genta works at both the Circe library and the National Academy of History. His watchlist includes 1,200 articles. His work in digitization supports not just Wikipedia and Wikimedia Commons, but also Wikidata, Wikisource, Wikibooks, and Wikivoyage. His personal editing, which began after he attended Wikimania in 2009, focuses on transportation articles.
- Geschlechterungleichheit!? Oh Nein!: St. Galler Tagblatt writes (in German) on Wikipedia's gender inequality (Geschlechterungleichheit). Women have a hard time on Wikipedia in Switzerland as well as in other countries.
- Katherine Maher stays on message: As the video suggests, Maher always stays on message.
- Wikipedia the latest battleground in Lebanon's protests in Arab News. Grievances in Lebanon over perceived government corruption amid an economic crisis have led to a wave of protests which, in addition to triggering the resignation of Prime Minister Saad Hariri, have apparently led to some IP vandalism of the article about the Parliament of Lebanon. A user changed a section heading to "Lebanese Robbery", the type from Unicameral to "Unicameral (useless altogether)", and added a comment accusing the members of "contribut[ing] one way or another in keeping the rotten system alive". The edits were quickly reverted before an admin protected the page.
- An edit war erupts in the Himalayas: After India published a new map of the Kalapani territory, Nepali editors objected. Total edits to the article almost doubled from about 125 to 220 within two weeks. Three maps were added to the article and OpenStreetMap was linked. The size of the article quadrupled. The article is now semi-protected, but that doesn't seem to be helping.
- The Most Popular Wikipedia Pages, 2007–2019, an eight-minute video, shows how the top 12 most viewed articles changed over 12 years. The bars in the bar graphs dance up and down, as "Barack Obama" rises to the top early on, but is later overtaken by "United States" and inevitably by "Donald Trump". About half the articles listed at any time are about entertainment, mostly singers. "Lil Wayne" was popular early on but fell quickly as "The Beatles" rose; they in turn fell below "Michael Jackson", "Lady Gaga", "Eminem", and "Game of Thrones".
- Our crowning achievement: The Washington Post follows how ‘The Crown’ has returned, and royal Wikipedia pages are exploding in page views.
Odd bits
- Scorsese's The Irishman doesn't make the list: With only 136 uses of the term, The Irishman (2019 film) doesn't make it onto the List of films that most frequently use the word "fuck". The current lower cutoff is 150, but there are at least 15 uses of an obscene phrase, which may make up for any other shortcomings.
- Maren Morris Sings Jimmy Kimmel a Birthday Song on Jimmy Kimmel Live (via YouTube). The lyrics of the song were taken from the Wikipedia article on Jimmy Kimmel, with some scripted confusion at the end including lyrics from the article on Jimmy Carter. Two questions for all the copyright experts reading this: Since the lyrics are all licensed CC BY-SA 3.0, is the entire song including the music now licensed CC BY-SA 3.0? Is the video clip on YouTube also licensed CC BY-SA 3.0? Later the same evening Morris won the Country Music Awards Album of the Year for Girl.
- Socked Into the Puppet-Hole on Wikipedia. Wired columnist Noam Cohen laments the loss of Noam Cohen, then is bemused by the article's reinstatement.
- The Ministry of Wiki-Truth in Dissident Voice: A radical newsletter in the struggle for peace and social justice. C.J. Hopkins discusses edits to C.J. Hopkins. He gets bonus points for the article title.
What's making you happy this month?
There are many opportunities to discuss bad news, problems, and concerns in the Wikiverse, and I think that having candid discussions about these issues is often important. Many days I spend more time thinking about problems than about what is going well. However, I also think that acknowledging the good side and taking a moment to be appreciative can be valuable.
I encourage you to add your comments about what's making you happy this month to the talk page of this Signpost piece.
Week of 3 November 2019: Kva gjer deg glad denne veka?
Job openings
- Shared by Lydia Pintscher (WMDE): Product Manager for Wikibase
- Shared by Nemo: Biodiversity Heritage Library Program Manager for the Smithsonian Libraries
For your listening enjoyment
Images from Norway
- From the main page of Norsk Wikipedia: a house martin (Delichon urbicum)
Wikipedia for public good
Approaching a milestone
Week of 10 November 2019: ߡߎ߲߬ ߞߵߌ ߟߊ߫ ߛߍߥߊ߫ ߞߎ߲߬ߢߐ߮ ߣߌ߲߬ ߞߘߐ߫؟
New Wikimedia affiliates
Wiki Loves Monuments national winners
The following countries are among the national winners of the annual Wiki Loves Monuments competition that have been added to this page.
- Albania
- Armenia
- Australia
- Azerbaijan
- Bangladesh
- Bolivia
- Brazil
- Canada
- Croatia
- Denmark
- France
- Germany
- Greece
- India
- Ireland
- Kosovo
- Malta
- Morocco
- Nepal
- Nigeria
- Peru
- The Philippines
- Poland
- Portugal
- Romania
- Slovakia
- Slovenia
- Spain
- Sweden
- Thailand
- United Kingdom
Humor
Even those who are experienced at public communications worry about making big mistakes. This video (YouTube link), from the British political satire Yes, Prime Minister, shows what happened when the Prime Minister learned that Sir Humphrey, the head of the Home Civil Service, made an indiscreet comment that was recorded by a BBC microphone. I feel some sympathy for Humphrey because occasionally I too say something that I wish that I hadn't!
Week of 17 November 2019: Ce te face fericit săptămâna asta?
"The Wait"
This is a short film regarding wildlife photography. The film has scenes of European bison in Romania. The narration is in French, and English subtitles are available. I think that wildlife photographers for Wikimedia Commons will find this film to be relatable. https://vimeo.com/180080686
- A drawing of a European bison in the Cave of Altamira, Spain. The cave is a UNESCO World Heritage Site. In 1879, amateur archaeologist Marcelino Sanz de Sautuola was led by his daughter María to discover the cave's drawings. One study estimates that the oldest drawings were made approximately 36,000 years ago.
- Bison bonasus sparring in a nursery of the Russian Academy of Sciences in Shebalinsky District, Republic of Altai, Russia.
Week of 24 November 2019: ما الذي يجعلك تشعر بالسعادة هذا الأسبوع؟
Milestone on Arabic Wikipedia
Wikimedia Technical Conference reports
- Daily reports by User:DTankersley (WMF)
- Math report by User:physikerwelt
English Wikiquote of the day for 10 November 2019
All writers … have an obligation to our readers: it's the obligation to write true things, especially important when we are creating tales of people who do not exist in places that never were — to understand that truth is not in what happens but what it tells us about who we are. Fiction is the lie that tells the truth, after all.
— Science fiction and fantasy author Neil Gaiman of England
Recent featured media on English Wikipedia and Wikimedia Commons
- Replica of a compass rose from a chart of Portuguese cartographer Jorge de Aguiar that was made in the year 1492
- Two Lovers Beneath an Umbrella in the Snow, by Suzuki Harunobu, color woodblock print in the Ukiyo-e style, circa 1767
- The Roman Baths (Thermae) of Bath Spa, England. Buildings on the site have been created and modified several times from circa 60–70 A.D. onward.
- Hippopotamus statuette from circa 1961–1878 B.C., during the reigns of Senwosret I to Senwosret II, in Egypt. According to the Metropolitan Museum of Art's Bulletin, this statuette is a "particularly fine example of a type found... among the funerary furnishings of tombs of the Middle Kingdom" and also an exemplary piece of Egyptian faience.
Regarding translations
Skillful translations of the sentence "What's making you happy this week?" would be very much appreciated. If you see any inaccuracies in the translations within this article then please {{ping}} User:Pine in the discussion section of this page, or boldly make the correction to the text of the article. Thank you to everyone who has helped with translations so far.
Your turn
What's making you happy this month? You are welcome to write a comment on the talk page of this Signpost piece.
Two requests for arbitration cases
Two requests for arbitration committee cases were filed at Wikipedia:Arbitration/Requests/Case in November. One was withdrawn and one has been accepted.
Salted unknown article, oversight block
A new request, "Drmies salting", was initiated by Wumbolo on November 9. The request was about a mainspace article that has never been created and has been salted by an administrator in order to prevent its creation. In place of the actual title, the placeholder "XYZ" was used in the request. The userpage note left for the filing party in conjunction with the salting has been oversighted and The Signpost has no further details on the page's contents.
The case request was closed as "withdrawn" by a clerk on November 10. The same day, Wumbolo was oversight blocked indefinitely. The Signpost has no further details on the reason for the user block.
Important or imprudent? Pondering portals.
A new request, "Conduct in portal space and portal deletion discussions", was initiated by ToThAc on November 18. ToThAc described the issue as follows:
As summarized in Robert McClenon's essay on issues surrounding portals, the necessity of portals in general has been heavily debated over the course of several months. In April 2018, The Transhumanist started an RfC on deprecating portals, which was closed with a rough consensus to not delete all portals.
The complainant said that despite the prior RfC, uncivil discussion of individual portal creations and deletions has ensued, and named 20 other involved parties.
The case was accepted and opened, with arbitrator Joe Roe commenting, "This has proved to be a long-running and intractable dispute."
The queen and the princess meet the king and the joker
- This traffic report is adapted from the Top 25 Report, prepared with commentary by A lad insane (November 17 to 23) and Igordebraga (November 3 to 9, November 10 to 16, November 17 to 23).
What is this, an episode of Downton Abbey? There's certainly enough royalty to go around (November 17 to 23, 2019)
Similar to last week, Star Wars is present (#1), and another TV show makes an appearance at #2. Unlike last week, however, there are an ungodly number of royals peppering this list – in fact, another TV series, The Crown, is responsible (#3, #4, #5, #8, #9).
Rank | Article | Views | About |
---|---|---|---|
1 | The Mandalorian | 1,707,701 | Similar to another time this list was heavily dominated by one subject, a Star Wars-related subject takes top ranking; however, this time that wasn't the primary topic on the list. Disney+'s debut, accompanied by this original TV series set in the Star Wars universe, has received positive ratings. |
2 | Caitlyn Jenner | 1,570,820 | Formerly known as decathlon gold medalist and Kardashian–Jenner patriarch Bruce Jenner, Caitlyn has decided to go across The Pond and survive in the jungle on the reality show I'm a Celebrity...Get Me Out of Here! |
3 | Princess Margaret, Countess of Snowdon | 1,492,698 | The Crown has returned, and thus again there's a views spike for the daughters of George VI. Given the show has skipped from the 1940s to the 1960s, the actresses have changed to two women who have played crazy queens: Margaret is now Red Queen Helena Bonham Carter and Elizabeth changed to Queen Anne Olivia Colman. |
4 | Elizabeth II | 1,266,912 | |
5 | Aberfan disaster | 1,212,970 | Not everything The Crown brings into this list is old or dead royalty, it turns out. |
6 | Fiona Hill (presidential advisor) | 1,156,864 | Americans are divided into two factions right now: those who are eagerly hanging onto every word of the impeachment inquiry, and those who would like to end all of those responsible for the 24-hour news cycle and all the prolonged impeachment inquiries on it. The first group has propelled this article to its position here, as they rush to find out who the hell this person is anyway. |
7 | Frozen II | 1,113,692 | This list has something for the child inside all of us, for if this isn't your style there is, of course, Fred Rogers down at #23, having been portrayed by Tom Hanks (and as Weird Al said, nobody doesn't like Tom Hanks!) Here, though, Elsa and Anna return to travel on a magical, icy journey to discover happy things, because Disney likes happy things. |
8 | Princess Alice of Battenberg | 1,073,049 | More The Crown. One would think people wouldn't Wikipedia the name of the person in a TV show they're watching, for that would count as a spoiler – but that doesn't seem to be the case. In an alternate universe, this show was probably why spoiler warnings were deprecated: a lengthy RfC concluded with the consensus "no we will not put spoiler tags on an actual real-used to be alive person, and y'all can't figure out how to do that, so no more spoiler tags". |
9 | Harold Wilson | 1,072,674 | The Prime Minister during the period portrayed in the latest season of The Crown. |
10 | Prince Andrew, Duke of York | 1,008,200 | With the multitude of other royals brought here by The Crown (#18, and a third of the rest) one might think this is simply another case of TV fever, but no, Jeffrey Epstein (#13) and his entourage of criminals brought a royal down with them. Andrew's allegations that he couldn't possibly have been the person his accuser referred to – for he simply couldn't sweat, and she said he did! – were shot down by some photos (and I hate the Mirror too, don't worry) and little princey's birthday was cancelled, not to mention the whole "being kicked out of Buckingham Palace" issue. |
The Force is strong with this Report (November 10 to 16, 2019)
Even if late due to a delay with the WP:5000, the Report is actually early with a topic: Star Wars got a #1 one month before it was supposed to, with the Disney+ (#16) series The Mandalorian (#1, #11), and is also present in a new video game (#20). Aside from eight pages that remained from last week, there's politics (#7) and television (#8).
Rank | Article | Views | About |
---|---|---|---|
1 | The Mandalorian | 1,570,155 | Proving that in spite of a divisive Episode VIII and an underwhelming spin-off, Star Wars still moves the masses: a Disney+ (#16) series by Jon Favreau starring a bounty hunter managed to top our list. |
2 | Joker (2019 film) | 1,029,042 | The list of movies which have grossed $1 billion in 2019 finally has a non-Disney production with this take on the clown supervillain. Readers must also be curious about the "Future" section, given how rarely studios decide not to follow such acclaim and popularity with sequels. |
3 | Death Stranding | 803,135 | Hideo Kojima doesn't need Konami to make successful video games, as this PlayStation 4 action game set in a post-apocalyptic USA, whose cinematic values go down to having actors such as Norman Reedus, Mads Mikkelsen and Léa Seydoux, got great reviews and good sales. |
4 | Deaths in 2019 | 723,908 | I never thought I'd die alone / I laughed the loudest, who'd have known... |
5 | Henry V of England | 669,639 | Henry the Fifth at fifth place, how appropriate. The high views are due to the Netflix movie The King. |
6 | John Demjanjuk | 587,622 | Still on Netflix, documentary series The Devil Next Door tells the story of a Ukrainian-American autoworker who allegedly worked as a guard at Nazi extermination camps during World War II. |
7 | Marie Yovanovitch | 585,578 | Yovanovitch was the United States Ambassador to Ukraine. Trump's involvement with that country's government is having repercussions for him. Yovanovitch was ousted after an alleged smear campaign, and is now testifying in the impeachment inquiries against Trump. |
8 | Rick and Morty (season 4) | 547,199 | Adult Swim has debuted the newest episodes of this time-and-space-hopping cartoon starring a mad scientist and his grandson. |
9 | Doctor Sleep (2019 film) | 544,211 | Decades after The Shining, a traumatized Danny Torrance (now played by Ewan McGregor) tries to save a girl with similar powers from people who literally feed on said "Shining" kids. Doctor Sleep got good reviews for being atmospheric, well-acted and spooky (even if, in this writer's opinion, a bit long and slow), but has struggled at the box office, having barely recouped its $55 million budget worldwide. |
10 | Jeffrey Epstein | 523,050 | Jeffrey Epstein has become the Internet's newest "tree-fiddy" – wedged into every unexpected nook and cranny, the message awaits: "Epstein didn't kill himself". While the real-life events that transpired in his cell that night remain a matter of conjecture, the popular opinion is certainly clear. |
And it's hard to watch some Netflix, in the cold November rain (November 3 to 9, 2019)
The eleventh month started spearheaded by the subjects of two streaming productions, meaning Netflix is the cause of boosted views for 15th century English kings and World War II Soviets working with the Nazis. More history is found in the 30th anniversary of the Berlin Wall's fall (#14), men who inspired holidays (#24) and Google Doodles (#21), a land dispute to be settled (#12) and a battle to be documented by Bollywood (#7). Speaking of movies, #3 and #5 are the same Hollywood blockbusters from last week(s), now joined by a horror flick (#15) and an actor (#10) who found love (#9) years after a tragedy (#19). The recently deceased (#4, #11), video games (#6), books adapted by HBO (#13), MMA (#18, #20), politicians from both sides of the Pond (#16, #17), and a changed landmark (#23) close the list.
Rank | Article | Views | About |
---|---|---|---|
1 | Henry V of England | 1,803,080 | One of the royal Henrys chronicled in Shakespeare's plays, which in turn are now adapted in the Netflix production The King. |
2 | John Demjanjuk | 948,250 | Still on Netflix, the documentary series The Devil Next Door tells the story of a Ukrainian-American autoworker who allegedly worked as a guard at Nazi extermination camps during World War II. |
3 | Joker (2019 film) | 940,728 | 10 years after the Batman movie that made everyone just want to talk about the Joker broke a billion dollars worldwide, a movie just about Gotham's clown sociopath is nearing a ten-digit gross as well. Seems like everyone wants to dance with the devil in the pale moonlight. |
4 | Jeffrey Epstein | 918,694 | The possibility that the deceased pedophile financier was killed instead of having hanged himself has become an online meme. |
5 | Terminator: Dark Fate | 916,532 | In spite of being better than what you'd expect from a movie with a 63-year-old female gunslinger and a 72-year-old killer robot, Terminator: Dark Fate has not enthralled audiences so much (it opened atop the box office but now has fallen to #5, and only broke $200 million worldwide so far) and possibly won't get any follow-ups. As a fan of this series even if I had objections to some things in Dark Fate, it saddens me to see the franchise terminated. |
6 | Death Stranding | 796,900 | Hideo Kojima is back (while Konami continues to neglect his best known work) with this PlayStation 4 title whose cinematic values go down to having actors such as Norman Reedus, Mads Mikkelsen and Léa Seydoux. |
7 | Third Battle of Panipat | 796,729 | Bollywood released the first trailer for Panipat, which in December will re-enact this 1761 confrontation against an invading Afghan army. |
8 | The King (2019 film) | 791,674 | Timothée Chalamet plays our #1 in this Netflix movie. |
9 | Alexandra Grant | 771,172 | Possibly the most popular actor of the year, Keanu Reeves has reportedly been dating an artist with whom he has already written two books, making all his fans very happy that "Sad Keanu" might be a thing of the past. |
10 | Keanu Reeves | 761,249 | |
Exclusions
- These lists exclude the Wikipedia main page, non-article pages (such as redlinks), and anomalous entries (such as DDoS attacks or likely automated views). Since mobile view data became available to the Report in October 2014, we exclude articles that have almost no mobile views (5–6% or less) or almost all mobile views (94–95% or more) because they are very likely to be automated views based on our experience and research of the issue. Please feel free to discuss any removal on the Top 25 Report talk page if you wish.
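The mobile-share rule described above can be sketched as a simple filter. This is an illustration only: the function name, the record layout, and the exact cut-offs (6% and 94%, picked from the ranges quoted above) are assumptions, not the Report's actual tooling.

```python
def is_probably_automated(mobile_views, total_views, low=0.06, high=0.94):
    """Flag an article whose mobile share is suspiciously low or high.

    The 6% / 94% cut-offs are assumed values taken from the ranges
    quoted in the text; the Report's real thresholds may differ.
    """
    if total_views <= 0:
        return True  # no usable view data, so treat the entry as anomalous
    share = mobile_views / total_views
    return share <= low or share >= high


# Hypothetical weekly records: (title, mobile views, total views)
records = [
    ("Article A", 30_000, 1_000_000),   # 3% mobile
    ("Article B", 520_000, 1_000_000),  # 52% mobile
    ("Article C", 970_000, 1_000_000),  # 97% mobile
]
kept = [title for title, mobile, total in records
        if not is_probably_automated(mobile, total)]
```

In this made-up sample only the article with a balanced mobile share survives the filter; a human reviewer would still need to look at borderline cases.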
Reference things, sister things, stranger things
Previews of references
A new beta feature has been deployed which allows you to preview references by hovering over the inline footnote. Reference Previews display the reference and its type in a popup with a link to navigate to the reference. Similar functionality has been available through gadgets on several wikis, such as Navigation popups and Reference Tooltips.
Sudden script stoppages, semicolons, and other strangeness
Several userscripts stopped working suddenly on October 21, as reported on the technical Village Pump. This was due to code being deprecated and removed over a shorter timeframe than usual and without much forewarning. Following this incident, XFDcloser was made into a gadget.
The Contributions special page had its limit of 5,000 results per page decreased to just 500. The lower limit was implemented to prevent potential denial of service attacks due to the impact of certain queries with long date ranges.
Extraneous semicolons started appearing on the Watchlist and other listings on November 8. The issue arose from improvements to the mobile watchlist that unexpectedly impacted the desktop view.
Little sisters' wishes
The Community Wishlist Survey 2020 is open for voting until December 2nd. The WMF's Community Tech team, as per previous years, will work on the top wishes decided on by the community. Unlike previous years, this survey is exclusively for the smaller sister projects, with wishes for Wikipedia, Commons, and Wikidata specifically excluded. There are a total of 72 wishes in the survey, mostly for Wikisource, Wiktionary, and Wikiversity.
In brief
- The 2019 Wikimedia Technical Conference for developer productivity was held on November 12–15 in Atlanta, Georgia. Daily summaries with links to session notes and further reading are available from the Wikitech-l mailing list: day 1, day 2, day 3, day 4.
- Popular user scripts: Wikipedia:User scripts/Most imported scripts now lists user scripts by the number of times they have been imported. Many of the top scripts are broken, obsolete, deprecated, or superseded by gadgets.
- Gadget proposal: At the village pump, there's a proposal to make an updated version of the prosesize user script into a gadget.
Bot tasks
Recently approved tasks
- DannyS712 bot III (task · contribs) – Patrol redirects where periods are the only difference between the redirect title and the target
- YiFeiBot (task · contribs) – Wikipedia:WikiProject Guild of Copy Editors/Requests archival
- BHGbot (task · contribs) – When a portal has been deleted at MFD, remove or replace links to it which are generated by one of 4 templates
- MilHistBot (task · contribs) – Assess articles that need a full B-Class checklist.
- WOSlinkerBot (task · contribs) – Fix lint errors related to {{legend2}}
- HasteurBot (task · contribs) – Nominate for Speedy Deletion articles that are valid for CSD:G13
- DannyS712 bot (task list · contribs)
- task 59 – Update various Wikipedia:Database reports
- task 62 – Tag talk pages that meet specific requirements as being within the `Maritime-task-force` and/or `Aviation-task-force` of wikiproject military history
- task 47 – Help implement TfD closes
- PearBOT (task list · contribs)
- Pathbot (task · contribs) – Adding navigation boxes in the bottom of articles.
- OAbot (task · contribs) – Add and maintain supported identifiers to citation templates (mostly {{cite journal}})
Current requests for approval
- Open:
- MajavahBot (task list · contribs) – Patrol WP:EFFPR: adding messages, fixing typos or formatting, and archiving old reports
- DemonDays64 Bot (task list · contribs) – Replace HTTP with HTTPS in links to HTTPS-secured sites
- Monkbot (task · contribs) – Replaces the various {{xx icon}} templates (and their redirects) with {{in lang}}
- SteveBot (task · contribs) – Update categories based on results of requested moves
- In trial:
- Creffbot (task list · contribs) – Maintain CAT:UAA by removing the template from users who have had the template applied for a long time and users who are blocked
- Xinbenlv bot (task list · contribs) – Notify (on Talk page) cross language inconsistency for birthdays
- SportsStatsBot (task · contribs) – Automatically update football (soccer) players' career statistics
- Trial complete:
Latest tech news
Latest tech news from the Wikimedia technical community: 2019 #45, #46, #47, & #48. Please tell other users about these changes. Not all changes will affect you. Translations are available on Meta.
Recent changes
- MediaWiki:ipb-default-expiry can set the default length to block a user for your wiki. You will be able to use MediaWiki:ipb-default-expiry-ip to set a different default block length for IP editors. [1]
- MediaWiki2LaTeX can create a PDF document containing pages from a Wikimedia wiki. Previously limited to around 800 pages, it now supports approximately 5,000 pages.
- The mobile beta mode will be disabled to reduce maintenance. The developers will focus on the desktop improvements project. You can turn on advanced mobile contributions mode if you want to see the page categories. To restore the ability to return to the top of an article via a link, you can use a gadget or user script. [2]
- Parsoid is software we use for the visual editor, content translation, Flow and the Android app. This has been rewritten. It will come to the wikis gradually over the next two weeks. It has been tested, but there could be some diffs or previews that don't look right. If you see any you can report them. [3]
Future changes
- Wikimedia will take part in Google Code-in. This is for young students who want to help with open source software. See MediaWiki's Google Code-in page for more details. Experienced technical Wikimedians can mentor students.
- Similar to the desktop view, in the future tabs will be used in the mobile view to switch between an article and its talk page. [4]
Meetings
- You can join the technical advice meeting on IRC. During the meeting, volunteer developers can ask for advice. The meeting takes place every Wednesday from 4:00–5:00 p.m. UTC. See MediaWiki's page on the Technical Advice meeting for instructions on how to participate.
Winter and holidays
Revisiting last December's "Sun and Moon, water and stone" solstice theme, we present some interesting and unusual winter and holiday images. We hope you enjoy them as much as we did.
Bot census; discussions differ on Spanish and English Wikipedia; how nature's seasons affect pageviews
A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.
"First census of Wikipedia bots"
- Reviewed by Indy beetle and Tilman Bayer
A paper titled "The Roles Bots Play in Wikipedia", published in Proceedings of the ACM on Human-Computer Interaction by five researchers from the Stevens Institute of Technology,[1] was presented at this month's CSCW conference. Bots are a core component of English Wikipedia, and account for approximately 10 percent of all edits as of 2019. After retrieving all 1,601 registered bots (as of 28 February 2019), the researchers used a procedure involving machine learning to organise them into a taxonomy with nine key "roles":
- Generator, e.g. Rambot
- Fixer, doing tasks such as correcting typos or adjusting links, e.g. Xqbot
- Connector, connecting English Wikipedia to other wikis or external sites, e.g. Citation bot
- Tagger, adding and modifying templates and categories
- Clerk, "updating statistical information, documenting user status, updating maintenance pages, and delivering article alert to Wikiprojects", e.g. WP 1.0 bot
- Archiver
- Protector, e.g. COIbot, XLinkBot, and ClueBot NG
- Advisor
- Notifier
Some bots act in several roles (e.g. AnomieBOT as Tagger, Clerk and Archiver).
The last part of the paper concerns the impact of bots on new editors that they interact with. Extending previous research that had found increased retention for newbies who were invited to the Teahouse support space by HostBot, an "Advisor" bot, the researchers show that other Advisor bots have a significant positive effect as well (although in the example cited, SuggestBot, they may have wanted to mention as a confounding factor that users need to opt into receiving its messages). Likewise confirming previous research, messages from ClueBot NG were found to have a negative effect, but this wasn't the case for other "Protector" bots: "The newcomers seem to not care about the bot signing their comments (SineBot) and are even positively influenced by the bot reverting their added links that violate Wikipedia’s copyright policy (XLinkBot)."
A press release, titled "Rise of the bots: Team completes first census of Wikipedia bots", quoted one of the authors as saying "People don't mind being criticized by bots, as long as they're polite about it. Wikipedia's transparency and feedback mechanisms help people to accept bots as legitimate members of the community."
The authors note the relevance of Wikidata to their study, where the proportion of bot edits "has reached 88%" (citing a 2014 paper), and find that the move of interlanguage link information to Wikidata led to a decrease in "Connector" bot activity on Wikipedia. At last year's CSCW, a paper titled "Bot Detection in Wikidata Using Behavioral and Other Informal Cues"[2] had presented a machine learning approach for identifying undeclared bot edits, showing that "in some cases, unflagged bot activities can significantly misrepresent human behavior in analyses". In the present study about Wikipedia, it would have been interesting to read whether the authors see any limitations in the data source they used (Category:All Wikipedia bots).
Seasonality in pageviews reflects plants blooming and birds migrating
- Reviewed by Tilman Bayer
A paper in PLoS Biology[3] uses Wikipedia pageview data for "the first broad exploration of seasonal patterns of interest in nature across many species and cultures". Specifically, the researchers looked at the traffic for articles about 31,751 different species across 245 Wikipedia language editions. They found "that seasonality plays an important role in how and when people interact with plants and animals online. ... Pageview seasonality varies across taxonomic clades in ways that reflect observable patterns in phenology, with groups such as insects and flowering plants having higher seasonality than mammals. Differences between Wikipedia language editions are significant; pages in languages spoken at higher latitudes exhibit greater seasonality overall, and species seldom show the same pattern across multiple language editions." Seasonality was often found to "clearly correspond with phenological patterns (e.g., bird migration or breeding...)", but in other cases also to human-made events such as annual holidays. For example, traffic for the English Wikipedia's article on the wild turkey (Meleagris gallopavo) spiked during Thanksgiving in the US, and saw a softer peak during "the spring hunting season for wild turkey in many US states."
Overall, articles about plants and animals exhibited seasonality much more often than articles about other topics. (Concretely, 20.2% of the species articles in the dataset were found to have seasonal traffic, compared to 6.51% in a random selection of nonspecies articles. One quarter of species had a seasonal article in at least one language. Technically, seasonality was determined via a method that involved, among other steps, fitting the pageviews time series to a sinusoidal model with one or two annual peaks, using a manually defined threshold.)
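To make the sinusoidal-model step more concrete, here is a minimal sketch of fitting a single annual sine wave to daily pageviews and applying a hand-picked threshold. It is not the authors' pipeline: it handles only the one-peak case, and the function names, the 25% relative-amplitude threshold, and the requirement of whole years of daily data are all assumptions made for illustration.

```python
import math


def annual_sinusoid_fit(daily_views, period=365):
    """Least-squares fit of y = mean + b*cos(wt) + c*sin(wt), w = 2*pi/period.

    Assumes evenly spaced daily samples covering a whole number of periods,
    so the cosine and sine terms are orthogonal and the coefficients reduce
    to simple averages. Returns (mean, amplitude, peak_day).
    """
    n = len(daily_views)
    if n == 0 or n % period != 0:
        raise ValueError("need a whole number of periods of daily data")
    w = 2 * math.pi / period
    mean = sum(daily_views) / n
    b = 2 / n * sum(y * math.cos(w * t) for t, y in enumerate(daily_views))
    c = 2 / n * sum(y * math.sin(w * t) for t, y in enumerate(daily_views))
    amplitude = math.hypot(b, c)
    peak_day = (math.atan2(c, b) / w) % period  # day where the fitted wave peaks
    return mean, amplitude, peak_day


def is_seasonal(daily_views, threshold=0.25):
    """Call a series seasonal if relative amplitude exceeds an assumed threshold."""
    mean, amplitude, _ = annual_sinusoid_fit(daily_views)
    return mean > 0 and amplitude / mean > threshold


# Synthetic two-year series: mean 1000 views, swing of 400, peak on day 120.
series = [1000 + 400 * math.cos(2 * math.pi * (t - 120) / 365) for t in range(730)]
mean, amplitude, peak_day = annual_sinusoid_fit(series)
```

On the synthetic series the fit recovers the mean, swing and peak day it was built from, while a flat series falls below the threshold. Real pageview data is noisier and would need the two-peak case as well.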
See also earlier coverage of a related paper involving some of the same authors: "Using Wikipedia page views to explore the cultural importance of global reptiles"
Editor Interactions in Spanish and English Wikipedia
- Reviewed by Isaac Johnson
"How Does Editor Interaction Help Build the Spanish Wikipedia?" by Taryn Bipat, Diana Victoria Davidson, Melissa Guadarrama, Nancy Li, Ryder Black, David W. McDonald, and Mark Zachry of University of Washington, published in the 2019 CSCW Companion, examines talk page discussions in Spanish Wikipedia with a specific eye to how they might be different from the types of interactions in English Wikipedia.[4] It replicates work from ACM GROUP 2007 that had developed a classification scheme for how editors use policy to discuss article changes.[supp 1]
This is a short paper so it does not have the depth of work you would expect in a full-length conference paper, but the authors select 38 talk pages from Spanish Wikipedia (presumably using the methods from the original work, which focused specifically on talk page conversations that involved high levels of conversation over the course of a month) and code them based on how often policies are linked to and in what context the policies are being linked to. The contextual codes that are applied are: "article scope", "legitimacy of source", "prior consensus", "power of interpretation", "threat of sanction", "practice on other pages", and "legitimacy of contributor". They find that "power of interpretation" and "article scope" are the most-used strategies, followed by "legitimacy of source". They also found a number of examples of editors linking to English Wikipedia pages.
While I would love to see a more robust analysis comparing English and Spanish talk pages that were sampled with the same strategy and from the same time periods, this work is an example of much-needed analyses of how the frameworks and models that are designed for one language community do or do not apply to other language communities. It would be fascinating to further understand the degree to which editors who are active across multiple languages adapt their discussion strategies to the local community versus apply similar strategies across all communities.
Too many editors spoil the broth - at least for global warming
- Reviewed by Tilman Bayer
In this article,[5] three researchers from China present "a system dynamic model of Wikipedia based on the co-evolution theory, and [investigate] the interrelationships among topic popularity, group size, collaborative conflict, coordination mechanism, and information quality by using the vector error correction model (VECM)."
These five factors ("PSCCQ") are each represented by a monthly time series:
- number of searches in Google Trends (for the topic of global warming), indicating popularity
- number of unique editors contributing to the article ("group size")
- number of rollbacks in the article, as a measure of conflict
- monthly accumulated number of discussions recorded in the article's talk page, quantifying coordination effort
- number of edits to the article, which "provides a good indicator of a 'high level of quality' for Wikipedia articles"
In the paper, they are analyzed for the English Wikipedia's article on global warming, for the timespan of February 2004 to November 2015. First, the researchers apply Granger causality tests to identify which of the five variables tend to predict which, resulting in the depicted graph. E.g. popularity is predicted by coordination (number of talk page discussions, as the only factor in this case), indicating perhaps that Wikipedia editors tend to debate new information about global warming before the general public takes it as an occasion to look up the topic on Google. Furthermore, the authors calculate the impulse response functions for each of the 20 possible pairs. In the above example, this indicates how the popularity measure tends to "react" to a given increase in coordination. The application of a third technique, forecast error variance decomposition, further corroborates the results about how the five variables relate to each other.
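The figure of 20 pairs follows from treating each pair as directional (one variable as the impulse, another as the response). A quick check, using shorthand labels for the five series that are chosen here for illustration:

```python
from itertools import permutations

# The five monthly series ("PSCCQ") described above; labels are shorthand.
factors = ["popularity", "group size", "conflict", "coordination", "quality"]

# Tests of the form "does X help predict Y?" are directional,
# so (X, Y) and (Y, X) count as different pairs.
ordered_pairs = list(permutations(factors, 2))
print(len(ordered_pairs))  # 5 * 4 = 20
```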
The study presents two quite far-reaching takeaways from the relations it identified between the five factors:
- "the critical importance of coordination mechanism [i.e. talk page discussions] in effectively harnessing the 'wisdom of the crowd'"
- "too many contributors involved in a particular project may be detrimental to group performance. Wikipedia managers should not necessarily pursue a more-is-better strategy towards the number of contributors."
An obvious limitation of this research, only somewhat coyly mentioned in the paper, is its restriction to a single article (and only one Wikipedia language version). While an effort is made to justify the choice of global warming as a high-traffic page with a substantial number of controversies, it remains unclear how much the takeaways can be generalized.
Conferences and events
See the page of the monthly Wikimedia Research Showcase for videos and slides of past presentations.
Other recent publications
Other recent publications that could not be covered in time for this issue include the items listed below. Contributions, whether reviewing or summarizing newly published research, are always welcome.
- Compiled by Tilman Bayer and Miriam Redi
"Does Sleep Deprivation Cause Online Incivility? Evidence from a Natural Experiment"
This paper[6] found that on English Wikipedia talk pages, about 22% more uncivil messages originate from impacted regions on the Mondays following the shift to daylight saving time.
"Smaller, more tightly-knit" WikiProjects may be more efficient
From the abstract:[7]
"We analyze the relationship between the structural properties of WikiProject coeditor networks and the performance and efficiency of those projects. We confirm the existence of an overall performance-efficiency trade-off, while observing that some projects are higher than others in both performance and efficiency, suggesting the existence factors correlating positively with both. [...] Our results suggest possible benefits to decentralized collaborations made of smaller, more tightly-knit teams, and that these benefits may be modulated by the particular learning strategies in use."
"Web Traffic Prediction of Wikipedia Pages"
From the abstract:[8]
"... we use [the] existing Web Traffic Time Series Forecasting dataset by Google to predict future traffic of Wikipedia articles. [...] we built a time-series model that utilizes RNN seq2seq mode [sic]. We then investigate the use of symmetric mean absolute percentage error (SMAPE) for measuring the overall performance and accuracy of the developed model. Finally, we compare the outcome of our developed model to existing ones to determine the effectiveness of our proposed method in predicting future traffic of Wikipedia articles."
"Anomaly Detection in the Dynamics of Web and Social Networks Using Associative Memory"
From the abstract:[9]
"we propose a new, fast and scalable method for anomaly detection in large time-evolving graphs. It may be a static graph with dynamic node attributes (e.g. time-series), or a graph evolving in time, such as a temporal network. We define an anomaly as a localized increase in temporal activity in a cluster of nodes. [...] To demonstrate [our approach's] efficiency, we apply it to two datasets: Enron Email dataset and Wikipedia page views. We show that the anomalous spikes are triggered by the real-world events that impact the network dynamics. Besides, the structure of the clusters and the analysis of the time evolution associated with the detected events reveals interesting facts on how humans interact, exchange and search for information ..."
"Operationalizing Conflict and Cooperation between Automated Software Agents in Wikipedia: A Replication and Expansion of 'Even Good Bots Fight'"
This paper from CSCW 2017[10] "replicates, extends, and refutes conclusions" of a paper by Yasseri et al. that had received wide and prolonged media attention for its claims that Wikipedia bots are fighting each other (cf. previous review: "Wikipedia bot wars capture the imagination of the popular press - but are they real?").
"The digital knowledge economy index: mapping content production"
From the abstract:[11]
"We propose the construction of a Digital Knowledge Economy Index, quantified by way of measuring content creation and participation through digital platforms, namely the code sharing platform GitHub, the crowdsourced encyclopaedia Wikipedia, and Internet domain registrations and estimating a fifth sub-index for the World Bank Knowledge Economy Index for [the] year 2012."
Linking 20 GB of data from Wikidata with a biodiversity database in 10 minutes
From the abstract:[12]
"This paper will discuss a technical solution [...] for faster linking across databases with a use case linking Wikidata and the Global Biotic Interactions database (GloBI). The GUODA infrastructure is a 12-node, high performance computing cluster made up of about 192 threads with 12 TB of storage and 288 GB memory. Using GUODA, 20 GB of compressed JSON from Wikidata was processed and linked to GloBI in about 10–11 min. Instead of comparing name strings or relying on a single identifier, Wikidata and GloBI were linked by comparing graphs of biodiversity identifiers external to each system. This method resulted in adding 119,957 Wikidata links in GloBI..."
"Inspiration, Captivation, and Misdirection: Emergent Properties in Networks of Online Navigation"
From the abstract:[13]
"We study aggregated clickstream data for articles on the English Wikipedia in the form of a weighted, directed navigational network. We introduce two parameters that describe how articles act to source and spread traffic through the network, based on their in/out strength and entropy. From these, we construct a navigational phase space where different article types occupy different, distinct regions, indicating how the structure of information online has differential effects on patterns of navigation. Finally, we go on to suggest applications for this analysis in identifying and correcting deficiencies in the Wikipedia page network that may also be adapted to more general information networks."
"Different Topic, Different Traffic: How Search and Navigation Interplay on Wikipedia"
This paper[14] aims to understand two paradigms of information seeking in Wikipedia: search by formulating a query, and navigation by following hyperlinks.
References
- ^ Zheng, Lei (Nico); Albano, Christopher M.; Vora, Neev M.; Mai, Feng; Nickerson, Jeffrey V. (November 2019). "The Roles Bots Play in Wikipedia". Proc. ACM Hum.-Comput. Interact. 3 (CSCW): 215:1–215:20. doi:10.1145/3359317. ISSN 2573-0142. S2CID 207957018. Author's copy
- ^ Hall, Andrew; Terveen, Loren; Halfaker, Aaron (November 2018). "Bot Detection in Wikidata Using Behavioral and Other Informal Cues". Proc. ACM Hum.-Comput. Interact. 2 (CSCW): 1–18. doi:10.1145/3274333. ISSN 2573-0142. S2CID 53244921. Author's copy
- ^ Mittermeier, John C.; Roll, Uri; Matthews, Thomas J.; Grenyer, Richard (2019-03-05). "A season for all things: Phenological imprints in Wikipedia usage and their relevance to conservation". PLOS Biology. 17 (3): e3000146. doi:10.1371/journal.pbio.3000146. ISSN 1545-7885. PMID 30835729.
- ^ Bipat, Taryn; Davidson, Diana Victoria; Guadarrama, Melissa; Li, Nancy; Black, Ryder; McDonald, David W.; Zachry, Mark (2019). "How does Editor Interaction Help Build the Spanish Wikipedia?". CSCW '19 Companion: 156–160. doi:10.1145/3311957.3359485. ISBN 9781450366922. S2CID 207960000. Retrieved 27 November 2019.
- ^ Liu, Feng-Jun; Qiu, Jiang-Nan; Zhao, Na (2017). "Modeling Dynamics of Wikipedia: An Empirical Analysis Using a Vector Error Correction Model". ITM Web of Conferences. 12: 03019. doi:10.1051/itmconf/20171203019. ISSN 2271-2097. S2CID 55080403.
- ^ Mai, Feng; Chen, Zihan; Lindberg, Aron (2019-11-11). "Does Sleep Deprivation Cause Online Incivility? Evidence from a Natural Experiment". ICIS 2019 Proceedings.
- ^ Platt, Edward L.; Romero, Daniel M. (2018-06-15). "Network Structure, Efficiency, and Performance in WikiProjects". Twelfth International AAAI Conference on Web and Social Media.
- ^ Petluri, N.; Al-Masri, E. (December 2018). "Web Traffic Prediction of Wikipedia Pages". 2018 IEEE International Conference on Big Data (Big Data). pp. 5427–5429. doi:10.1109/BigData.2018.8622207.
- ^ Miz, Volodymyr; Ricaud, Benjamin; Benzi, Kirell; Vandergheynst, Pierre (2019). "Anomaly Detection in the Dynamics of Web and Social Networks Using Associative Memory". The World Wide Web Conference. WWW '19. New York, NY, USA: ACM. pp. 1290–1299. doi:10.1145/3308558.3313541. ISBN 9781450366748. Preprint: Miz, Volodymyr; Ricaud, Benjamin; Benzi, Kirell; Vandergheynst, Pierre (2019-01-22). "Anomaly detection in the dynamics of web and social networks". arXiv:1901.09688 [cs.SI].
- ^ Geiger, R. Stuart; Halfaker, Aaron (2017-12-06). "Operationalizing Conflict and Cooperation between Automated Software Agents in Wikipedia: A Replication and Expansion of 'Even Good Bots Fight'". Proceedings of the ACM on Human-Computer Interaction. 1 (CSCW): 49. doi:10.1145/3134684. S2CID 21628978. Author's copy
- ^ Ojanpera, S. M. O.; Graham, M.; Zook, M. (2018). "The digital knowledge economy index: mapping content production". Journal of Development Studies. ISSN 0022-0388. Retrieved 2019-01-24.
- ^ Thessen, Anne E.; Poelen, Jorrit H.; Collins, Matthew; Hammock, Jen (2018-09-17). "20 GB in 10 minutes: a case for linking major biodiversity databases using an open socio-technical infrastructure and a pragmatic, cross-institutional collaboration". PeerJ Computer Science. 4: e164. doi:10.7717/peerj-cs.164. ISSN 2376-5992. PMC 7924439. PMID 33816817.
- ^ Gildersleve, Patrick; Yasseri, Taha (2018). "Inspiration, Captivation, and Misdirection: Emergent Properties in Networks of Online Navigation". Complex Networks IX. Springer Proceedings in Complexity. Springer International Publishing. pp. 271–282. doi:10.1007/978-3-319-73198-8_23. ISBN 9783319731988. (Wikidata item) Preprint: Gildersleve, Patrick; Yasseri, Taha (2017-10-09). "Inspiration, Captivation, and Misdirection: Emergent Properties in Networks of Online Navigation". Complex Networks IX. Springer Proceedings in Complexity. pp. 271–282. arXiv:1710.03326. doi:10.1007/978-3-319-73198-8_23. ISBN 978-3-319-73197-1. S2CID 28709924.
- ^ Dimitrov, Dimitar; Lemmerich, Florian; Flöck, Fabian; Strohmaier, Markus (2019-06-25). "Different Topic, Different Traffic: How Search and Navigation Interplay on Wikipedia". The Journal of Web Science. 1.
- Supplementary references and notes:
- ^ Kriplean, Travis; Beschastnikh, Ivan; McDonald, David W.; Golder, Scott A. (2007). "Community, consensus, coercion, control: CS*W or how policy mediates mass participation". Group '07: 167–176. doi:10.1145/1316624.1316648. ISBN 9781595938459. S2CID 14491248. Retrieved 27 November 2019.
Adminitis
- This month's essay is WP:ADMINITIS, an essay started in 2006. It has over 80 contributors; those with four or more edits include Kim Bruning, PhilKnight, 71.174.240.210, Pine and Physchim62. The essay provides some humor but also reminds the reader of the serious subject of occupational burnout.– P
Adminitis is a state of mind in which some Wikipedians find themselves at times. Though generally confined to administrators, the condition has been observed in some non-administrators. Although the exact causes are unknown, there is thought to be some correlation towards extensive and prolonged anti-vandalism activity. The mystery is that that form of disease is not only found in humans, but in all the creatures of the planet (even to things created from humans, the most common case being cars and refrigerators, although the symptoms found on items vary significantly from the ones that appear in humans) (except plants (yet)), and the way these things are showing the symptoms of adminitis is yet unknown. Adminitis is being studied by very many scientists (especially doctors) but the results from these studies only confuse the situation. It has been ranked the top most dangerous disease in the entire Wikiverse by the Community Health Initiative.
It is generally advised that when you see anything suffering from adminitis, human being or anything else, call the nearest hospital (for humans), the nearest service center (for items) or the local building inspector (for buildings). If they do not respond, immediately call the local police department to report a public health hazard, because the entity with adminitis may harm or infect others.
Symptoms
- Strongly believes that all users are equal but admins are more equal than others
- Assumes bad faith frequently
- TLAitis (see WP:WOTTA)
- Stops editing articles, but continuously pontificates about "writing the encyclopedia"
- Impatience
- Exhibits immediatism
- Hangs out in project namespace
- Has no time for featured articles
- Frequently refreshes pages to catch the latest change that (of course) needs to be reverted
- Believes that they are always right (sorry, EVula...)
- Believes anyone who uses the word 'vote' for 'express a view in a straw poll' should be banned for life
- Originally hailed as the nicest person on the wiki, now reviled as the most hated troll
- Prevents pages from being edited in encyclopedic fashion for policy reasons
- General BOFHness
- Strict adherence to "Wikipedia policy" while not using common sense or conversely,
- Strict adherence to ignore all rules, as a strict rule... while not using common sense
- Self-denial. "This page cannot possibly apply to me"
- Humor breakdown. "This page is not funny"
- Sudden spike in use of specific admin tools, or use of admin tools in areas where they haven't been using them before
- Requires less and less evidence to be convinced of sock puppetry until confidently asserting that all new users are the same person
- Spends hours writing sarcastic pages about admin behavior
- Believes they are the only real claimant to the "Defender of the Wiki" barnstar
- Exhibits signs of MPOV:
- Thinks people are attacking them when they're only trying to be nice
- Has a habit of removing criticism while pretending it's a personal attack
- Indulges in biting delicious newcomers
- Revokes talk page access in response to any and all unblock requests
- They try to remove this essay or any such essay or try to vandalize it.
Relation with Jimbo Wales
- Chides people by asking them how Jimbo Wales would react if he sees (insert your favorite example) à la asking how your mother would feel if she sees you doing (insert your favorite example).
- However, argues that Jimbo is Big Brother whenever Jimbo's opinions go against them, but couches it in language such as "benevolent dictator".
Diagnosis
It is a near-universal truth that sufferers from this illness will reject any diagnosis of the condition by an outside party. With that in mind, it's important for those who have received this diagnosis to conduct a self-test. If more than three of the following apply to you, you may be suffering from this illness:
- You frequently feel that pages are broken and must be deleted (protection is for the weak!).
- "Everyone is a vandal or a troll, and must be blocked" is a recurrent thought.
- "Blocking people is a punishment! (not a tool to get people to cool down and edit the wiki)" is a mantra, not anti-wiki.
- If you nominate this page for MfD even though it's only 2 minutes since the last revision, you may possibly be suffering from adminitis.
- Corollary: If you are speedy, you are certainly suffering from adminitis.
- If you accuse the creators of this article of making a WP:POINT, especially if you forget that making a point does not imply disruption (you need to disrupt to make a point, not make a point to disrupt! :-P), you are probably infected.
- And if you buy the above line of thought, you are definitely infected.
Pathophysiology
The infectious nature of this illness is unknown at this time. More study needs to be conducted in order to identify transmission mechanisms. No effective containment mechanisms have been identified. So far anti-vandal bots have proved immune to this condition, although as yet there is no convincing explanation concerning this anomaly.
Observational studies have noted that sufferers will seek the counsel of their Wikipedia friends, but end up infecting them in the process. Other studies have noted the evolution of a Wikipedia editor as potentially having a causal role in this illness.
Treatment
Treatment varies from case to case. As of 2024, no consensus exists on the best methods for recovery. Some methods that have been used:
- Write some articles. That stuff on the pages with a white background. (You may remember it if you think back to when you started on Wikipedia.)
- Taking a wikibreak (short or long depending on severity of infection)
- Submitting an RfC with yourself as the party. Note: Can lead to worsening of the infection, if all your friends come out to say what a great person you are. Do NOT tell your friends about the RfC.
- Ask Bishzilla for a second opinion.
- Self de-adminship.
- Spend time editing on wikis where you are not an admin.
Prognosis and wikimortality
Mortality rates from this illness have not been clearly supported by research. Frequently, what appears to be a wikideath becomes a wikiresurrection in the form of a user who is considerably more circumspect, and often more detached from processes. Some resurrectees have exhibited shortness of temper, but with considerably lower flameout levels. Reinfection is rare, but if it occurs is almost always unrecoverable.
If the initial condition is not fatal, it may take months for a patient to recover, even if under the care and treatment of WP:ARBCOM.
See also
- User:Daniel Quinlan/gaming: If you are doing this, you have adminitis
- How to win an argument (from Meta): the internationally accepted case description
- User:Mindspillage/admin: If you are doing this, you are fine, you do NOT have adminitis
- Siege mentality Same concept, different context
- Wikipedia:Don't edit for power
- Wikipedia:Disruptive sanctions
- Wikipedia:Misuse of administrator privileges
- Occupational burnout
WikiProject Spam, revisited
- MER-C was interviewed by Mabeenot for The Signpost's WikiProject report originally published July 18, 2011. We invited them to revisit the report and comment on any changes that have happened since 2011. In 2014 MER-C was given the mop in a unanimous RfA. We will publish MER-C's reactions, followed by the original report. –B
Well, this interview aged quickly. So what has changed? What does spam look like nowadays on Wikipedia?
Firstly, I don't know if linkspam in all its forms has increased or not since then. It is no longer economical for me to spend time pursuing it.
I spend my time dealing with undisclosed paid editing instead. UPE is an imprecise term. A better one is covert advertising – the insertion of advertisements that very closely mimic the format of legitimate encyclopedic articles written by volunteers. It is irrelevant whether disclosure is made per the Terms of Use because there is no indication whatsoever to the casual reader that editors have been paid for in both cases. A reader would need to check all of the page history, the talk page and the user pages of all significant contributors to the article in order to determine whether content is paid for. The disclosure requirement is therefore completely pointless for the casual reader.
The most obvious form of UPE involves the creation of articles that would not otherwise warrant inclusion. Long term contributors may remember when Wikipedia:Conflict of interest was titled Wikipedia:Vanity page. This is exactly the functionality these "articles" serve. Ghostwritten vanity pages are designed explicitly to show up on the first item and the sidebar of a Google search, but are difficult for Wikipedians to find and, if found, to evaluate the notability of their subject. Spam is less about Viagra or Cialis, and more about early-stage startups, businesspeople, motivational speakers, cryptocurrencies and so forth.
There are numerous companies that offer ghostwritten vanity pages for a small amount of money, typically a few hundred dollars. These companies employ freelancers in English speaking Third World countries who have very few opportunities for legitimate employment. In fact, similar dishonest activities such as running a fake news website or writing for an essay mill turn out to be quite lucrative, in purchasing power parity terms, for the freelancers concerned.[1][2]
The level of abuse is systematic, pervasive, and of increasing sophistication. The worst spammers have taken on characteristics of advanced persistent threats, including the use of compromised computers, VPNs and cloud computing infrastructure to post spam. There are no effective admin tools. Two new page patrollers, who screen newly created articles for notability and other problems, have been blocked for corruptly reviewing spam last week (Meeanaya and Ceethekreator). It is only a matter of time before paid editors systematically infiltrate the admin corps.
Much of the increase in spamming is a consequence of Wikipedia's own success. However, a large portion of the blame lies squarely with the Wikimedia Foundation. The WMF places significant emphasis in materials targeted at donors on crude metrics of content quantity and community size simply because that is what the WMF thinks donors want to hear.[3] The WMF therefore faces incentives very similar to Facebook and Google. Social media sites tolerate a high level of bots, Russian trolls and spammers because fake accounts pad their key metrics of monthly active users and ad impressions, giving the illusion of growth and making them look good in the eyes of their customers (advertisers) and investors. Similar emphasis is put by the WMF (and Facebook) on outreach efforts in the poor countries that are the source of much of the spam, despite multiple past high-profile failures, again because the WMF thinks donors want to see desperate, impoverished people in sub-Saharan Africa being helped.[4][5] A few extra vanity pages and sockpuppets certainly help the WMF look good in their pitch to donors.
The WMF does not sufficiently care about our admin tools being fit for purpose.[6] Like Facebook, YouTube and Google before recent scandals, investments in content moderation are seen as purely a cost[7][8] while "initiatives" that provide feel-good anecdotes for donors or increase donor-targeted metrics and hence increase donations are heavily prioritized. The WMF deserves nothing but utter condemnation and scorn for the complete lack of maintenance, let alone investment, in the code underlying the administrator toolset. A seemingly simple task such as adding a checkbox to the delete form that deletes the associated talk page requires nothing less than a fundamental rewrite of the relevant code.
The fight against spam is nothing short of an existential battle against the degeneration of this encyclopedia into a large set of vanity pages about attention-seeking subjects. And we're losing.
- ^ "Meeting Kosovo's clickbait merchants". BBC News. 10 November 2018. Retrieved 31 May 2019.
- ^ "The Kenyan ghost writers doing 'lazy' Western students' work". BBC News. 22 October 2019. Retrieved 23 November 2019.
- ^ "Wikimedia Foundation 2017-18 Annual Report". Wikimedia Foundation. Retrieved 23 November 2019.
- ^ Wikipedia:India Education Program
- ^ "Angola's Wikipedia Pirates Are Exposing the Problems With Digital Colonialism". Vice News. 23 March 2016. Retrieved 6 June 2019.
- ^ Don't take my word for it.
- ^ "Underpaid and overburdened: the life of a Facebook moderator". The Guardian. 27 May 2017. Retrieved 6 June 2019.
- ^ "Christchurch shootings: Social media races to stop attack footage". BBC News. 16 March 2019. Retrieved 6 June 2019.
Original WikiProject report – Earn $$$ free pharm4cy WORK FROM HOME replica watches ViAgRa!!!
- By Mabeenot, 18 July 2011
This week, we spent some time with WikiProject Spam. The project describes itself as a "voluntary Spam-fighting brigade" which seeks to eliminate the three types of Wikispam: advertisements masquerading as articles, external link spam, and references that serve primarily to promote the author or the work being referenced. WikiProject Spam applies policies regarding what Wikipedia is not and guidelines for external links. The project received some help in February 2007 when the English Wikipedia tagged external links as "NOFOLLOW", preventing search engines from indexing external links and limiting the incentive for many spammers to use Wikipedia as a search engine optimization tool. The project maintains outreach strategies, detailed steps for identifying and removing spam, a variety of search tools, several bots for detecting spam, and a big red button to report spam and spammers. The project was started by Jdavidb in September 2005 and has grown to include 371 members. One of the project's most active members, MER-C, agreed to show us around.
How much time do you typically devote each week to fighting spam?
- I find the time commitment required for anti-spam work to be extremely variable. Monitoring the IRC feed isn't particularly taxing; and it isn't too difficult to clean up a few possible copyright problems, edit a few articles or perform non-WP related work or leisure concurrently.
WikiProject Spam is the most active project by edits (including bots) and the second most watched project on Wikipedia. What accounts for this high activity and interest by the Wikipedia community?
- This is an illusion. 98% of those edits are from User:COIBot, a spam reporting bot. The remaining 2% are to the project's talk page, which serves as a noticeboard for reporting spam campaigns. A good chunk of the edits to the talk page are from a handful of anti-spam specialists. I can't explain the number of watchers though.
What type of wikispam do you come across most often? Do you use any special tools to detect spam or do you simply remove spam you notice while reading and editing articles?
- While reading articles and cleaning out the spam contained within haphazardly works, it doesn't address the cause of the problem. I target the spammers themselves, i.e. identifying domains owned by the spammer and systematically removing spammed links to said domains. To do it properly requires heavy use of tools beyond the usual contribution analysis:
- Special:Linksearch and its cross-wiki counterpart
- Cross-wiki contributions
- User:Versageek and User:Beetstra maintain a database of link additions to all Wikimedia projects. New links are reported to the IRC channel wikipedia-en-spam (don't go there yet, it's not currently working) and others. User:XLinkBot, a spam reversion bot, and User:COIBot use this channel as their source of link additions. Reports are triggered when a small group of users are responsible for a large fraction of link additions to a particular site or can be requested through IRC or User:COIBot/Poke (administrators and trusted users only).
- Various external tools, including Whois, reverse DNS lookups, HTML analysis, Google AdSense and Google Analytics databases and a bit of Google-fu.
- The Firefox extensions NoScript and RequestPolicy to detect redirects to other domains and protect against the mystery meat nature of spammed sites.
- A text editor that has fuzzy find and replace functionality, usually implemented using regular expressions.
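The report trigger mentioned in the list above (a small group of users responsible for a large fraction of link additions to one site) can be sketched as a simple concentration check. This is only an illustration, not COIBot's actual logic: the function name, the thresholds, the user names and the domains are all hypothetical.

```python
from collections import Counter

def concentrated_domains(additions, top_n=3, min_share=0.8, min_links=5):
    """Return domains where the `top_n` most active users added at least
    `min_share` of all links. All thresholds are hypothetical defaults."""
    per_domain = {}
    for user, domain in additions:
        per_domain.setdefault(domain, Counter())[user] += 1

    flagged = []
    for domain, users in per_domain.items():
        total = sum(users.values())
        if total < min_links:
            continue  # too few additions to say anything
        top = sum(count for _, count in users.most_common(top_n))
        if top / total >= min_share:
            flagged.append(domain)
    return sorted(flagged)

# Invented (user, domain) pairs: two accounts push one site,
# while ten unrelated editors each cite another site once.
additions = (
    [("SpamFan1", "example-shop.test")] * 6
    + [("SpamFan2", "example-shop.test")] * 3
    + [("Editor%d" % i, "example-news.test") for i in range(10)]
)
print(concentrated_domains(additions))  # ['example-shop.test']
```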
- I target external link additions, so I encounter vanilla external link spam most frequently. The most annoying and widespread spam campaigns, however, involve multiple spam tactics. That said, I've noticed the following recent spam trends -- note the tendency towards avoiding scrutiny from RC patrollers:
- The spreading of spam edits over multiple IP addresses and user accounts; one spam link per IP address/account isn't uncommon.
- Spam masquerading as citations. This typically involves the repeated addition of a certain "reference" by a given person; the spammy nature isn't apparent until you look at the big picture.
- Replacement of existing links and/or citations
- Inline spamming, the insertion of external links into article prose purely for search engine optimization
- Misleading edit summaries
Have you had any heated conversations with spammers after removing spam from an article? What are some strategies you've used to resolve these conflicts?
- Personal attacks, edit warring and vandalism are surefire ways to expedite blacklisting of the spammer's sites. A couple of months ago, I dealt with a spammer who edit warred to include links to his website. He responded by vandalising my userpage, and so the relevant sites were promptly blacklisted. Apart from a bad faith delisting request, we haven't heard from them since. This is typical; blacklisting is a very effective way of removing spammers from Wikipedia. (Unlike blocks, blacklisting requires money to evade—the spammer needs to purchase new Internet domains.)
Has your experience fighting spam resulted in any humorous stories? Have you heard any amusing excuses and special pleading from spammers trying to defend their edits?
- See Wikipedia:Grief for details on the usual routine of spammers.
An update on the Wikimedia Movement 2030 Strategy
Risker has held multiple positions within the Wikimedia community and is a member of the Roles & Responsibilities strategy working group.
FULBERT has worked with several WikiEdu programs and is a member of the Capacity Building strategy working group.
Jackiekoerner holds a doctorate in Higher Education and is a member of the Community Health strategy working group.-S
What has happened so far
Organizations and movements develop a strategic plan to guide their activities and planning over an extended period. A strategic plan helps the parts of the movement to work together to achieve overall goals. The last Wikimedia movement strategy covered 2010-2015. Since then there's been no consistent, global direction to guide the movement. The absence of a high-level plan creates challenges for different parts of the movement to work together toward shared goals. The movement began to address this gap in 2017, when the 2030 strategic direction was developed with community consultation, and was endorsed by many organized movement groups and individual contributors. The Wikimedia Foundation has been the financial sponsor of this process.
After the strategic direction was defined, nine working groups were formed to focus on different strategic areas, and started their work in mid-2018. Extensive workshops and sessions were held at the Wikimedia Summit in March 2019 and each group carried out research, consultations, community conversations, and formulated ideas that led to the first iteration of their recommendations.
There were "strategy salons" held around the globe, both in-person and online, which generated ideas for the working groups to consider and incorporate into their recommendations. Almost 90 recommendations were developed by the working groups, released in mid-2019 for further discussion within the community. Each group presented and workshopped its draft recommendations at Wikimania in August 2019.
Both contract and volunteer strategy liaisons worked with online communities, affiliates, and working groups, and held two regional conferences: one in East Africa and one with the East, Southeast Asia and the Pacific (ESEAP) Regional Cooperation.
Developing a long-term strategy is difficult even in straightforward circumstances. Doing so is even more challenging for a global volunteer movement that values diversity and community input, and also values knowledge-sharing and high quality information. Every working group received feedback from both organized and informal movement groups, as well as consultants, the coordination team, and of course individual community members. That feedback was considered, and was taken into account as the working groups prepared their second round of recommendations in preparation for the harmonization meeting in September 2019.
What is happening now
The participants of the September harmonization meeting refined key principles and identified groups of similar recommendations, but the session did not result in a fully synthesized set of draft recommendations. The working groups finished their work at the beginning of November. Some members of the working groups volunteered to complete the written draft recommendations, and this synthesis is ongoing.
What is yet to come
Once the draft recommendations are written, other members of the former working groups will review the document. Other working group members will be going through all of the accumulated research, consultation, and feedback to ensure that key points have been addressed in the synthesized set of recommendations. In January 2020, a further round of conversations with the movement will review the proposed recommendations prior to final revisions before submission to the WMF Board of Trustees.
Early next year, once the draft recommendations are public, the Strategy Core Team will reach out to the English Wikipedia to review the recommendations and understand what proposed changes would be relevant to this community. Community members from all areas of Wikimedia will be invited to participate in this round of conversations, which will start in January 2020. The invitations will be posted on noticeboards, mailing lists and other key community discussion points. Discussions will likely take place in a centralized location, although this process has not yet been finalized.
How many people edit in your favorite language? Where are they from?
Let's say you are interested in how many active editors from France are editing the English-language Wikipedia; or conversely, you'd like to know how many editors from the UK are editing the French-language Wikipedia. All the necessary information needed to calculate these numbers is recorded, at least temporarily, by the Wikimedia Foundation, but unless you worked for the WMF and had access to the Geoeditors Monthly database you could never find those numbers. The WMF did not wish to disclose this data out of concerns that the numbers were precise enough that governments or others could back out material that might lead to the identification of individual editors.
This month a new dataset was made public by the Wikimedia Foundation: Geoeditors/Public, or more informally, Active Editors by country. It allows the public to see, more or less, how many active editors (5–99 edits in a month) and very active editors (100+ edits) from about 180 individual countries contribute to active Wikipedia versions, each month from January 2019 onward. For example, if you wanted to know how many people editing from the UK made more than 99 edits to the French version of Wikipedia in September, you can look it up in this dataset. The answer is somewhere between 11 and 20.
Because of privacy concerns exact numbers are not given. Data from 30 countries are excluded, e.g. China, Kazakhstan, Russia, Saudi Arabia and Venezuela. Exact data on the number of editors in each category (editors from country x who edited Wikipedia version y) are not given. Rather these numbers are only given in “buckets” of ten: 1–10, 11–20, 21–30, 31–40, etc. Technical information is available here. The data are available here.
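The bucketing described above can be sketched in a few lines. This is only an illustration of the "buckets of ten" scheme as described here, not the WMF's actual release code; the function name is made up for the example.

```python
def bucket(count, width=10):
    """Map an exact editor count (>= 1) to its (lower, upper) bucket,
    e.g. 14 -> (11, 20). Only such ranges are published, never `count`."""
    if count < 1:
        raise ValueError("count must be at least 1")
    lower = ((count - 1) // width) * width + 1
    return lower, lower + width - 1

print(bucket(14))    # (11, 20)
print(bucket(1885))  # (1881, 1890)
```

The tables in this article report the lower bound of each bucket, so a listed value of 1,881 means the true count lies somewhere between 1,881 and 1,890.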
But enough of the preliminaries! What questions can the dataset answer that I’ve been dying to know the answer to? The following analysis is only the briefest overview of data from one month, September, quickly done. It’s not in any sense academic research, but hopefully will allow people to understand what type of data the dataset contains and what type of questions it can be used to address.
My main questions – of personal interest – are:
- What countries contribute most to the English-language Wikipedia (enwiki)? Are they the richer, or the more populous English-speaking countries? Or perhaps those countries where English is widely spoken as a second language?
- Do these relations differ across different Wikipedia language versions? Answering the above questions for the Spanish-language Wikipedia (eswiki) allows a simple comparison.
- And finally, how do contributions across countries to different language versions compare? Edits from the US and UK are examined here.
Who edits enwiki?
Table 1 shows the 11 countries with the most active editors and the 11 with the most very active editors to enwiki (14 countries total), plus two other large English-speaking countries, Ireland and South Africa. Numbers marked * are not in the largest 11.
Editors from | Editors with 100+ edits (lower bound) | % of total reported | Editors with 5–99 edits (lower bound) | % of total reported |
---|---|---|---|---|
United States | 1,881 | 42.9% | 25,401 | 41.0% |
United Kingdom | 731 | 16.7% | 7,491 | 12.1% |
Canada | 271 | 6.2% | 3,321 | 5.4% |
Australia | 231 | 5.3% | 2,491 | 4.0% |
India | 191 | 4.4% | 5,241 | 8.5% |
Germany | 121 | 2.8% | 1,281 | 2.1% |
Philippines | 81 | 1.8% | 1,021 | 1.6% |
Netherlands | 61 | 1.4% | 621* | 1.0% |
Italy | 51 | 1.2% | 831 | 1.3% |
New Zealand | 51 | 1.2% | 441* | 0.7% |
Sweden | 51 | 1.2% | 431* | 0.7% |
France | 41* | 0.9% | 791 | 1.3% |
Ireland | 41* | 0.9% | 661* | 1.1% |
Spain | 41* | 0.9% | 681 | 1.1% |
Brazil | 31* | 0.7% | 721 | 1.2% |
South Africa | 21* | 0.5% | 291* | 0.5% |
Total (in table) | | 88.8% | | 83.6% |
The countries with the most very active editors in enwiki are the US (43%) and the UK (17%), or almost 60% of the total reported editors between them. The two large rich countries predominate. Two rich but less populous countries, Canada and Australia, are also well-represented with almost 12% of the total very active editors between them.
The much smaller but still relatively rich New Zealand and Ireland, with about 1% of the total reported very active editors each, trail among those countries where English is the predominant first language.
The proportion of native English speakers by country is shown at English language#Pluricentric English. The four countries with the largest native English-speaking populations are also the largest four contributors to enwiki – in the same order: USA, UK, Canada, and Australia.
India, which has the 5th largest group of very active editors (4%) and third largest group of active editors (9%), has a very large population, for whom English is an important medium of instruction but the first language of only a small fraction. The Philippines, with nearly 2% of the reported very active editors, may be affected by similar factors as India. The percentages of reported active editors (5–99 edits) appear to be similar to the percentages for very active editors.
Six rich European Union countries where English is not the mother tongue, Germany, the Netherlands, Italy, Sweden, France and Spain, together account for 8.4% of the reported very active editors. Of the countries in this table, only the rankings of Brazil and perhaps South Africa do not appear to be directly explained by the three factors of mother tongue, population, and wealth.
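The combined shares quoted in this section are simple sums of the Table 1 percentages. A minimal sketch for re-checking them, with the values copied from the "% of total reported" column for very active editors (the helper name is just for illustration):

```python
# Percent of total reported very active editors (100+ edits), from Table 1.
very_active_share = {
    "United States": 42.9, "United Kingdom": 16.7,
    "Canada": 6.2, "Australia": 5.3,
    "Germany": 2.8, "Netherlands": 1.4, "Italy": 1.2,
    "Sweden": 1.2, "France": 0.9, "Spain": 0.9,
}

def combined(countries):
    """Sum the shares of the given countries, rounded to one decimal place."""
    return round(sum(very_active_share[c] for c in countries), 1)

print(combined(["United States", "United Kingdom"]))  # 59.6
print(combined(["Canada", "Australia"]))              # 11.5
print(combined(["Germany", "Netherlands", "Italy",
                "Sweden", "France", "Spain"]))        # 8.4
```

Because the underlying counts are bucket lower bounds, these sums are approximate rather than exact shares.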
Who edits eswiki?
Table 2 shows analogous rankings for the Spanish language Wikipedia. While Spain and Argentina combine for slightly over half of the reported very active editors, the very active editors are distributed more evenly over all the reported countries. Only one country without Spanish as its predominant language, the United States, has a fairly large proportion of the very active editors. The same three factors that seem to explain the rankings for enwiki editors, mother tongue, population, and wealth, may very well explain the rankings for eswiki as well.
Nevertheless, wealth – or perhaps dialect – may be playing a stronger role in eswiki than it does in enwiki. The 12 largest countries by native Spanish-speaking population are, in order, Mexico, Colombia, Spain, Argentina, the United States, Venezuela, Peru, Chile, Ecuador, Cuba, Guatemala, and the Dominican Republic. Note that Venezuela and Cuba are excluded by the WMF from the dataset. The population rankings for native English-speaking countries are almost identical to the rankings in Wikipedia contributions of the same countries. But the population rankings for native Spanish-speaking countries are much less similar to their rankings in Wikipedia Spanish-language contributions.
Editors from | Editors with 100+ edits (lower bound) | % of total reported | Editors with 5–99 edits (lower bound) | % of total reported |
---|---|---|---|---|
Spain | 211 | 35.9% | 3,881 | 35.4% |
Argentina | 101 | 17.2% | 1,421 | 13.0% |
Mexico | 71 | 12.1% | 1,471 | 13.4% |
Chile | 51 | 8.7% | 831 | 7.6% |
Colombia | 41 | 7.0% | 851 | 7.8% |
Peru | 31 | 5.3% | 631 | 5.8% |
Ecuador | 11 | 1.9% | 231 | 2.1% |
Nicaragua | 11 | 1.9% | 81* | 0.7% |
United States | 11 | 1.9% | 251 | 2.3% |
Uruguay | 11 | 1.9% | 211 | 1.9% |
unknown | 11 | 1.9% | 11* | 0.1% |
Bolivia | 1* | 0.2% | 101 | 0.9% |
Dominican Republic | 1* | 0.2% | 101 | 0.9% |
Total (in table) | | 96.0% | | 91.8% |
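One way to put a number on how closely the population ranking tracks the contribution ranking is a Spearman rank correlation. The sketch below is only illustrative: it uses the population order quoted above and the 100+ edits column of Table 2, and it is restricted to the nine countries that appear in both (Venezuela and Cuba are excluded from the dataset, and Guatemala is not listed in the table). The function names are our own, and because the counts are lower bounds the result should be read as a rough indication only.

```python
from math import sqrt

# Countries in the order of native Spanish-speaking population given in the
# text, limited to those that also appear in Table 2.
population_order = [
    "Mexico", "Colombia", "Spain", "Argentina", "United States",
    "Peru", "Chile", "Ecuador", "Dominican Republic",
]

# Lower-bound counts of very active editors (100+ edits) from Table 2.
very_active = {
    "Spain": 211, "Argentina": 101, "Mexico": 71, "Chile": 51,
    "Colombia": 41, "Peru": 31, "Ecuador": 11, "United States": 11,
    "Dominican Republic": 1,
}


def average_ranks(values):
    """Rank values from largest (rank 1) to smallest, averaging tied ranks."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Spearman's rho is the Pearson correlation of the two rank vectors.
population_ranks = [float(i + 1) for i in range(len(population_order))]
editor_ranks = average_ranks([very_active[c] for c in population_order])
rho = pearson(population_ranks, editor_ranks)
print(f"Spearman rank correlation: {rho:.2f}")
```

A value near 1 would mean the two rankings agree closely, while a value near 0 would mean they are largely unrelated. The same calculation could be repeated with the enwiki figures for comparison.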
US and UK editors editing on non-English Wikipedias
Table 3 shows how very active editors from the US and the UK edit the non-English Wikipedias. Altogether, very active editors from the US edit in 44 different Wikipedia versions, and those from the UK edit in 29 versions. The versions with at least 11 very active editors from the US are an interesting mix that includes the Chinese, Spanish, Farsi (Persian), Japanese, and Russian Wikipedias. The comparable list for UK editors includes only the French Wikipedia.
Version edited | From | Editors with 100+ edits (lower bound) |
---|---|---|
enwiki | United States | 1881 |
zhwiki | United States | 51 |
eswiki | United States | 11 |
fawiki | United States | 11 |
jawiki | United States | 11 |
ruwiki | United States | 11 |
simplewiki | United States | 11 |
37 others | United States | 37 |
enwiki | United Kingdom | 731 |
frwiki | United Kingdom | 11 |
27 others | United Kingdom | 27 |
So what else can you do with this dataset?
Time is the main variable of interest that was left out of the above examinations. Right now we could see how edit contributions from different countries changed over the nine months from January through September 2019. As time goes by, more months of data will be released, and the effect of time will likely be of greater interest. For example, suppose a new program were introduced to increase the number of editors from country Y. The full effects of the program might not be seen after 9 months, but after 2 or 3 years any effects would hopefully be visible in the data.
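A very simple version of such a comparison could look like the sketch below. Everything in it is an assumption made for illustration: the function name, the "YYYY-MM" month keys, and the example counts for the hypothetical country Y are placeholders and are not taken from the dataset. A plain before-and-after comparison like this cannot by itself show that a program caused a change.

```python
from statistics import mean


def before_after_change(monthly_counts, program_start):
    """Compare average monthly editor counts before and after a start month.

    monthly_counts maps "YYYY-MM" strings to editor counts for one country.
    program_start is the "YYYY-MM" month in which the program began; that
    month and all later months count as "after".
    Returns (mean_before, mean_after, relative_change).
    """
    # "YYYY-MM" strings sort chronologically, so string comparison is enough.
    before = [n for month, n in monthly_counts.items() if month < program_start]
    after = [n for month, n in monthly_counts.items() if month >= program_start]
    if not before or not after:
        raise ValueError("need at least one month on each side of program_start")
    mean_before, mean_after = mean(before), mean(after)
    if mean_before == 0:
        raise ValueError("mean before the program is zero; relative change undefined")
    return mean_before, mean_after, (mean_after - mean_before) / mean_before


# Placeholder values for the hypothetical "country Y"; NOT real data.
example = {
    "2019-01": 40, "2019-02": 40, "2019-03": 40,
    "2019-04": 50, "2019-05": 50, "2019-06": 50,
}
mean_before, mean_after, change = before_after_change(example, "2019-04")
print(f"{mean_before:.1f} -> {mean_after:.1f} ({change:+.0%})")
```

With only nine months of data the two averages rest on very few points, which is one reason a longer series would make this kind of comparison more informative.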
Another area of interest might involve combining this dataset with other datasets. For example, say a program is undertaken to increase the quality – rather than the quantity – of articles about country Z. Using this data in conjunction with data on readership might give a more complete understanding of the effects of the program.