Wikipedia:Bots/Requests for approval/William Avery Bot 2
- The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.
Operator: William Avery
Time filed: 12:11, Wednesday, February 12, 2020 (UTC)
Function overview: The template {{R from scientific name}} can take a parameter ('fish', 'insect', etc.) that subcategorises redirects from the scientific names of organisms to their common names. The parameter is often absent in cases where it could usefully be supplied. This bot will add the parameter where the correct value can be determined from the taxobox of the article that is the target of the redirect.
Automatic, Supervised, or Manual: Automatic
Programming language(s): Python, pywikibot
Source code available: https://bitbucket.org/WilliamAvery/wikipythonics Entry point for this task is redirectClassifierBot.py
Links to relevant discussions (where appropriate): I am not aware of any.
Edit period(s): One-time run to remove the current backlog. I will divide it into tranches using gcmstartsortkeyprefix and gcmendsortkeyprefix generator parameters. It might be re-run periodically in the future.
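A minimal sketch of how one tranche might be requested from the MediaWiki API, assuming the standard `categorymembers` generator. The function name `tranche_params` and the default limit are illustrative and are not taken from the bot's source.

```python
def tranche_params(start_prefix, end_prefix, limit=50):
    """Build MediaWiki API query parameters for one tranche of the category.

    start_prefix and end_prefix are sort-key prefixes that bound the tranche.
    """
    return {
        "action": "query",
        "format": "json",
        "generator": "categorymembers",
        "gcmtitle": "Category:Redirects from scientific names",
        "gcmnamespace": "0",           # mainspace only
        "gcmsort": "sortkey",          # the prefix parameters apply to sort-key ordering
        "gcmstartsortkeyprefix": start_prefix,
        "gcmendsortkeyprefix": end_prefix,
        "gcmlimit": str(limit),
    }


# Example: parameters for redirects whose sort keys fall between "A" and "C".
params = tranche_params("A", "C")
```

Splitting the run this way keeps each batch small enough to review before the next one starts.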
Estimated number of pages affected: 6000–8000 (25–30% of 25000 category members)
Namespace(s): Mainspace
Exclusion compliant (Yes/No): Yes
Function details:
The task can be accomplished with a simple Python script.
- Retrieve pages in Category:Redirects from scientific names
- Fetch the HTML for the target page of each redirect and use BeautifulSoup to extract taxonomic data from the taxobox. Because the automatic taxobox system may be involved, examining the rendered HTML of the taxobox is the cleanest route I can see to the required taxonomic information.
- Run the algorithm to determine the parameter value from the taxonomy
- Use mwparserfromhell to add the parameter value to the wikitext. The template {{R from scientific name}} has a slew of redirects, so it may appear in the wikitext under any of several names.
- Update page
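The taxobox-reading step above can be sketched as follows. This is an illustration only: it uses the standard library's `html.parser` instead of BeautifulSoup so that it runs without third-party packages. It also assumes that the rendered taxobox is a table whose class list includes `biota`, and that each rank appears as a two-cell row such as "Class:" followed by the taxon name. The names `TaxoboxParser` and `extract_ranks` are hypothetical.

```python
from html.parser import HTMLParser


class TaxoboxParser(HTMLParser):
    """Collect 'Rank: Taxon' rows from a table whose class includes 'biota'."""

    def __init__(self):
        super().__init__()
        self.in_taxobox = False
        self.depth = 0        # nesting depth of tables inside the taxobox
        self.cells = None     # cell texts of the row being read, or None
        self.in_cell = False
        self.ranks = {}       # e.g. {"class": "Insecta"}

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            classes = (dict(attrs).get("class") or "").split()
            if self.in_taxobox:
                self.depth += 1
            elif "biota" in classes:
                self.in_taxobox = True
                self.depth = 1
        elif self.in_taxobox and tag == "tr":
            self.cells = []
        elif self.in_taxobox and tag in ("td", "th") and self.cells is not None:
            self.cells.append("")
            self.in_cell = True

    def handle_endtag(self, tag):
        if tag == "table" and self.in_taxobox:
            self.depth -= 1
            if self.depth == 0:
                self.in_taxobox = False
        elif tag in ("td", "th"):
            self.in_cell = False
        elif tag == "tr" and self.cells is not None:
            # Keep only two-cell rows whose first cell looks like "Rank:".
            if len(self.cells) == 2 and self.cells[0].strip().endswith(":"):
                rank = self.cells[0].strip().rstrip(":").lower()
                self.ranks[rank] = self.cells[1].strip()
            self.cells = None

    def handle_data(self, data):
        if self.in_cell and self.cells:
            self.cells[-1] += data


def extract_ranks(html):
    """Return a dict mapping lower-cased rank names to taxon names."""
    parser = TaxoboxParser()
    parser.feed(html)
    return parser.ranks
```

Reading the rendered HTML rather than the wikitext means the same code would cover both manual and automatic taxoboxes, which is the reason given above for this route.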
Preliminary examination of a couple of thousand pages indicates many cases where 'fish' or 'insect' needs to be added, a few requiring 'crustacean', 'spider' or 'fungus', and none requiring 'plant'. (Plant articles are mostly at their scientific names.)
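As a rough illustration of how the parameter values named above might be chosen and written into a redirect, the sketch below maps a few well-known taxa to values and adds the value to a parameterless template call. The rule table is an assumption and not the bot's actual algorithm. A regular expression stands in for mwparserfromhell so that the example needs only the standard library, and the value is assumed to be passed as the first unnamed parameter. `RULES`, `classify`, `TEMPLATE_NAMES` and `add_parameter` are hypothetical names.

```python
import re

# Hypothetical (rank, taxon, parameter value) rules; the bot's real rules may differ.
RULES = [
    ("class", "Insecta", "insect"),
    ("order", "Araneae", "spider"),
    ("kingdom", "Fungi", "fungus"),
    ("subphylum", "Crustacea", "crustacean"),
    ("class", "Actinopterygii", "fish"),
    ("class", "Chondrichthyes", "fish"),
    ("kingdom", "Plantae", "plant"),
]

# Only the canonical name is listed; real template redirects would be appended here.
TEMPLATE_NAMES = ["R from scientific name"]


def classify(ranks):
    """Return a parameter value for a dict of rank -> taxon, or None if no rule applies."""
    for rank, taxon, value in RULES:
        if ranks.get(rank) == taxon:
            return value
    return None


def add_parameter(wikitext, value, names=TEMPLATE_NAMES):
    """Add value to the first parameterless use of the template; leave other uses alone."""
    alternatives = "|".join(re.escape(name) for name in names)
    pattern = re.compile(r"\{\{\s*(" + alternatives + r")\s*\}\}", re.IGNORECASE)
    return pattern.sub(
        lambda m: "{{" + m.group(1) + "|" + value + "}}", wikitext, count=1
    )


# Example: a redirect whose target's taxobox gives class Actinopterygii.
value = classify({"kingdom": "Animalia", "class": "Actinopterygii"})
if value is not None:
    new_text = add_parameter(
        "#REDIRECT [[Goldfish]]\n\n{{R from scientific name}}", value
    )
```

Because the pattern matches only calls without parameters, a redirect that already carries a value is returned unchanged.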
Discussion
- Approved for trial (50 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Let's see how this does and go from there. William Avery, take all the time you need (no rush). --TheSandDoctor Talk 20:53, 7 March 2020 (UTC)
- Thank you. Just this evening I started looking at implementing this task with pywikibot, as the novelty of node.js is wearing off. William Avery (talk) 21:07, 7 March 2020 (UTC)
- I have edited the BRFA information above to reflect my revised approach. William Avery (talk) 20:51, 10 March 2020 (UTC)
- Trial complete and results checked: 50 edits.
- N.B. During the development process, on 9 March, I logged in under the bot account to create a bot password to use with pywikibot. I then inadvertently moved a page and made a couple of edits whilst logged in under the bot account. I'm aware I shouldn't make such edits using a flagged account. William Avery (talk) 07:50, 15 April 2020 (UTC)
- Approved. Primefac (talk) 18:15, 22 May 2020 (UTC)
- The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.