Wikipedia talk:AutoWikiBrowser/Typos

This is an old revision of this page, as edited by BillFlis (talk | contribs) at 13:26, 16 December 2021 (Presidents, etc.). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.



Misplaced sign when space is used as the thousands separator (#2)

Reported here a year ago, but the problem (changes: "5 000€" -> "5 €000" etc.) still exists: special:diff/1040830307. Older diffs: special:diff/596345000, special:diff/738453701, special:diff/611337488. 85.23.79.231 (talk) 17:24, 22 September 2021 (UTC)

Bumping this to avoid archiving before it's fixed. 85.23.79.231 (talk) 17:11, 3 November 2021 (UTC)
  Fixed here. ― Qwerfjkl (talk) 20:32, 28 November 2021 (UTC)

Museum

The following misspellings appear to exist: musuem, musueum, museuem, mueseum, muesuem, muesem, muesum. I suggest the following update to the regex:
From find="\b([mM])usu?em(s)?\b" to find="\b([mM])ue?su?e?u?e?m(s)?\b". - Neils51 (talk) 12:04, 11 October 2021 (UTC)

@Neils51: Your suggestion would also match the correct spelling "museum", which these typo rules must avoid. I have expanded the "Museum" rule to catch these misspellings, and will run AWB to fix them all. Thanks! GoingBatty (talk) 13:19, 11 October 2021 (UTC)
@Neils51: Fixed 22 with the expanded typo rule and 31 manually, added 1 {{not a typo}}, added 14 {{R from misspelling}}, and submitted 1 {{rename media}} request. GoingBatty (talk) 14:46, 11 October 2021 (UTC)
Excellent, thanks! - Neils51 (talk) 21:11, 11 October 2021 (UTC)
Need to add musesum to the list. - Neils51 (talk) 03:07, 8 November 2021 (UTC)
@Neils51: Added to the typo rule. Fixed 12 articles, most of which had the typo in an area that the typo rule doesn't touch, so I fixed them manually. GoingBatty (talk) 04:15, 8 November 2021 (UTC)
Thanks again. I think that each year there should be a word that wins a prize for the most ways that editors can find to misspell it. It might need to be a 'silent' award, or else some may endeavor to game it. - Neils51 (talk) 04:24, 8 November 2021 (UTC)
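
Editorial sketch for the thread above: the objection that the suggested pattern also matches the correct spelling can be checked directly. The two patterns are the ones quoted in the thread; `GUARDED` and `fix_museum` are hypothetical illustrations (a negative lookahead that rejects "museum"/"museums"), not the expanded rule that was actually deployed, which the thread does not quote. Python's `re` is used here in place of AWB's .NET regex engine, so `$1` is written `\1`.

```python
import re

# Patterns quoted in the thread.
CURRENT = re.compile(r"\b([mM])usu?em(s)?\b")
PROPOSED = re.compile(r"\b([mM])ue?su?e?u?e?m(s)?\b")

# Hypothetical guard (assumption, not the deployed rule): the negative
# lookahead refuses to match when the word is already spelled correctly.
GUARDED = re.compile(r"\b([mM])(?!useums?\b)ue?su?e?u?e?m(s)?\b")

# Misspellings listed in the original suggestion.
MISSPELLINGS = ["musuem", "musueum", "museuem", "mueseum",
                "muesuem", "muesem", "muesum"]


def fix_museum(text: str) -> str:
    """Rewrite every spelling matched by GUARDED as museum/Museum(s)."""
    return GUARDED.sub(r"\1useum\2", text)


if __name__ == "__main__":
    # The proposed pattern matches the correct word, which a typo rule must not do.
    print(bool(PROPOSED.fullmatch("museum")))
    print(fix_museum("The Musuem and two muesums"))
```

The guard keeps the broad character pattern while leaving correctly spelled text alone; whether a lookahead or a narrower alternation is preferable for the real rule list is a judgment call for its maintainers.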

New Additions

"Old, stable rules (>1 year since last edit) can be sorted into their appropriate sections." How about something like this <!--CCYYMMDD--> as a suffix to each new addition, containing last edit date? (removed when moved; later, script/code could do the moves) - Neils51 (talk) 21:14, 5 November 2021 (UTC)Reply

Skiier(s)

Suggested addition: skiier, skiiers. The average is around 2 a month. Perhaps the following? Find: \b([sS])ki(?:i+)er(s?)\b, replace: $1kier$2 - Neils51 (talk) 02:19, 11 November 2021 (UTC)

@Neils51: Added the rule (it's rule 4000!), which fixed 9 misspellings. I fixed other misspellings manually, added {{R from misspelling}} to some redirects, and submitted a request to rename Category:Harvard Crimson skiiers. GoingBatty (talk) 03:23, 11 November 2021 (UTC)
Thanks @GoingBatty:, I trust you are fine with doing it this way. I have done a little regex work in a previous life; however, I would rather make suggestions and bow to the superior experience of you and others than make a mess of the list! - Neils51 (talk) 07:17, 11 November 2021 (UTC)
@Neils51: When you're ready, be bold and add your own rules. This is a collaborative, friendly environment where we all help each other and tweak the rules together, and we would be happy to have you join in! GoingBatty (talk) 13:37, 11 November 2021 (UTC)
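
Editorial sketch for the thread above: the suggested find/replace pair can be tried out as follows. The pattern is the one from the suggestion; the function name `fix_skiier` is hypothetical, and Python's `\1`/`\2` stand in for AWB's `$1`/`$2`.

```python
import re

# Find pattern from the suggestion: "ski" followed by one or more extra "i".
SKIIER = re.compile(r"\b([sS])ki(?:i+)er(s?)\b")


def fix_skiier(text: str) -> str:
    """Collapse skiier/skiiers (any number of extra i's) to skier/skiers."""
    return SKIIER.sub(r"\1kier\2", text)


if __name__ == "__main__":
    print(fix_skiier("A skiier and two Skiiers"))
```

Because `i+` requires at least one extra "i", the correct forms "skier", "skiers" and "skiing" are left untouched.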

Enmedio

Another one for the avoid list. ("Emm-") - Neils51 (talk) 19:53, 19 November 2021 (UTC)

@Neils51: Fixed the "Emm-" rule. GoingBatty (talk) 18:13, 21 November 2021 (UTC)
Thanks for that! - Neils51 (talk) 19:49, 21 November 2021 (UTC)

Lowercase company

@Chris the speller: You added the "lower-case c" rule which changes "Company" to "company". In this edit, Bebington reverted my changes (which included several instances of "Company" to "company"), stating "Company has a capital when referrng to a specific company when it is a part of its title. compsny would be the generic". Could you two please discuss what the proper capitalization should be? Thanks! GoingBatty (talk) 18:03, 21 November 2021 (UTC)

Bebington should read and follow MOS:INSTITUTIONS, which says:
  • Generic words for institutions, organizations, companies, etc., and rough descriptions of them (university, college, hospital, church, high school) do not take capitals:
Incorrect (generic): The University offers programs in arts and sciences.
Correct (generic): The university offers programs in arts and sciences.
Correct (proper name): The University of Delhi offers programs in arts and sciences.
Just knowing what company or university is being referred to ("the company" vs. "a company") does not constitute a reason for upper case. Chris the speller yack 22:23, 21 November 2021 (UTC)

Rugby lague - Rugby league

Please can we have a rule for "Rugby lague" -> "Rugby league"? I'm working my way through a current crop of 31, so I think it is common enough to be worthwhile. ϢereSpielChequers 21:56, 21 November 2021 (UTC)

@WereSpielChequers: Added! GoingBatty (talk) 22:41, 21 November 2021 (UTC)
Ta muchly. That saves me adding it to my regular stuff. ϢereSpielChequers 22:42, 21 November 2021 (UTC)

eg to e.g.

.eg is the Internet country code domain for Egypt. Can eg be left alone when it appears in a string separated by dots (e.g. www.someplace.edu.eg or www.someplace.gov.eg)? MB 01:37, 4 December 2021 (UTC)

@MB: Probably - could you please give an example where the typo rule incorrectly wants to update a domain? Thanks! GoingBatty (talk) 05:06, 4 December 2021 (UTC)
It happened in this version, in the external links section, but won't in the current version because the domain is no longer plain text. MB 16:41, 4 December 2021 (UTC)
@MB: Those email addresses weren't appropriate for the article, and converting them to URLs wasn't appropriate either, so I've deleted them. Any other instances of bad typo fixing? GoingBatty (talk) 17:57, 4 December 2021 (UTC)
No, but I recall this happening before with .ie (Ireland). I thought a rule was updated at that time and that this was the same thing. MB 18:06, 4 December 2021 (UTC)
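
Editorial sketch for the thread above: one way a rule could skip "eg" inside dotted strings is a lookbehind that refuses a preceding dot or word character. The pattern `EG` and the function `fix_eg` are hypothetical; the thread does not quote the actual AWB "e.g." rule, and this sketch handles only lowercase "eg".

```python
import re

# Hypothetical pattern (assumption, not the AWB rule):
#   (?<![\w.])  not preceded by a letter/digit or a dot (so ".eg" is skipped)
#   eg\b        the bare token "eg"
#   (?!\.\w)    not followed by ".<letter>" (so "eg.example" is skipped)
#   \.?         swallow an existing trailing dot, as in "eg."
EG = re.compile(r"(?<![\w.])eg\b(?!\.\w)\.?")


def fix_eg(text: str) -> str:
    """Replace standalone 'eg' or 'eg.' with 'e.g.', leaving domains alone."""
    return EG.sub("e.g.", text)


if __name__ == "__main__":
    print(fix_eg("fruit, eg apples"))
    print(fix_eg("see www.someplace.edu.eg"))
```

Email addresses ending in a bare "@eg" and similar edge cases are not covered here; the sketch only addresses the dotted-domain case raised in the thread.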

Publishers Weekly

  Resolved

I've fixed a few mentions of Publisher's Weekly, which should presumably refer to Publishers Weekly (example). I am wary of continuing as we have about 1000 cases, suggesting that I may be the one out of step here rather than a thousand other editors. A sanity check would be welcome before I go further. Also, do we have a bot or other process for handling widespread errors, or is it better to continue manually? The only false positives I've found so far are cases like Publishers Weekly#cite_note-twsOctJ22-22, which quotes a source describing Publisher's Weekly [sic]. Certes (talk) 12:52, 7 December 2021 (UTC)

To my surprise, we already have the typo listed. These cases must be a combination of articles which AWB hasn't visited recently, and parameters such as |website= in templates which AWB would skip. Certes (talk) 16:36, 7 December 2021 (UTC)
This does look like one of those errors that are common within references, with the complication that search doesn't easily differentiate between Publisher's and Publishers. So a bespoke AWB run is probably needed. I'd do it, but I don't currently have a machine that runs Windows. ϢereSpielChequers 22:16, 7 December 2021 (UTC)
Thanks for the feedback. I'll do another 100 or so for now, then finish the job if they attract no adverse comments. I don't use Windows either, but find AWB usable on Linux and use JWB for simple stuff like this. Certes (talk) 23:53, 7 December 2021 (UTC)
@Certes: Remember that AWB's typo rules don't fix text within italics, and Publisher's Weekly would probably be in italics when in prose. Besides a dedicated run, the best bet would be for us to copy the rule into our default Find and Replace rules, so we fix the typos while we're doing other things. GoingBatty (talk) 04:07, 8 December 2021 (UTC)
I've completed a dedicated run in batches, leaving a few older citations of The Publishers' Weekly where appropriate. Thanks for the advice. Certes (talk) 01:30, 12 December 2021 (UTC)

Gold medals

I saw AWB remove the first hyphen in "gold-medal-winning team". I think it should stay and the second hyphen should be changed to an en-dash. BillFlis (talk) 11:18, 16 December 2021 (UTC)

Presidents, etc.

The behavior is currently uneven: President --> president, Vice-President --> Vice-president, but Treasurer is unchanged. BillFlis (talk) 12:31, 16 December 2021 (UTC)

... and Vice-Chairman --> vice-chairman. BillFlis (talk) 12:37, 16 December 2021 (UTC)
... and Directors also remains unchanged. This unevenness looks very odd when multiple corporate officers are discussed in the same paragraph. BillFlis (talk) 12:57, 16 December 2021 (UTC)

Hyphenation rules

They apparently don't work correctly for a leading parenthesis: in "(1808-1810, 1812–1813)", only the second hyphen got changed to an en-dash. BillFlis (talk) 13:26, 16 December 2021 (UTC)
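
Editorial sketch for the report above: a year-range rule that does not depend on what precedes the first year, such as an opening parenthesis, might look like the following. The pattern `YEAR_RANGE` and the function `fix_year_ranges` are hypothetical; the thread does not quote the actual AWB hyphenation rule or identify the cause of the miss.

```python
import re

EN_DASH = "\u2013"

# Hypothetical pattern (assumption, not the AWB rule): two four-digit
# numbers joined by a hyphen. The lookarounds only exclude adjacent digits
# and hyphens, so "(" or "," before the range does not block a match,
# while longer hyphenated digit groups (e.g. 978-1234-5678) are skipped.
YEAR_RANGE = re.compile(r"(?<![\d-])(\d{4})-(\d{4})(?![\d-])")


def fix_year_ranges(text: str) -> str:
    """Replace the hyphen in a four-digit year range with an en-dash."""
    return YEAR_RANGE.sub(rf"\1{EN_DASH}\2", text)


if __name__ == "__main__":
    print(fix_year_ranges("(1808-1810, 1812\u20131813)"))
```

The sketch treats any pair of four-digit numbers as a year range, so a real rule would likely need further exclusions.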
