Wikipedia talk:AutoWikiBrowser/Typos

This is an old revision of this page, as edited by Certes (talk | contribs) at 11:08, 6 July 2022 (MOS:CURLY: converting italic to bold). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.



Hyphenated phrase

The hyphen is not removed from "less-populated". MB 04:11, 19 April 2022 (UTC)

@MB I just added a rule for you to fix both "less-populated" and "more-populated". GoingBatty (talk) 04:34, 19 April 2022 (UTC)

I'm getting a lot of what I consider false positives for the ly-hyphens. Can somebody point me in the direction of the style guide for that rule? Smasongarrison (talk) 18:46, 9 May 2022 (UTC)

@Smasongarrison See the response I received from BD2412 on Wikipedia_talk:AutoWikiBrowser/Typos/Archive_4#privately-. Happy editing! GoingBatty (talk) 18:50, 9 May 2022 (UTC)
Thanks! Smasongarrison (talk) 18:53, 9 May 2022 (UTC)

Testing with JWB

Perhaps everyone else knew this already, and there may well be an easier way to do it, but I've finally found a way to test new additions without riskily adding them to the public list or going through the tedious and error-prone process of copying and pasting every regexp into the UI. To add a custom set of typos in a format matching AWB/T to the list, start JWB, invoke the browser's JavaScript console and paste

  RETF.list = []; // Empty the list - only needed for iterative testing
  (new mw.Api()).get({
    action: 'query',
    prop: 'revisions',
    titles: 'User:Example/typos', // Substitute the title of your typo list page here
    rvprop: 'content',
    rvlimit: '1',
    indexpageids: true,
    format: 'json',
  }).done(RETF.buildList);

Omit the first line to retain the standard list, but it's useful to get rid of a broken custom list before retesting after a fix. The titles: line can be any Wikipedia page, e.g. User:You/sandbox. Certes (talk) 21:03, 20 April 2022 (UTC)

Certes, in Bawl you can now enter a custom page title to be used for RegExTypoFix. Only one page will be used so if a custom title is given the regular RETF won't be used. To test your entries, enter a page title for RETF to use instead of the default, save the settings, enter some text, press the magnifying glass and press the AWB RegExTypoFix button. Bawl will immediately report which (if any) rules matched something. Afterwards, empty the custom page title and save the settings to revert to the title that is associated with your wiki according to d:Q6585066. Alexis Jazz (talk or ping me) 22:37, 17 May 2022 (UTC)
Thanks. I've not been using Bawl but it looks useful; I'll investigate it soon. Certes (talk) 22:40, 17 May 2022 (UTC)

Duplicate word=

We have a few duplicated values for word= in the typo list. Do these need to be made unique? List: "-ality", "First (3)", "Its (after)", "Its (before)", "Nonoperational", "Predecessor", "Regardless", "Sanskrit", "Thaw", "e.g.", "east–west", "km²", "north–south", "south–north", "sworn in", "west–east". (I was checking in case I duplicated any, but someone seems to have beaten me to it.) Certes (talk) 22:47, 20 April 2022 (UTC)

Also, we have a typo entry marked disable=. Should that be disabled=, or are the two equivalent (perhaps anything other than word= works)? Certes (talk) 11:41, 21 April 2022 (UTC)
@Certes: If I remember correctly, the AWB implementation just checks that "word=" is present, but doesn't do anything else with it. So, yes, changing "word" to anything else will disable a rule. Duplicate names have no effect, but it's easier to refer to a rule in edit summaries and discussions if they are unique. It's time I downloaded the source code again. -- John of Reading (talk) 14:55, 21 April 2022 (UTC)
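
For anyone who wants to repeat the duplicate check, here is a minimal sketch in JavaScript. It assumes the list is available as plain text with one `<Typo word="..." .../>` entry per rule; the function name `findDuplicateWords` and the sample rules are made up for illustration and are not taken from the live list. In line with the recollection above, an entry whose attribute has been renamed (for example to disabled=) is simply not counted.

```javascript
// Hypothetical helper: return every word="..." value that occurs more than once.
function findDuplicateWords(listText) {
  const counts = new Map();
  // Only entries that still carry a word= attribute are counted;
  // entries renamed to disable=/disabled= are skipped by this pattern.
  const pattern = /<Typo\s+word="([^"]*)"/g;
  for (const match of listText.matchAll(pattern)) {
    counts.set(match[1], (counts.get(match[1]) || 0) + 1);
  }
  return [...counts].filter(([, n]) => n > 1).map(([word]) => word);
}

// Made-up sample rules, for illustration only.
const sample = [
  '<Typo word="Regardless" find="\\bregardles\\b" replace="regardless"/>',
  '<Typo word="Thaw" find="\\bthawe\\b" replace="thaw"/>',
  '<Typo word="Regardless" find="\\bregaurdless\\b" replace="regardless"/>',
  '<Typo disabled="Thaw" find="\\bthaww\\b" replace="thaw"/>',
].join('\n');

console.log(findDuplicateWords(sample)); // only "Regardless" is reported
```

The text could come from the same API query shown in the JWB section, or from a copy of the page source pasted into a string.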

"libration war"

Hi, we currently have 107 examples of "libration war", please can they be changed to "liberation war"? Ta ϢereSpielChequers 21:37, 27 April 2022 (UTC)

In progress, done. Neils51 (talk) 03:37, 28 April 2022 (UTC)

MilliWatt = MediaWiki

<Typo word="W (watt)" find="([\d\.]+(?:[−―–—\s]| )?[µmkMGT])w\b" replace="$1W"/> changes ".mw-first-heading" (a CSS class of #firstHeading) to ".mW-first-heading". For a non-code example, the ccTLD for Malawi (http://www.registrar.mw/) also matches. Found only three live bad replacements: 2004 New Zealand local elections (diff 457286863), Gulf University for Science and Technology (diff 708765198) and What's Going On up There? (diff 660471548). Alexis Jazz (talk or ping me) 04:06, 4 May 2022 (UTC)

@Alexis Jazz: Could we fix this by ensuring a digit appears before the period, such as this: find="(\d[\d\.]*(?:[−―–—\s]| )?[µmkMGT])w\b" GoingBatty (talk) 12:37, 4 May 2022 (UTC)
...or indeed after the period with just find="(\d(?:…, as ".123 mW" seems more likely than "123. mW". That also avoids domains such as "source123.mw". Certes (talk) 13:30, 4 May 2022 (UTC)
Certes, GoingBatty, ensuring there's a digit sounds good. The digit would have to appear after the period (if there is a period) as .1mW is sometimes used for 0.1mW. Alexis Jazz (talk or ping me) 14:24, 4 May 2022 (UTC)
@Alexis Jazz Fixed! GoingBatty (talk) 18:33, 4 May 2022 (UTC)
GoingBatty, thanks! I think originally it was meant to also match 5.−mw. Seems like an unusual way to write it to me (it's more common for prices?), but the −―–— is probably not really needed anymore when not matching a period. Edit: you're right, matching "48-kw engine" makes more sense. Alexis Jazz (talk or ping me) 12:03, 5 May 2022 (UTC)
@Alexis Jazz I think the dashes are needed for something like "a 48-kw engine". GoingBatty (talk) 14:16, 5 May 2022 (UTC)
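
The three variants discussed in this thread can be compared side by side with a short JavaScript sketch. The patterns are copied from the comments above, and the third one completes Certes's elided `(\d(?:…` with the same separator and prefix classes, which is an assumption; the rule actually saved to the list may differ. The variable names and `applyRule` are hypothetical, and JavaScript's regex engine is not the one AWB uses, so treat the results as indicative only.

```javascript
// Separator and prefix classes as quoted in the rule above (the blank is copied as-is).
const SEP = '(?:[−―–—\\s]| )?';
const UNIT = '[µmkMGT]';

// Original rule: any run of digits or periods may precede the unit.
const original = new RegExp('([\\d\\.]+' + SEP + UNIT + ')w\\b', 'g');
// GoingBatty's proposal: the run must start with a digit.
const digitFirst = new RegExp('(\\d[\\d\\.]*' + SEP + UNIT + ')w\\b', 'g');
// Certes's proposal (assumed completion): a digit directly before the separator/unit.
const digitAdjacent = new RegExp('(\\d' + SEP + UNIT + ')w\\b', 'g');

// Apply a rule the way the typo list does: keep group 1, capitalise the W.
function applyRule(rule, text) {
  return text.replace(rule, '$1W');
}

const samples = ['.mw-first-heading', 'http://www.registrar.mw/', 'source123.mw', '.1mw', '5 mw'];
for (const text of samples) {
  console.log(text, '|', applyRule(original, text), '|',
    applyRule(digitFirst, text), '|', applyRule(digitAdjacent, text));
}
```

In this sketch the original pattern rewrites all five samples, the digit-first pattern still rewrites "source123.mw", and the digit-adjacent pattern changes only ".1mw" and "5 mw".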

Olso

Saw this edit correcting a typo of Oslo; I had run AWB with Regex on that page right before, so it would have been fixed earlier if it was in. Just made me think it might be worth adding if someone familiar with the process would like to. Cheers! --TylerBurden (talk) 12:50, 21 June 2022 (UTC)

I fixed about 30 Olso→Oslo typos in April. There are a few dozen false positives, including some typos for also, Olsen, etc. and the usual verbatim quotes of mistyped sources, so I didn't create a rule, but it might be useful if applied carefully. Certes (talk) 14:20, 21 June 2022 (UTC)

MOS:CURLY

@Trebuchette: To what extent have these new rules been tested? After a quick check, using User:John of Reading/X3, I don't think they work in AWB itself, because AWB automatically protects quoted text from typo-fixing. And the "CURLY SINGLE QUOTES" rule could cause formatting damage if a curly quote is placed next to a straight quote, as the resultant double-straight-quote will trigger italic markup. -- John of Reading (talk) 07:04, 6 July 2022 (UTC)

Also beware of converting italic to bold, as in Spielberg wrote Amblin´. Certes (talk) 11:07, 6 July 2022 (UTC)
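
Both hazards can be reproduced with a rough JavaScript sketch. The function `straightenSingleQuotes` is a hypothetical stand-in for the "CURLY SINGLE QUOTES" rule, not the rule itself; it assumes the rule also converts the acute accent ´, and it assumes the Amblin´ example was wrapped in italic markup in the wikitext. `apostropheRuns` is likewise a made-up helper that lists the runs of two or more straight apostrophes, which is what MediaWiki reads as italic or bold markup.

```javascript
// Hypothetical stand-in for a curly-to-straight single quote rule.
function straightenSingleQuotes(text) {
  return text.replace(/[‘’´]/g, "'");
}

// Runs of two or more straight apostrophes are wiki markup ('' italic, ''' bold).
function apostropheRuns(text) {
  return text.match(/'{2,}/g) || [];
}

// True if the replacement created or changed any markup run.
function changesMarkup(text) {
  const before = apostropheRuns(text).join(' ');
  const after = apostropheRuns(straightenSingleQuotes(text)).join(' ');
  return before !== after;
}

// Curly quote next to a straight quote: a new '' run appears.
console.log(straightenSingleQuotes("‘'Tis the season"));
// Italic closing markup preceded by ´: the closing '' becomes '''.
console.log(straightenSingleQuotes("Spielberg wrote ''Amblin´''."));
// An ordinary curly apostrophe is harmless.
console.log(changesMarkup("It’s fine"));
```

A check along the lines of `changesMarkup` could be used to skip any replacement that alters the apostrophe runs on a page.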