Currently, for monolingual text and lexemes, Wikibase uses the default scope of LanguageNameUtils, which returns only "defined" languages (LanguageNameUtils::DEFINED, i.e. language names defined in MediaWiki itself or via $wgExtraLanguageNames). If it instead requested all known languages using LanguageNameUtils::ALL, the result would include all the codes known to the CLDR extension, including the ones from CldrNamesEn.php.
- This would make another 230+ languages available, reducing the number of languages that have to be lumped together under mis, the ISO 639 code for uncoded languages (related: T289776)
- There are existing requests for at least 18 of these: T313782, T332265, T332256, T214238, T332258, T320984, T316004, T332262, T332259, T314458, T317497 (akk, hit), T332255 (bum, ken, sba), T321957 (dum), T321979 (mga, sga)
- Most, if not all, of the extra languages that Wikibase defines for monolingual text and lexemes would no longer be necessary (Wikibase does not add language names for its extra languages, so they all have to be added to the CLDR extension too).
- Monolingual text and lexemes would use the same set of languages: T320889
- If the new set includes any language codes that we decide we don't want, there is already a way to exclude codes for monolingual text (link), and T320887 requests the same for lexemes.
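To see which codes the switch would actually add on a given wiki, the two scopes can be compared directly. The following is only a rough sketch, not Wikibase's own code: it assumes it is run inside a MediaWiki environment (for example through maintenance/shell.php) with the CLDR extension installed, and it requests names in English on the assumption that CLDR contributes its extra codes only when a display language is given rather than autonyms. The variable names are illustrative.

```php
<?php
// Sketch only: must run inside MediaWiki with the CLDR extension enabled.
use MediaWiki\Languages\LanguageNameUtils;
use MediaWiki\MediaWikiServices;

$utils = MediaWikiServices::getInstance()->getLanguageNameUtils();

// Assumption: asking for English names (instead of autonyms) lets CLDR
// add the codes it knows about when the ALL scope is requested.
$defined = $utils->getLanguageNames( 'en', LanguageNameUtils::DEFINED );
$all = $utils->getLanguageNames( 'en', LanguageNameUtils::ALL );

// Codes that would become available by switching from DEFINED to ALL.
$added = array_diff( array_keys( $all ), array_keys( $defined ) );
sort( $added );

echo count( $added ) . " additional codes\n";
echo implode( ', ', $added ) . "\n";
```

The resulting list could then be checked against the existing language requests and against any codes we would want to exclude.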
Acceptance Criteria:
- Monolingual text and Lexeme language codes are derived from LanguageNameUtils::ALL
- Label language codes are unaffected and still derived from LanguageNameUtils::DEFINED
- Documentation around the language addition process is updated in:
- Existing language-related tasks are either declined or updated to match the consequences of this change