Feature request: add detection for disambiguation pages to Scribunto
Open, Needs Triage, Public

Assigned To
None
Authored By
MrStradivarius
Aug 12 2014, 5:17 PM
Referenced Files
None
Tokens
"Like" token, awarded by Dinoguy1000."Like" token, awarded by stjn."Evil Spooky Haunted Tree" token, awarded by MSGJ."Dislike" token, awarded by Jackmcbarn.

Description

It would be useful for things like Module:WikiProjectBanner if Scribunto could detect whether a given page is a disambiguation page or not. At present, the only way of doing this is to preprocess all of a page's text and search for the __DISAMBIG__ magic word, which is obviously not practical for performance reasons.

I envision this as an "isDisambig" property of the Scribunto title object, but the exact naming or location isn't so important.

Event Timeline

bzimport raised the priority of this task to Needs Triage. Nov 22 2014, 3:41 AM
bzimport added a project: Scribunto.
bzimport set Reference to bz69441.
bzimport added a subscriber: Unknown Object (MLST).

Jackmcbarn pointed out on IRC that we can't do this because disambiguation status is determined post-parse:

you could do {{#if:{{#invoke:IsThisADabPage|main|{{FULLPAGENAME}}}}||__DISAMBIG__}} or something and cause a paradox

So I'm closing this as WONTFIX.
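
To make the paradox concrete, here is a toy model of the wikitext above. It is only a sketch: parsePage() and simulateParses() are hypothetical names, not MediaWiki or Scribunto functions, and it assumes the module returns non-empty output when the stored status says the page is a dab page. Each parse reads the status saved by the previous parse and then overwrites it, so the status flips on every reparse.

```php
<?php
// Toy model only: none of these names exist in MediaWiki.
// $storedIsDab stands in for the status saved by the previous parse.
function parsePage(bool $storedIsDab): bool {
    // Models {{#if: <page is a dab> | | __DISAMBIG__ }}:
    // the magic word is emitted only when the stored status says "not a dab".
    $emitsMagicWord = !$storedIsDab;
    // After the parse, the stored status becomes whatever the output contained.
    return $emitsMagicWord;
}

// Reparse the page $count times and record the status after each parse.
function simulateParses(bool $initial, int $count): array {
    $history = [];
    $status = $initial;
    for ($i = 0; $i < $count; $i++) {
        $status = parsePage($status);
        $history[] = $status;
    }
    return $history;
}

$history = simulateParses(false, 4);
// prints "dab, not dab, dab, not dab"
echo implode(', ', array_map(fn($s) => $s ? 'dab' : 'not dab', $history)), "\n";
```

The status never settles, which is the "wrong answer sometimes" problem in its simplest form.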

kaldari subscribed.

Reopening, as this is technically possible, and not even that difficult. For a somewhat hacky solution, in Scribunto_LuaTitleLibrary::getExpensiveData() you can add something like:

if ( class_exists( 'DisambiguatorHooks' ) ) {
    $ret[ 'isDisambiguation' ] = DisambiguatorHooks::isDisambiguationPage( $title, false );
}

... and then pick it up in mw.title.lua.

The more correct solution would probably be to add a hook in Scribunto_LuaTitleLibrary::getExpensiveData() and then add a hook handler in DisambiguatorHooks. But either way, it's doable.

Jackmcbarn subscribed.

I still don't really like the idea. When is this even useful? (And my point wasn't that it's not possible at all, but that it will necessarily give the wrong answer sometimes.)

I still don't really like the idea.

In my opinion, almost all page metadata, including properties like redirect and disambiguation status, should be available to Scribunto/Lua. This also extends to information about which categories the page is currently in, page length, page protection status, and anything else. I think we should only exclude info if there's a specific reason to.

When is this even useful?

The task description mentions https://en.wikipedia.org/wiki/Module:WikiProjectBanner. Some WikiProject banner templates categorize pages based on whether or not they're disambiguation pages; e.g., https://en.wikipedia.org/wiki/Category:Disambig-Class_video_game_articles. I believe this categorization is currently manual, but could be made automatic.

Reopening, as this is technically possible, and not even that difficult. For a somewhat hacky solution, in Scribunto_LuaTitleLibrary::getExpensiveData() you can add something like:

if ( class_exists( 'DisambiguatorHooks' ) ) {
    $ret[ 'isDisambiguation' ] = DisambiguatorHooks::isDisambiguationPage( $title, false );
}

... and then pick it up in mw.title.lua.

The more correct solution would probably be to add a hook in Scribunto_LuaTitleLibrary::getExpensiveData() and then add a hook handler in DisambiguatorHooks. But either way, it's doable.

We probably want to have Disambiguator register a Lua library rather than having Scribunto be aware of Disambiguator.

Jackmcbarn pointed out on IRC that we can't do this because disambiguation status is determined post-parse:

you could do {{#if:{{#invoke:IsThisADabPage|main|{{FULLPAGENAME}}}}||__DISAMBIG__}} or something and cause a paradox

So I'm closing this as WONTFIX.

I think there are significantly worse ways that Lua can be used to shoot yourself in the foot like that. However, the fact that it is post-parse does make it a little trickier, as a naive implementation that uses DisambiguatorHooks::isDisambiguationPage would actually be checking if the previous revision was a disambiguation page or not. I'm not convinced that is a deal breaker by itself.

We probably want to have Disambiguator register a Lua library rather than having Scribunto be aware of Disambiguator.

Yes, this is the correct way to do it. Another possibility would be to allow Scribunto to access page properties in general, since Disambiguator adds a 'disambiguation' page property.

I'll also note that making this check would likely need to register a "transclusion" of the checked page, since we don't have any other mechanism to specify that the checking page needs reparsing if the checked page changes.

However, the fact that it is post-parse does make it a little trickier, as a naive implementation that uses DisambiguatorHooks::isDisambiguationPage would actually be checking if the previous revision was a disambiguation page or not. I'm not convinced that is a deal breaker by itself.

Any sort of feature that is checking the results of a parse should probably raise an error when it's being used on the page being parsed. No point in making this foot-shooting easier than it has to be.
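
A rough sketch of those two safeguards (refusing self-checks, and recording a dependency on the checked page) might look like the following. Everything here is a stand-in: DabLookup and its members are hypothetical and are not part of MediaWiki, Scribunto, or Disambiguator. A real implementation would read the stored page property and register the dependency through the parser output instead of a plain array.

```php
<?php
// Hypothetical sketch: this class is a stand-in, not a MediaWiki or Scribunto API.
final class DabLookup {
    /** @var array<string,bool> status stored by each page's last parse */
    private array $stored;
    /** @var string[] pages the current parse depends on */
    public array $dependencies = [];

    public function __construct(array $stored) {
        $this->stored = $stored;
    }

    public function isDisambiguation(string $target, string $pageBeingParsed): bool {
        if ($target === $pageBeingParsed) {
            // The stored value would describe the previous revision, so refuse.
            throw new InvalidArgumentException("Cannot check the page being parsed: $target");
        }
        // Record the dependency so the checking page can be reparsed
        // when the checked page changes.
        $this->dependencies[] = $target;
        // Unknown pages are treated as non-dab in this sketch.
        return $this->stored[$target] ?? false;
    }
}

$lookup = new DabLookup(['Mercury' => true, 'Mercury (planet)' => false]);
// Checked from the talk page, so the self-check guard does not fire.
var_dump($lookup->isDisambiguation('Mercury', 'Talk:Mercury'));
```

Checking from a talk page passes the guard, while asking about the page currently being parsed throws instead of silently returning the previous revision's status.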

I still don't really like the idea. When is this even useful?

I discovered this ticket from a feature request at Module talk:Redirect after being referred there due to my similar request at Template talk:Pagetype.

(And my point wasn't that it's not possible at all, but that it will necessarily give the wrong answer sometimes.)

As Anomie mentioned, we could just have it throw an error if it's being used to check the page that is being parsed. In the case of Template:WPBannerMeta (which seems to be the main use case currently) it should always be checking from the Talk page, not the page itself, so that wouldn't be an issue.

Change 474108 had a related patch set uploaded (by Legoktm; owner: Legoktm):
[mediawiki/extensions/Disambiguator@master] Allow determining disambiguation page status through Scribunto

https://gerrit.wikimedia.org/r/474108

Just to note that we are currently using Module:Disambiguation to detect dab pages, but it is extremely hacky and not guaranteed to work in all cases.

stjn added subscribers: stjn, Aklapper, Base.

Duplicating info from the duplicate:

In Russian Wikipedia there is a function in Module:Hatnote that checks whether a given page is a disambiguation page:
https://ru.wikipedia.org/wiki/Модуль:Hatnote#L-128

Currently it relies on checking the page content to see whether any of the templates from https://ru.wikipedia.org/wiki/MediaWiki:Disambiguationspage are present. MediaWiki:Disambiguationspage is an extremely old page that was deprecated in 2013. It holds a static list of disambiguation-related templates, whereas the Disambiguator extension relies on the __DISAMBIG__ magic word.

Change #1075297 had a related patch set uploaded (by SD0001; author: SD0001):

[mediawiki/extensions/Disambiguator@master] Allow determining disambiguation page status through Lua

https://gerrit.wikimedia.org/r/1075297

Change #1075301 had a related patch set uploaded (by SD0001; author: SD0001):

[integration/config@master] Add Scribunto as a phan dependency for Gadgets extension

https://gerrit.wikimedia.org/r/1075301

Change #1075301 merged by jenkins-bot:

[integration/config@master] Zuul: [mediawiki/extensions/Disambiguator] Add Scribunto as a phan dependency

https://gerrit.wikimedia.org/r/1075301

Change #1091767 had a related patch set uploaded (by SD0001; author: SD0001):

[mediawiki/extensions/Scribunto@master] Allow extensions to add attributes to mw.title objects

https://gerrit.wikimedia.org/r/1091767
