Now that T144592 (Search index a limited number of article placeholders on cywiki for testing and evaluation purposes) is fixed, I think it would be appropriate to add the indexable placeholders to an actual sitemap, before or shortly after all the "notable" placeholders are made indexable (T117693#3038880). This is separate from T144591.
The goal is to have all the indexable pages indexed by the search engines as quickly as possible, so that we can assess the real impact sooner. Currently, I don't see the placeholders in search results, so we have no idea what the effect will be when they eventually show up there.
We can't control what actually gets indexed, but we can follow best practices. https://cy.wikipedia.org/robots.txt should reference https://cy.wikipedia.org/sitemap.xml or another suitable URL, generated either with generateSitemap.php (if it can be adapted to include placeholders) or with a simple script that lists only the placeholders (the normal pages get indexed by other means anyway).
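A minimal sketch of the "simple script" option, to show how little is needed. It assumes the placeholders are served at Special:AboutTopic/<item ID> and that the list of notable item IDs is already available from somewhere; the function name `build_sitemap` and the sample IDs are made up for illustration.

```python
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Assumption: placeholders live under Special:AboutTopic/<item ID>.
BASE_URL = "https://cy.wikipedia.org/wiki/Special:AboutTopic/"


def build_sitemap(item_ids):
    """Return sitemap XML (bytes) with one <url> entry per item ID."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for item_id in item_ids:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = BASE_URL + item_id
    # xml_declaration requires Python 3.8 or later.
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    # Hypothetical sample IDs; the real list would come from whatever
    # defines the "notable" placeholders.
    print(build_sitemap(["Q42", "Q64"]).decode("utf-8"))
```

The output would be saved as sitemap.xml, and robots.txt would then need a single extra line of the form `Sitemap: https://cy.wikipedia.org/sitemap.xml`.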
There has been little discussion about sitemaps on Wikimedia wikis (T87140, T101486), so I don't remember exactly why we don't have them. As far as I remember, there is no reason to actively avoid them; we're just unsure they'd help much, since
- it's not clear how much search engines rely on them,
- Wikipedia has special treatment anyway (at least by Google; status for sister projects and Yahoo is unclear),
- they'd become huge on many of our wikis and easily become unwieldy/ignored.
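On the size concern: the sitemaps.org protocol caps a single sitemap file at 50,000 URLs and expects larger sets to be split and listed in a sitemap index. A rough sketch of that splitting follows; the helper names and the file naming scheme in the usage note are assumptions, not anything generateSitemap.php is known to produce.

```python
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file limit in the sitemaps.org protocol


def chunk(urls, size=MAX_URLS_PER_FILE):
    """Yield consecutive slices of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def build_index(sitemap_urls):
    """Return a sitemap index (bytes) pointing at the given sitemap files."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = sitemap_url
    # xml_declaration requires Python 3.8 or later.
    return ET.tostring(index, encoding="utf-8", xml_declaration=True)
```

For example, each slice from `chunk(all_urls)` could be written to a hypothetical `sitemap-<n>.xml`, with `build_index` listing those files. For the cywiki placeholders alone this is unlikely to matter much, but it shows the size problem is mechanical rather than fundamental.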
See also these discussions:
- https://lists.wikimedia.org/pipermail/wikitech-l/2009-November/045911.html
- https://lists.wikimedia.org/pipermail/wikitech-l/2010-October/049921.html
- https://lists.wikimedia.org/pipermail/wikitech-l/2013-March/067453.html
So there is no reason to overthink this. Let's just add a sitemap for cy.wikipedia.org and see what happens; it's a good test.